kaori
A C++ library for barcode extraction and matching
Search for barcodes with segmented mismatches.
#include <MismatchTrie.hpp>
Classes
struct | Result
Result of the segmented search.
Public Member Functions
SegmentedMismatches ()
SegmentedMismatches (std::array< int, num_segments > segments, DuplicateAction duplicates)
Result | search (const char *search_seq, const std::array< int, num_segments > &max_mismatches) const
Public Member Functions inherited from kaori::MismatchTrie
MismatchTrie ()
MismatchTrie (size_t barcode_length, DuplicateAction duplicates)
AddStatus | add (const char *barcode_seq)
size_t | get_length () const
int | size () const
void | optimize ()
Additional Inherited Members
Static Public Attributes inherited from kaori::MismatchTrie
static constexpr int | STATUS_MISSING = -1
static constexpr int | STATUS_AMBIGUOUS = -2
This MismatchTrie subclass searches for the best match to known sequences in a barcode pool. However, the number of mismatches is limited separately within each segment of the sequence, e.g., at most 1 mismatch in the first 4 bp, at most 3 mismatches in the next 10 bp, and so on. The intention is to enable searching for concatenations of variable region sequences (and barcodes), where each segment tolerates a different number of mismatches.
num_segments | Number of segments to consider.
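The per-segment restriction described above can be illustrated with a minimal, self-contained sketch. It does not use kaori and is not how the trie is implemented; `count_segment_mismatches` and `within_limits` are hypothetical helper names introduced only for this example, and both sequences are assumed to be at least as long as the sum of the segment lengths.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of kaori): counts mismatches between two
// sequences separately for each segment. Both sequences are assumed to
// contain at least sum(segments) characters.
template<std::size_t num_segments>
std::array<int, num_segments> count_segment_mismatches(
    const char* observed,
    const char* barcode,
    const std::array<int, num_segments>& segments)
{
    std::array<int, num_segments> counts{};
    std::size_t pos = 0;
    for (std::size_t s = 0; s < num_segments; ++s) {
        assert(segments[s] > 0); // segment lengths should be positive
        for (int i = 0; i < segments[s]; ++i, ++pos) {
            if (observed[pos] != barcode[pos]) {
                ++counts[s];
            }
        }
    }
    return counts;
}

// Hypothetical helper: true only if every segment is within its own limit.
template<std::size_t num_segments>
bool within_limits(
    const std::array<int, num_segments>& counts,
    const std::array<int, num_segments>& max_mismatches)
{
    for (std::size_t s = 0; s < num_segments; ++s) {
        if (counts[s] > max_mismatches[s]) {
            return false;
        }
    }
    return true;
}
```

With segments of 4 bp and 10 bp, a candidate with 1 mismatch in the first segment and 3 in the second passes limits of {1, 3} but fails limits of {0, 3} or {1, 2}, even though the total number of mismatches is unchanged.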
SegmentedMismatches () | inline
Default constructor. This is provided only to enable composition; the resulting object should not be used until a properly constructed instance has been copy-assigned to it.
SegmentedMismatches (std::array< int, num_segments > segments, DuplicateAction duplicates) | inline
segments | Length of each segment of the sequence. Each entry should be positive and the sum should equal the total length of the barcode sequence.
duplicates | How duplicate sequences across add() calls should be handled.
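The constraints on `segments` can be checked before construction. The following is a minimal sketch under stated assumptions: `segments_are_valid` and `segment_starts` are hypothetical helpers written for this example, not functions provided by kaori.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical check (not part of kaori): every segment length is positive
// and the lengths add up to the full barcode length.
template<std::size_t num_segments>
bool segments_are_valid(
    const std::array<int, num_segments>& segments,
    std::size_t barcode_length)
{
    std::size_t total = 0;
    for (int len : segments) {
        if (len <= 0) {
            return false;
        }
        total += static_cast<std::size_t>(len);
    }
    return total == barcode_length;
}

// Hypothetical helper: start offset of each segment within the barcode.
// Precondition: every segment length is positive.
template<std::size_t num_segments>
std::array<std::size_t, num_segments> segment_starts(
    const std::array<int, num_segments>& segments)
{
    std::array<std::size_t, num_segments> starts{};
    std::size_t pos = 0;
    for (std::size_t s = 0; s < num_segments; ++s) {
        assert(segments[s] > 0);
        starts[s] = pos;
        pos += static_cast<std::size_t>(segments[s]);
    }
    return starts;
}
```

For segments of {4, 10}, the check passes for a 14 bp barcode and the segments start at offsets 0 and 4.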
Result search (const char *search_seq, const std::array< int, num_segments > &max_mismatches) const | inline
[in] | search_seq | Pointer to a character array containing a sequence to use for searching the barcode pool. This is assumed to be of length equal to get_length() and is typically derived from a read.
max_mismatches | Maximum number of mismatches for each segment. Each entry should be non-negative.
Result containing the index of the barcode sequence where the number of mismatches in each segment is less than or equal to max_mismatches. The index is reported in Result::index. If the match is ambiguous, MismatchTrie::STATUS_AMBIGUOUS is reported. If no barcode sequence satisfies the max_mismatches condition, MismatchTrie::STATUS_MISSING is reported.
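The return semantics can be mimicked with a brute-force scan over a small pool. This is only a sketch and not kaori's trie-based implementation. It assumes that the best match is the candidate with the lowest total number of mismatches and that a tie between candidates is ambiguous; the text above does not spell out either rule. `SketchResult` and `brute_force_search` are hypothetical names, and the two status constants simply copy the values documented for MismatchTrie.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Values copied from the documented MismatchTrie status codes.
constexpr int STATUS_MISSING = -1;
constexpr int STATUS_AMBIGUOUS = -2;

// Hypothetical result type for this sketch.
struct SketchResult {
    int index = STATUS_MISSING; // pool index, or one of the status codes
    int total = 0;              // total mismatches of the best candidate
};

// Brute-force scan: a barcode is a candidate only if every segment is
// within its own limit. Assumption: the lowest total wins, ties are ambiguous.
template<std::size_t num_segments>
SketchResult brute_force_search(
    const std::vector<std::string>& pool,
    const char* search_seq,
    const std::array<int, num_segments>& segments,
    const std::array<int, num_segments>& max_mismatches)
{
    SketchResult best;
    bool found = false;
    for (std::size_t b = 0; b < pool.size(); ++b) {
        const std::string& barcode = pool[b];
        std::size_t pos = 0;
        int total = 0;
        bool ok = true;
        for (std::size_t s = 0; s < num_segments && ok; ++s) {
            int count = 0;
            for (int i = 0; i < segments[s]; ++i, ++pos) {
                assert(pos < barcode.size()); // segments must fit the barcode
                if (search_seq[pos] != barcode[pos]) {
                    ++count;
                }
            }
            ok = (count <= max_mismatches[s]);
            total += count;
        }
        if (!ok) {
            continue; // some segment exceeded its limit
        }
        if (!found || total < best.total) {
            best.index = static_cast<int>(b);
            best.total = total;
            found = true;
        } else if (total == best.total) {
            best.index = STATUS_AMBIGUOUS; // tie at the current best total
        }
    }
    return best;
}
```

A caller would compare the returned index against the two status codes before using it to look up a barcode in the pool.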