kaori
A C++ library for barcode extraction and matching
Scan a read sequence for the template sequence.
#include <ScanTemplate.hpp>
Classes
struct State | Details on the current match to the read sequence.
Public Member Functions
ScanTemplate ()=default
ScanTemplate (const char *template_seq, SeqLength template_length, SearchStrand strand)
State | initialize (const char *read_seq, SeqLength read_length) const
void | next (State &state) const
const std::vector< std::pair< SeqLength, SeqLength > > & | forward_variable_regions () const
const std::vector< std::pair< SeqLength, SeqLength > > & | reverse_variable_regions () const
const std::vector< std::pair< SeqLength, SeqLength > > & | variable_regions (bool reverse) const
When searching for barcodes, kaori first searches for a "template sequence" in the read sequence. The template sequence contains one or more variable regions flanked by constant regions. The template is realized into the "vector sequence" (i.e., the final sequence on the vector construct) by replacing each variable region with one sequence from a pool of barcodes.
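The relationship between a template and its vector sequence can be sketched in a few lines. The helper below is hypothetical and not part of kaori; it assumes that each run of - characters is one variable region and that the supplied barcodes are listed in the same order as the regions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a kaori function): fill each run of '-' in the
// template with the next barcode to obtain the vector sequence.
std::string realize_template(const std::string& tmpl,
                             const std::vector<std::string>& barcodes) {
    std::string out;
    std::size_t next_barcode = 0;
    std::size_t i = 0;
    while (i < tmpl.size()) {
        if (tmpl[i] != '-') {
            out += tmpl[i];  // constant base, copied as-is
            ++i;
            continue;
        }
        // Measure the run of '-' that makes up this variable region.
        const std::size_t start = i;
        while (i < tmpl.size() && tmpl[i] == '-') {
            ++i;
        }
        assert(next_barcode < barcodes.size());
        const std::string& barcode = barcodes[next_barcode++];
        assert(barcode.size() == i - start);  // barcode must fill the region exactly
        out += barcode;
    }
    return out;
}
```

For example, realizing the template ACGT----TGCA with the single barcode AAAA gives ACGTAAAATGCA.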
This class will scan the read sequence to find a location that matches the constant regions of the template, give or take any number of substitutions. Multiple locations on the read may match the template; these are reported one at a time by calling next() repeatedly. For efficiency, the search itself is done after converting all base sequences into a bit encoding. The maximum size of this encoding is determined at compile-time by the max_size_ template parameter.
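To illustrate how a bit encoding can make this kind of search cheap, here is a minimal sketch of one possible scheme. It is not claimed to be kaori's internal layout: the 4-bit one-hot code, the kMaxBases cap, and the names encode_base, push_base, Hit and scan are all assumptions made for this example. Variable positions encode to zero, so a single AND followed by a bit count gives the number of matching constant bases in the current window.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t kMaxBases = 32;  // assumed compile-time cap on template length
using Bits = std::bitset<4 * kMaxBases>;

// One-hot encode a base into 4 bits; anything else (e.g. '-' or 'N') becomes 0000.
unsigned encode_base(char b) {
    switch (b) {
        case 'A': case 'a': return 1u;
        case 'C': case 'c': return 2u;
        case 'G': case 'g': return 4u;
        case 'T': case 't': return 8u;
        default: return 0u;
    }
}

// Shift the encoding by one base and append the new base at the low end.
void push_base(Bits& bits, char b) {
    bits <<= 4;
    bits |= Bits(encode_base(b));
}

struct Hit {
    std::size_t position;    // start of the matching window on the read
    std::size_t mismatches;  // substitutions in the constant regions
};

// Report every window whose constant bases differ from the template by at
// most max_mismatches substitutions.
std::vector<Hit> scan(const std::string& tmpl, const std::string& read,
                      std::size_t max_mismatches) {
    assert(!tmpl.empty() && tmpl.size() <= kMaxBases);
    std::vector<Hit> hits;
    if (read.size() < tmpl.size()) {
        return hits;  // read too short for any match
    }

    Bits reference;
    std::size_t num_constant = 0;
    for (char b : tmpl) {
        push_base(reference, b);  // '-' contributes four zero bits
        if (b != '-') {
            ++num_constant;
        }
    }

    Bits window;
    for (std::size_t i = 0; i < read.size(); ++i) {
        push_base(window, read[i]);
        if (i + 1 < tmpl.size()) {
            continue;  // window not yet full
        }
        // Bits above the template length are zero in reference, so stale
        // bases left in the window are ignored by the AND.
        const std::size_t matches = (window & reference).count();
        const std::size_t mismatches = num_constant - matches;
        if (mismatches <= max_mismatches) {
            hits.push_back({i + 1 - tmpl.size(), mismatches});
        }
    }
    return hits;
}
```

In this sketch the cost per read position is one shift, one OR, one AND and one bit count, regardless of how many bases the template contains.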
Once a match is found, the sequence of the read at each variable region can be matched against a pool of known barcode sequences. See other classes like SimpleBarcodeSearch and SegmentedBarcodeSearch for details.
max_size_ | Maximum length of the template sequence. |
ScanTemplate ()=default
Default constructor. This is provided only to enable composition; the resulting object should not be used until it is copy-assigned from a properly constructed instance.
ScanTemplate (const char *template_seq, SeqLength template_length, SearchStrand strand)
[in] template_seq | Pointer to a character array containing the template sequence. Constant sequences should only contain A, C, G or T (or their lower-case equivalents). Variable regions should be marked with -.
template_length | Length of the array pointed to by template_seq. This should be positive and less than or equal to max_size_.
strand | Strand(s) of the read sequence to search.
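The template format described above can be parsed with a short loop. The function below is a hypothetical stand-in rather than kaori's own code, and it assumes that each variable region is reported as a pair of start and one-past-the-end positions; the source does not state how the pairs returned by the variable-region getters are defined.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper (not a kaori function): locate each run of '-' in the
// template and report it as [start, end), an assumed convention.
std::vector<std::pair<std::size_t, std::size_t>>
find_variable_regions(const char* template_seq, std::size_t template_length) {
    assert(template_seq != nullptr);
    std::vector<std::pair<std::size_t, std::size_t>> regions;
    std::size_t i = 0;
    while (i < template_length) {
        const char b = template_seq[i];
        if (b == '-') {
            const std::size_t start = i;
            while (i < template_length && template_seq[i] == '-') {
                ++i;  // extend to the end of this variable region
            }
            regions.emplace_back(start, i);
            continue;
        }
        switch (b) {
            case 'A': case 'C': case 'G': case 'T':
            case 'a': case 'c': case 'g': case 't':
                break;  // valid constant base
            default:
                throw std::invalid_argument("unexpected character in template");
        }
        ++i;
    }
    return regions;
}
```

Under these assumptions, the template AC--GG---T has two variable regions, (2, 4) and (6, 9).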
State initialize (const char *read_seq, SeqLength read_length) const
Begin a new search for the template in a read sequence.
[in] read_seq | Pointer to an array containing the read sequence.
read_length | Length of the read sequence.
Returns a State object. If its finished member is false, it should be passed to next() before accessing its other members. If true, the read sequence was too short for any match to be found.
void next (State &state) const
Find the next match in the read sequence. The first invocation will search for a match at position 0; this can be called repeatedly until state.finished is true.
state | A state object produced by initialize(). On return, state is updated with the details of the current match at a particular position on the read sequence.
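The initialize()/next() calling pattern can be sketched with a self-contained toy scanner. ToyScanner, ToyState and match_positions are hypothetical names invented for this example, and the position and mismatches members are assumptions; only the finished member and the call sequence are taken from the description above. The toy compares characters directly instead of using a bit encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical state object; only 'finished' mirrors a documented member.
struct ToyState {
    const char* read = nullptr;
    std::size_t read_length = 0;
    std::size_t position = 0;    // start of the window examined by the last next()
    std::size_t mismatches = 0;  // substitutions in constant regions at that window
    bool started = false;
    bool finished = false;
};

class ToyScanner {
public:
    explicit ToyScanner(std::string tmpl) : tmpl_(std::move(tmpl)) {}

    // Begin a new search; 'finished' is set at once if the read is too short.
    ToyState initialize(const char* read_seq, std::size_t read_length) const {
        ToyState s;
        s.read = read_seq;
        s.read_length = read_length;
        s.finished = tmpl_.empty() || read_length < tmpl_.size();
        return s;
    }

    // Examine the next window: position 0 on the first call, then 1, 2, ...
    void next(ToyState& s) const {
        assert(!s.finished);
        s.position = s.started ? s.position + 1 : 0;
        s.started = true;
        s.mismatches = 0;
        for (std::size_t j = 0; j < tmpl_.size(); ++j) {
            if (tmpl_[j] != '-' && tmpl_[j] != s.read[s.position + j]) {
                ++s.mismatches;
            }
        }
        // The window just examined was the last one that fits on the read.
        s.finished = (s.position + tmpl_.size() == s.read_length);
    }

private:
    std::string tmpl_;
};

// Collect every position whose window is within the mismatch threshold.
std::vector<std::size_t> match_positions(const ToyScanner& scanner,
                                         const std::string& read,
                                         std::size_t max_mismatches) {
    std::vector<std::size_t> found;
    ToyState state = scanner.initialize(read.data(), read.size());
    while (!state.finished) {
        scanner.next(state);  // must be called before reading other members
        if (state.mismatches <= max_mismatches) {
            found.push_back(state.position);
        }
    }
    return found;
}
```

In this toy, the details filled in by the final call to next() remain valid even though finished has become true, so the loop body inspects the state before the loop condition is re-checked.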
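const std::vector< std::pair< SeqLength, SeqLength > > & variable_regions (bool reverse) const
reverse | Whether to return variable regions on the reverse-complemented template.
Returns the result of forward_variable_regions() or reverse_variable_regions(), depending on reverse.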