kaori
A C++ library for barcode extraction and matching
Scan a read sequence for the template sequence.
#include <ScanTemplate.hpp>
Classes | |
struct | State |
Details on the current match to the read sequence. | |
Public Member Functions | |
ScanTemplate () | |
ScanTemplate (const char *template_seq, size_t template_length, SearchStrand strand) | |
State | initialize (const char *read_seq, size_t read_length) const |
void | next (State &state) const |
template<bool reverse = false> | |
const std::vector< std::pair< int, int > > & | variable_regions () const |
Scan a read sequence for the template sequence.
When searching for barcodes, kaori first searches for a "template sequence" in the read sequence. The template sequence contains constant regions interspersed with one or more variable regions. The template is realized into the full barcoding element by replacing each variable region with one sequence from the corresponding pool of barcodes.
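To make the template notation concrete, here is a minimal sketch that locates the variable regions in a template string and realizes the template by substituting one barcode per region. It assumes variable regions are runs of `-` and reports them as half-open `[start, end)` spans. The helpers `find_variable_regions` and `realize_template` are hypothetical names used for illustration and are not part of kaori.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not kaori's API): list the [start, end) spans of
// consecutive '-' characters in a template string.
std::vector<std::pair<int, int> > find_variable_regions(const std::string& tmpl) {
    std::vector<std::pair<int, int> > regions;
    const int n = static_cast<int>(tmpl.size());
    int i = 0;
    while (i < n) {
        if (tmpl[i] != '-') {
            ++i;
            continue;
        }
        const int start = i;
        while (i < n && tmpl[i] == '-') {
            ++i;
        }
        regions.emplace_back(start, i);
    }
    return regions;
}

// Hypothetical helper: build the full barcoding element by replacing each
// variable region with one barcode of exactly the same length.
std::string realize_template(const std::string& tmpl, const std::vector<std::string>& barcodes) {
    const std::vector<std::pair<int, int> > regions = find_variable_regions(tmpl);
    assert(regions.size() == barcodes.size());
    std::string out = tmpl;
    for (std::size_t r = 0; r < regions.size(); ++r) {
        const int len = regions[r].second - regions[r].first;
        assert(static_cast<int>(barcodes[r].size()) == len);
        out.replace(regions[r].first, len, barcodes[r]);
    }
    return out;
}
```

For a template such as `ACGT----TGCA--GG`, this sketch yields two regions, one of length 4 and one of length 2, each of which would be filled from its own barcode pool.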
This class scans a read sequence to find a location that matches the constant regions of the template, allowing for any number of substitutions. Multiple locations on the read may match the template; these can be found by calling next() repeatedly. For efficiency, the search itself is done after converting all base sequences into a bit encoding. The maximum size of this encoding is determined at compile-time by the max_size template parameter.
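As a rough illustration of why a bit encoding helps, the sketch below counts substitutions between a template and an equally long window of a read using a fixed-size `std::bitset`. The one-hot scheme (four bits per base, all zero for `-`) and the names `MAX_LEN`, `encode` and `count_mismatches` are assumptions made for this example; kaori's internal encoding may differ.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative compile-time bound, playing the role of a maximum template length.
constexpr std::size_t MAX_LEN = 32;
using Bits = std::bitset<4 * MAX_LEN>;

// One-hot encode each base into four bits. '-' (and any other character)
// leaves all four bits unset, so it can never contribute to a match.
Bits encode(const std::string& seq) {
    assert(seq.size() <= MAX_LEN);
    Bits out;
    for (std::size_t i = 0; i < seq.size(); ++i) {
        switch (seq[i]) {
            case 'A': case 'a': out.set(4 * i); break;
            case 'C': case 'c': out.set(4 * i + 1); break;
            case 'G': case 'g': out.set(4 * i + 2); break;
            case 'T': case 't': out.set(4 * i + 3); break;
            default: break;
        }
    }
    return out;
}

// Count substitutions in the constant regions only. The template has one set
// bit per constant base; AND-ing with the window keeps the bits that agree.
std::size_t count_mismatches(const std::string& tmpl, const std::string& window) {
    assert(tmpl.size() == window.size());
    const Bits t = encode(tmpl);
    const Bits w = encode(window);
    const std::size_t constant_bases = t.count();
    const std::size_t matching_bases = (t & w).count();
    return constant_bases - matching_bases;
}
```

Because the bitset size is fixed at compile time, the comparison reduces to a bitwise AND and a population count, regardless of where the variable regions fall.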
Once a match is found, the sequence of the read at each variable region can be matched against a pool of known barcode sequences. See the BarcodePool class for details.
max_size | Maximum length of the template sequence. |
|
inline |
Default constructor. This is only provided to enable composition; the resulting object should not be used until it is copy-assigned from a properly constructed instance.
|
inline |
[in] | template_seq | Pointer to a character array containing the template sequence. Constant sequences should only contain A, C, G or T (or their lower-case equivalents). Variable regions should be marked with -. |
template_length | Length of the array pointed to by template_seq. This should be less than or equal to max_size. | |
strand | Strand(s) of the read sequence to search. |
|
inline |
Begin a new search for the template in a read sequence.
[in] | read_seq | Pointer to an array containing the read sequence. |
read_length | Length of the read sequence. |
|
inline |
Find the next match in the read sequence. The first invocation will search for a match at position 0; this can be repeatedly called until state.finished is true.
state | A State object produced by initialize(). On return, state is updated with the details of the current match at a particular position on the read sequence. |
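The initialize()/next() protocol can be sketched with a small stand-in class. This is not kaori's implementation: `ToyScanner`, `ToyState` and `match_positions` are hypothetical, the scan covers the forward strand only with a plain character comparison, and every state field other than `finished` is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scan state; only `finished` is named in the documentation above.
struct ToyState {
    std::string read;          // copy of the read being scanned
    std::size_t position = 0;  // start of the window examined by the last next() call
    int mismatches = 0;        // substitutions in the constant regions at `position`
    bool started = false;      // false until next() has been called once
    bool finished = false;     // true once no further windows remain
};

class ToyScanner {
public:
    explicit ToyScanner(std::string tmpl) : tmpl_(std::move(tmpl)) {}

    // Begin a new search; a read shorter than the template has nothing to scan.
    ToyState initialize(const std::string& read) const {
        ToyState s;
        s.read = read;
        s.finished = tmpl_.empty() || read.size() < tmpl_.size();
        return s;
    }

    // Examine the next window: position 0 on the first call, then 1, 2, ...
    void next(ToyState& s) const {
        assert(!s.finished);
        if (s.started) {
            ++s.position;
        } else {
            s.started = true;
        }
        s.mismatches = 0;
        for (std::size_t i = 0; i < tmpl_.size(); ++i) {
            if (tmpl_[i] != '-' && tmpl_[i] != s.read[s.position + i]) {
                ++s.mismatches;
            }
        }
        s.finished = (s.position + tmpl_.size() >= s.read.size());
    }

private:
    std::string tmpl_;
};

// Collect every window start whose mismatch count is within the threshold.
std::vector<std::size_t> match_positions(const ToyScanner& scanner, const std::string& read, int max_mismatches) {
    std::vector<std::size_t> hits;
    ToyState state = scanner.initialize(read);
    while (!state.finished) {
        scanner.next(state);
        if (state.mismatches <= max_mismatches) {
            hits.push_back(state.position);
        }
    }
    return hits;
}
```

The loop checks `finished` before each call to `next()`, so a read that is too short for the template is skipped without ever examining a window.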
|
inline |
Extract details about the variable regions in the template sequence.
reverse | Should we return the coordinates of the variable regions when searching on the reverse strand? |
If reverse = true, coordinates are reported after reverse-complementing the template sequence.
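The effect of reverse-complementing on region coordinates can be worked through with a short sketch. It assumes each pair is a half-open `[start, end)` span relative to the start of the template, which the text above does not state explicitly; `reverse_complement` and `flip_regions` are hypothetical helpers, not kaori functions.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Reverse-complement a template string, leaving '-' placeholders unchanged.
std::string reverse_complement(const std::string& seq) {
    std::string out(seq.rbegin(), seq.rend());
    for (char& c : out) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default: break;
        }
    }
    return out;
}

// Map forward-strand [start, end) spans onto the reverse-complemented template:
// a span [s, e) in a template of length n becomes [n - e, n - s).
std::vector<std::pair<int, int> > flip_regions(std::vector<std::pair<int, int> > regions, int template_length) {
    for (auto& r : regions) {
        assert(0 <= r.first && r.first <= r.second && r.second <= template_length);
        r = std::make_pair(template_length - r.second, template_length - r.first);
    }
    // Reverse the order so the spans stay sorted by start position.
    std::reverse(regions.begin(), regions.end());
    return regions;
}
```

Under these assumptions, the regions `(4, 8)` and `(12, 14)` of the 16-base template `ACGT----TGCA--GG` map to `(2, 4)` and `(8, 12)` on its reverse complement `CC--TGCA----ACGT`.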