Jing, Junmei; Burden, Conrad; Foret, Sylvain; Wilson, Susan
The D2 statistic is defined as the number of word matches of prespecified length k, with up to t mismatches, shared between two given sequences. It is used in alignment-free comparisons of biological sequences, where it has two main advantages over alignment-based methods for nucleotide and amino-acid sequence comparison, such as BLAST (Basic Local Alignment Search Tool): (i) D2 does not assume that homologous segments are contiguous, and (ii) the algorithm is...
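To make the definition concrete, the following is a minimal brute-force sketch of the count described above. It is not the authors' implementation. The function name `d2_statistic` is hypothetical, and the sketch assumes that every pair of starting positions (one in each sequence) is compared and that "mismatches" means the Hamming distance between the two k-letter words.

```python
def d2_statistic(seq_a: str, seq_b: str, k: int, t: int = 0) -> int:
    """Count pairs of k-words, one from each sequence, that differ
    in at most t positions (assumed reading of the D2 definition)."""
    if k <= 0 or t < 0:
        raise ValueError("k must be positive and t non-negative")

    count = 0
    # Slide a window of length k over every start position in seq_a.
    for i in range(len(seq_a) - k + 1):
        word_a = seq_a[i:i + k]
        # Compare against every window of length k in seq_b.
        for j in range(len(seq_b) - k + 1):
            word_b = seq_b[j:j + k]
            # Hamming distance between the two words.
            mismatches = sum(x != y for x, y in zip(word_a, word_b))
            if mismatches <= t:
                count += 1
    return count


if __name__ == "__main__":
    print(d2_statistic("AAA", "AAT", k=2, t=0))  # prints 2
    print(d2_statistic("AAA", "AAT", k=2, t=1))  # prints 4
```

This naive version takes time proportional to the product of the two sequence lengths times k, so it is only meant to illustrate the definition on short sequences.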