
Alignment-free sequence comparison for biologically realistic sequences of moderate length

Burden, Conrad J; Jing, Junmei; Wilson, Susan R


The D2 statistic, defined as the number of matches of words of some pre-specified length k between two sequences, is a computationally fast alignment-free measure of biological sequence similarity. However, its suitability for this purpose is debated, because the variability in D2 may be dominated by terms that reflect only the noise within each individual sequence. We examine the extent of the problem and the effectiveness of overcoming it by using two mean-centred variants of this statistic, D2* and...
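The definition of D2 in the abstract can be illustrated with a minimal Python sketch. It assumes the usual reading of "number of matches": every pair of positions, one in each sequence, at which the same word of length k starts. This equals the sum over words of the product of the two word counts. The function names `kmer_counts` and `d2` are hypothetical and chosen here for illustration; they are not taken from the paper. The mean-centred variants are not sketched, since the abstract is truncated before it defines them.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Illustrative D2: number of k-word matches between two sequences.

    Computed as the sum, over words present in seq_a, of
    (count in seq_a) * (count in seq_b). Hypothetical helper, not the
    authors' implementation.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words that never occur in seq_b.
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "CGTACG", 3))
```

Counting words once per sequence and multiplying the counts avoids comparing every pair of positions directly, which is what makes the statistic fast relative to alignment.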

Collections: ANU Research Publications
Date published: 2012
Type: Journal article
Source: Statistical Applications in Genetics and Molecular Biology 11.1 (2012):1-28
DOI: 10.2202/1544-6115.1724


File: Burden et al Alignment-free sequence 2012.pdf
Size: 1.46 MB
Format: Adobe PDF

Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.

Updated: 20 July 2017 / Responsible Officer: University Librarian / Page Contact: Library Systems & Web Coordinator