
Alignment-free sequence comparison for biologically realistic sequences of moderate length

Burden, Conrad J; Jing, Junmei; Wilson, Susan R

Description

The D2 statistic, defined as the number of matches of words of some pre-specified length k, is a computationally fast alignment-free measure of biological sequence similarity. However, there is some debate about its suitability for this purpose, as the variability in D2 may be dominated by the terms that reflect the noise in each of the single sequences only. We examine the extent of the problem and the effectiveness of overcoming it by using two mean-centred variants of this statistic, D2* and...
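
As an illustration of the definition in the description, the following is a minimal sketch of D2 as the count of k-word matches between two sequences, computed as the sum over words of the product of their occurrence counts in each sequence. The function names kmer_counts and d2 are hypothetical and chosen for this example; this is not the authors' code, and it does not cover the mean-centred variants.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of k-word matches between seq_a and seq_b.

    Each occurrence of a word in seq_a is matched with each occurrence
    of the same word in seq_b, so the total is the sum over words of
    count_a(word) * count_b(word).
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b.
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "CGTACG", k=3))
```

Counting words once per sequence and then multiplying the counts avoids comparing every pair of positions directly, which is one reason the statistic is cheap to compute.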

Collections: ANU Research Publications
Date published: 2012
Type: Journal article
URI: http://hdl.handle.net/1885/11466
Source: Statistical Applications in Genetics and Molecular Biology 11.1 (2012):1-28
DOI: 10.2202/1544-6115.1724

Download

File: Burden et al Alignment-free sequence 2012.pdf
Size: 1.46 MB
Format: Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
