Asymptotic behaviour and optimal word size for exact and approximate word matches between random sequences

Foret, Sylvain; Burden, Conrad; Kantorovitz, Miriam R


BACKGROUND: The number of k-words shared between two sequences is a simple and efficient alignment-free sequence comparison method. This statistic, D2, has been used for the clustering of EST sequences. Sequence comparison based on D2 is extremely fast: its run time is proportional to the size of the sequences under scrutiny, whereas alignment-based comparisons have a worst-case run time proportional to the square of the size. Recent studies have tackled the rigorous study of the statistical...
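
As an illustration of the statistic described in the abstract, the following is a minimal sketch of an exact-match D2 count, assuming the common definition of D2 as the number of pairs of positions (one in each sequence) at which the same k-word starts. The function names `kword_counts` and `d2` are hypothetical and are not taken from the paper. Each sequence is scanned once to build its word counts, which is where the roughly linear run time comes from.

```python
from collections import Counter


def kword_counts(seq, k):
    """Count every overlapping word of length k in seq (hypothetical helper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Exact-match D2 under the assumed definition: the sum over all
    k-words w of count_a(w) * count_b(w)."""
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    # Iterate over the smaller table; a Counter returns 0 for absent words.
    if len(counts_a) > len(counts_b):
        counts_a, counts_b = counts_b, counts_a
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "CGTA", 2))
```

This sketch covers exact word matches only; the approximate-match variant mentioned in the title would need a different counting step and is not shown here.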

Collections: ANU Research Publications
Date published: 2006-12-18
Type: Journal article
Source: BMC Bioinformatics
DOI: 10.1186/1471-2105-7-S5-S21


File: Foret_Asymptotic2006.pdf
Size: 284.77 kB
Format: Adobe PDF

Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
