Asymptotic Behaviour of k-Word Matches Between Two Uniformly Distributed Sequences

Date

2007

Authors

Kantorovitz, Miriam
Booth, Hilary
Burden, Conrad
Wilson, Susan

Publisher

Applied Probability Trust

Abstract

Given two sequences of length n over a finite alphabet A of size |A| = d, the D2 statistic is the number of k-letter word matches between the two sequences. This statistic is used in bioinformatics for EST sequence database searches. Under the assumption
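
As an illustration of the definition above, the following minimal Python sketch counts k-letter word matches between two sequences. It assumes that overlapping words are counted and that every pair of matching positions (one in each sequence) contributes one match; the abstract is truncated here, so this reading is an assumption rather than the authors' stated formulation. The function names `kword_counts` and `d2` are hypothetical and chosen only for this example.

```python
from collections import Counter


def kword_counts(seq, k):
    """Count every overlapping k-letter word in seq (the count vector)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of k-word matches between seq_a and seq_b.

    Assumption: a match is any pair of positions (i, j) such that the
    k-word starting at i in seq_a equals the k-word starting at j in
    seq_b. Summing the product of the two word counts over all words
    gives the same total as checking every pair of positions.
    """
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    # A Counter returns 0 for a word that never occurs in seq_b.
    return sum(count * counts_b[word] for word, count in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "CGTA", 2))
```

In this example the 2-letter words CG and GT each occur twice in the first sequence and once in the second, and TA occurs once in both, so the sketch reports 2 + 2 + 1 = 5 matches.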

Keywords

Count vector; k-word matches; Sequence comparison; Stein's method

Source

Journal of Applied Probability

Type

Journal article

Restricted until

2037-12-31