Pathological rate matrices: from primates to pathogens

dc.contributor.author: Schranz, Harold
dc.contributor.author: Easteal, Simon
dc.contributor.author: Huttley, Gavin Austin
dc.contributor.author: Yap, Von Bing
dc.contributor.author: Knight, Rob
dc.date.accessioned: 2009-04-22T23:53:53Z (en_US)
dc.date.accessioned: 2010-12-20T06:02:50Z
dc.date.available: 2009-04-22T23:53:53Z (en_US)
dc.date.available: 2010-12-20T06:02:50Z
dc.date.issued: 2008-12-19 (en_US)
dc.date.updated: 2016-02-24T10:26:42Z
dc.description.abstract:
BACKGROUND: Continuous-time Markov models allow flexible, parametrically succinct descriptions of sequence divergence. Non-reversible forms of these models are more biologically realistic but are challenging to develop. The instantaneous rate matrices defined for these models are typically transformed into substitution probability matrices using a matrix exponentiation algorithm that employs eigendecomposition, but this algorithm has characteristic vulnerabilities that lead to significant errors when a rate matrix possesses certain 'pathological' properties. Here we tested whether pathological rate matrices exist in nature, and considered the suitability of different algorithms for their computation.
RESULTS: We used concatenated protein coding gene alignments from microbial genomes, primate genomes and independent intron alignments from primate genomes. The Taylor series expansion and eigendecomposition matrix exponentiation algorithms were compared to the less widely employed, but more robust, Padé with scaling and squaring algorithm for nucleotide, dinucleotide, codon and trinucleotide rate matrices. Pathological dinucleotide and trinucleotide matrices were evident in the microbial data set, affecting the eigendecomposition and Taylor algorithms respectively. Even using a conservative estimate of matrix error (occurrence of an invalid probability), both Taylor and eigendecomposition algorithms exhibited substantial error rates: ~100% of all exonic trinucleotide matrices were pathological to the Taylor algorithm, while ~10% of codon positions 1 and 2 dinucleotide matrices and intronic trinucleotide matrices, and ~30% of codon matrices, were pathological to eigendecomposition. The majority of Taylor algorithm errors derived from the occurrence of multiple unobserved states. A small number of negative probabilities were detected from the Padé algorithm on trinucleotide matrices; these were attributable to machine precision. Although the Padé algorithm does not facilitate caching of intermediate results, it was up to 3× faster than eigendecomposition on the same matrices.
CONCLUSION: Development of robust software for computing non-reversible dinucleotide, codon and higher evolutionary models requires implementation of the Padé with scaling and squaring algorithm.
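The abstract's error criterion (occurrence of an invalid probability) and the idea of scaling and squaring can be illustrated with a minimal pure-Python sketch. This is not the authors' code. The 4-state equal-rate matrix, the function names, the 30-term truncation and the 0.5 norm threshold are all assumptions chosen for illustration, and the scaled variant wraps a Taylor core rather than the Padé approximant evaluated in the paper.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm_taylor(q, terms=30):
    """Truncated Taylor series: sum of Q^k / k! for k = 0..terms."""
    n = len(q)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, q)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def expm_scaled(q, terms=30):
    """Scaling and squaring around a Taylor core (assumption: not Pade).

    Q is divided by 2**s until its infinity norm is at most 0.5, the small
    matrix is exponentiated, and the result is squared s times.
    """
    norm = max(sum(abs(x) for x in row) for row in q)
    s = 0
    while norm / 2 ** s > 0.5:
        s += 1
    p = expm_taylor([[x / 2 ** s for x in row] for row in q], terms)
    for _ in range(s):
        p = mat_mul(p, p)
    return p


def has_invalid_probability(p, tol=1e-9):
    """Conservative error check: any entry outside [0, 1] (within tol)."""
    return any(x < -tol or x > 1 + tol for row in p for x in row)


# Hypothetical 4-state nucleotide rate matrix (already multiplied by time):
# every off-diagonal rate is 10, so each row sums to zero.
rate = 10.0
q = [[-3 * rate if i == j else rate for j in range(4)] for i in range(4)]

p_taylor = expm_taylor(q)
p_scaled = expm_scaled(q)

print(has_invalid_probability(p_taylor))  # True
print(has_invalid_probability(p_scaled))  # False
```

For this matrix the exact answer has every entry close to 0.25, which the scaled version recovers, while the truncated series is dominated by large alternating terms. The failure shown here is caused by large rates alone; the abstract attributes most Taylor errors in the real data to multiple unobserved states, which this toy example does not model.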
dc.format: 10 pages
dc.identifier.citation: BMC Bioinformatics 9:550 (2008)
dc.identifier.issn: 1471-2105 (en_US)
dc.identifier.uri: http://hdl.handle.net/10440/125 (en_US)
dc.identifier.uri: http://digitalcollections.anu.edu.au/handle/10440/125
dc.publisher: BioMed Central
dc.rights: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
dc.source: BMC Bioinformatics
dc.source.uri: http://www.biomedcentral.com/content/pdf/1471-2105-9-550.pdf (en_US)
dc.source.uri: http://www.biomedcentral.com/1471-2105/9/550 (en_US)
dc.subject: Keywords: Eigen decomposition; Evolutionary models; Exponentiation algorithms; Intermediate results; Protein-coding genes; Scaling and squaring; Sequence divergences; Taylor series expansions; Errors; Evolutionary algorithms; Genes; Mammals; Markov processes; Nucle
dc.title: Pathological rate matrices: from primates to pathogens
dc.type: Journal article
dcterms.dateAccepted: 2008-12-19 (en_US)
local.bibliographicCitation.issue: 1
local.bibliographicCitation.lastpage: 10
local.bibliographicCitation.startpage: 1
local.contributor.affiliation: Schranz, Harold, John Curtin School of Medical Research, Division of Immunology and Genetics (en_US)
local.contributor.affiliation: Easteal, Simon, John Curtin School of Medical Research, Division of Molecular Medicine (en_US)
local.contributor.affiliation: Huttley, Gavin Austin, John Curtin School of Medical Research, Division of Molecular Bioscience (en_US)
local.contributor.affiliation: Yap, Von Bing, National University of Singapore (en_US)
local.contributor.affiliation: Knight, Rob, University of Colorado (en_US)
local.contributor.authoruid: u4347201 (en_US)
local.contributor.authoruid: u8200596 (en_US)
local.contributor.authoruid: u9800703 (en_US)
local.contributor.authoruid: E32799 (en_US)
local.contributor.authoruid: E29368 (en_US)
local.identifier.absfor: 069999 - Biological Sciences not elsewhere classified
local.identifier.ariespublication: u4020362xPUB130 (en_US)
local.identifier.citationvolume: 9
local.identifier.doi: 10.1186/1471-2105-9-550
local.identifier.scopusID: 2-s2.0-60649085995
local.identifier.thomsonID: 000263974000002
local.type.status: Published Version (en_US)

Downloads

Original bundle

Name: Schranz_Pathological2008.pdf
Size: 326.39 KB
Format: Adobe Portable Document Format