Abstract
Background: Parsimony methods are widely used in molecular evolution to estimate the most plausible phylogeny for a set of characters. Sankoff parsimony determines the minimum number of changes required in a given phylogeny when a cost is associated to transitions between character states. Although optimizations exist to reduce the computations in the number of taxa, the original algorithm takes time O(n2) in the number of states, making it impractical for large values of n. Results: In this study we introduce an optimization of Sankoff parsimony for the reconstruction of ancestral states when ultrametric or additive cost matrices are used. We analyzed its performance for randomly generated matrices, Jukes-Cantor and Kimura's two-parameter models of DNA evolution, and in the reconstruction of elongation factor-1α and ancestral metabolic states of a group of eukaryotes, showing that in all cases the execution time is significantly less than with the original implementation. Conclusion: The algorithms here presented provide a fast computation of Sankoff parsimony for a given phylogeny. Problems where the number of states is large, such as reconstruction of ancestral metabolism, are particularly adequate for this optimization. Since we are reducing the computations required to calculate the parsimony cost of a single tree, our method can be combined with optimizations in the number of taxa that aim at finding the most parsimonious tree.
Original language | English (US) |
---|---|
Article number | 51 |
Journal | BMC BIOINFORMATICS |
Volume | 10 |
DOIs | |
State | Published - Feb 7 2009 |
Externally published | Yes |
Bibliographical note
Funding Information:The authors would like to thank Mercè Llabrés for her comments on an early version of this work, and two anonymous reviewers for critically reading the manuscript and providing valuable suggestions. JC was supported by Grant-in-Aid for JSPS Fellows from the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan, No. 20·08086. GV was partially supported by the Spanish DGI project MTM2006-07773 COMGRIO. KI and TG were partially supported by a grant of the Genome Network Project from MEXT, Japan.
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Applied Mathematics