Phylogenetic inference is widely used in evolutionary biology. Its aim is to find the evolutionary relationships among species and to report them in the form of a phylogenetic tree (phylogeny). Several statistical methods exist for phylogenetic inference. This review presents the method of maximum likelihood for phylogenetic reconstruction, which consists of calculating the likelihood of multiple candidate phylogenies and reporting the one with the highest likelihood as representative of the evolutionary relationships of a group of species. The review explains how the likelihood of a phylogeny is calculated from DNA sequences of multiple species. It also presents key DNA mutation models used to calculate transition probabilities between nucleotides, which in turn enter the likelihood calculation of a given phylogeny. A simple example illustrates the steps needed to infer a phylogeny, and the most common software for maximum likelihood inference on larger DNA alignments is described.
Summary
Phylogenetic inference is widely used in evolutionary biology, where its goal is to find the evolutionary relationships among different species and to represent them in the form of a phylogenetic tree (or phylogeny). Several statistical methods exist for phylogenetic inference. This review presents maximum likelihood as a model for phylogenetic reconstruction, a method that consists of calculating the likelihood of multiple candidate phylogenies and reporting the one with the maximum value as the representative phylogeny of a group of organisms. The review explains how the likelihood of a phylogeny is calculated from DNA sequences from several species. It also presents DNA mutation models for calculating transition probabilities between nucleotides, which are used in estimating the likelihood. A simple illustrative example applies the steps needed to infer a phylogeny, and the most widely used software for maximum likelihood inference on larger DNA alignments is explained.
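The procedure summarized above (transition probabilities from a mutation model, the likelihood of each candidate phylogeny, and selection of the maximum) can be sketched in a few lines of Python. This is a minimal illustration and not the review's own worked example. It assumes the Jukes-Cantor (JC69) model with equal base frequencies, a hypothetical three-taxon alignment, and branch lengths fixed at an arbitrary 0.1 substitutions per site. Full maximum likelihood software also optimizes branch lengths and model parameters, which is omitted here. All function names are illustrative.

```python
import math

BASES = "ACGT"

def jc69(i, j, d):
    """JC69 probability that base i is replaced by base j along a branch
    of length d (expected substitutions per site)."""
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def partial(node, site):
    """Conditional likelihoods at `node` for one alignment column.
    A leaf is a taxon name; an internal node is a tuple of
    (child, branch_length) pairs."""
    if isinstance(node, str):
        return [1.0 if b == site[node] else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, d in node:
        below = partial(child, site)
        for x, parent in enumerate(BASES):
            result[x] *= sum(jc69(parent, c, d) * below[y]
                             for y, c in enumerate(BASES))
    return result

def log_likelihood(tree, alignment):
    """Sum of per-site log-likelihoods, assuming independent sites and
    equal base frequencies (1/4) at the root."""
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for k in range(n_sites):
        site = {taxon: seq[k] for taxon, seq in alignment.items()}
        total += math.log(sum(0.25 * p for p in partial(tree, site)))
    return total

def rooted_triplet(x, y, z, d=0.1):
    """Tree ((x,y),z) with every branch of length d (an arbitrary choice)."""
    return (((x, d), (y, d)), d), (z, d)

# Hypothetical alignment, invented for illustration only.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACTTGCGAAT",
}

candidates = {
    "((A,B),C)": rooted_triplet("A", "B", "C"),
    "((A,C),B)": rooted_triplet("A", "C", "B"),
    "((B,C),A)": rooted_triplet("B", "C", "A"),
}

scores = {name: log_likelihood(tree, alignment)
          for name, tree in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores[best])
```

The recursion in `partial` follows the pruning idea of computing, at each internal node, the probability of the data below it for every possible base at that node. With three taxa the candidate set can be enumerated exhaustively. For larger alignments the number of topologies grows too quickly for that, which is why dedicated software relies on heuristic tree searches.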