Dong, Yanghe
2024-10-27
2024-10-27
https://hdl.handle.net/1885/733721968
Deposited by the author 27.10.24
Phylogenetic inference, which reconstructs evolutionary trees from DNA or amino acid sequences, is crucial for understanding the evolutionary histories of species on Earth. Model selection is a fundamental step in this process, determining the best-fit model for the data. However, classic maximum likelihood-based methods for model selection are computationally intensive. This study introduces a machine learning-based framework for amino acid model selection, consisting of three components: protFinder for selecting the best-fit substitution model, RHASFinder for identifying the appropriate rate heterogeneity model, and protFFinder for determining the use of empirical pre-estimated frequencies. Our framework is an order of magnitude faster than the widely used ModelFinder, while maintaining comparable accuracy.
en
phylogenetics; amino acid; model selection; rate heterogeneity; neural network
Phylogenetic Model Selection via Machine Learning
2024
10.25911/G27V-Q356