Maxwell Sanderford, Sudip Sharma, Glen Stecher, Michael Suleski, Jun Liu, Jieping Ye, Sudhir Kumar
MyESL: A Software for Evolutionary Sparse Learning in Molecular Phylogenetics and Genomics
Molecular Biology and Evolution, published 2025-10-01
DOI: 10.1093/molbev/msaf224 (https://doi.org/10.1093/molbev/msaf224)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12498521/pdf/
Abstract
Evolutionary sparse learning (ESL) uses supervised machine learning to build evolutionary models in which genomic sites and loci are parameters. It uses the Least Absolute Shrinkage and Selection Operator (LASSO) with bi-level sparsity to connect a specific phylogenetic hypothesis with sequence variation across genomic loci. The MyESL software addresses the need for open-source tools to perform ESL analyses, offering features to preprocess input phylogenomic alignments, post-process output models to generate molecular evolutionary metrics, and make LASSO regression adaptable and efficient for phylogenetic trees and alignments. The core of MyESL, which constructs models with logistic regression using bi-level sparsity, is written in C++. Its input preprocessing and result post-processing tools are developed in Python. Compared to other tools, MyESL is more computationally efficient and provides evolution-friendly inputs and outputs. These features have already enabled the use of MyESL in two phylogenomic applications: one to identify outlier sequences and fragile clades in inferred phylogenies, and another to build genetic models of convergent traits. In addition to use in a Python environment, MyESL is available as a standalone executable compatible with multiple platforms, which can be integrated directly into scripts and third-party software. The source code, executable, and documentation for MyESL are openly accessible at https://github.com/kumarlabgit/MyESL.
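To make the idea of logistic regression with bi-level sparsity concrete, the following is a minimal NumPy sketch of a sparse-group LASSO fitted by proximal gradient descent. It is not MyESL's code or interface: the function names (`encode_genes`, `sgl_prox`, `fit_sparse_group_logistic`), the toy alignments, the penalty values, and the per-gene summary are all assumptions made for illustration. Taxa inside the clade of interest are labeled +1 and all others -1, each (site, residue) pair becomes a binary feature, and features are grouped by gene so that one penalty selects sites and the other selects genes.

```python
import numpy as np

def encode_genes(genes):
    """One-hot encode variable sites; return the feature matrix and each feature's gene label."""
    columns, labels = [], []
    for gene, seqs in genes.items():
        for site in range(len(seqs[0])):
            residues = sorted({s[site] for s in seqs} - {"-"})
            if len(residues) < 2:
                continue  # invariant sites cannot discriminate between taxa
            for r in residues:
                columns.append([1.0 if s[site] == r else 0.0 for s in seqs])
                labels.append(gene)
    return np.array(columns).T, np.array(labels)

def sgl_prox(w, groups, t_site, t_gene):
    """Proximal operator of t_site * ||w||_1 + t_gene * sum_g ||w_g||_2."""
    w = np.sign(w) * np.maximum(np.abs(w) - t_site, 0.0)  # site-level shrinkage
    out = np.zeros_like(w)
    for g in np.unique(groups):
        mask = groups == g
        norm = np.linalg.norm(w[mask])
        if norm > t_gene:  # otherwise the whole gene is dropped
            out[mask] = w[mask] * (1.0 - t_gene / norm)
    return out

def fit_sparse_group_logistic(X, y, groups, lam_site, lam_gene, n_iter=500):
    """Proximal gradient descent on mean logistic loss plus the two penalties."""
    n, p = X.shape
    Xa = np.hstack([X, np.ones((n, 1))])  # include the intercept column
    step = 1.0 / (0.25 * np.linalg.norm(Xa, 2) ** 2 / n)  # 1 / Lipschitz bound
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        margin = y * (X @ w + b)
        # gradient of the loss w.r.t. the linear predictor: -y * sigmoid(-margin) / n
        g = -y * 0.5 * (1.0 - np.tanh(margin / 2.0)) / n
        w = sgl_prox(w - step * (X.T @ g), groups, step * lam_site, step * lam_gene)
        b -= step * g.sum()  # the intercept is not penalized
    return w, b

# Toy data (hypothetical): six taxa, the first three form the clade of interest.
genes = {
    "geneA": ["AAG", "AAG", "AAG", "ACG", "ACG", "ACG"],  # site 2 tracks the clade
    "geneB": ["TGA", "TCA", "TGA", "TGA", "TCA", "TGA"],  # variation unrelated to the clade
}
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

X, groups = encode_genes(genes)
w, b = fit_sparse_group_logistic(X, y, groups, lam_site=0.05, lam_gene=0.05)

# A crude per-gene summary: total absolute weight of the gene's features.
gene_scores = {str(g): float(np.abs(w[groups == g]).sum()) for g in np.unique(groups)}
print(gene_scores)
```

In this toy case the gene whose variation is unrelated to the clade receives zero weight, while the gene carrying the clade-specific substitution keeps nonzero site weights. Real analyses would involve many more taxa, genes, and sites, and the choice of penalty values matters considerably.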
About the journal:
Molecular Biology and Evolution
Journal Overview:
Publishes research at the interface of molecular (including genomics) and evolutionary biology
Considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic
Interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research
Publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.