Timothy M. Ghaly, Vaheesan Rajabal, Dylan Russell, Elena Colombi, Sasha G. Tetu
Environmental Microbiology, 27(9). Published 2025-09-16. DOI: 10.1111/1462-2920.70178. https://enviromicro-journals.onlinelibrary.wiley.com/doi/10.1111/1462-2920.70178
EcoFoldDB: Protein Structure-Guided Functional Profiling of Ecologically Relevant Microbial Traits at the Metagenome Scale
Microbial communities are fundamental to planetary health and ecosystem processes. High-throughput metagenomic sequencing has provided unprecedented insights into the structure and function of these communities. However, functionally profiling metagenomes remains constrained due to the limited sensitivity of existing sequence homology-based methods to annotate evolutionarily divergent genes. Protein structure, more conserved than sequence and intrinsically tied to molecular function, offers a solution. Capitalising on recent breakthroughs in structural bioinformatics, we present EcoFoldDB, a database of protein structures curated for ecologically relevant microbial traits, and its companion pipeline, EcoFoldDB-annotate, which leverages Foldseek with the ProstT5 protein language model for rapid structural homology searching directly from sequence data. EcoFoldDB-annotate outperforms state-of-the-art sequence-based methods in annotating metagenomic proteins, in terms of sensitivity and precision. To demonstrate its utility and scalability, we performed structure-guided functional profiling of 32 million proteins encoded by 8000 high-quality metagenome-assembled genomes from the global soil microbiome. EcoFoldDB-annotate could resolve the phylogenetic partitioning of important nitrogen cycling pathways, from taxonomically restricted nitrifiers to more widespread denitrifiers, as well as identifying novel, uncultivated bacterial taxa enriched in plant growth-promoting traits. We anticipate that EcoFoldDB will enable researchers to extract ecological insights from environmental genomes and metagenomes and accelerate discoveries in microbial ecology.
About the journal:
Environmental Microbiology provides a high-profile vehicle for publication of the most innovative, original and rigorous research in the field. The scope of the Journal encompasses the diversity of current research on microbial processes in the environment, microbial communities, interactions and evolution and includes, but is not limited to, the following:
the structure, activities and communal behaviour of microbial communities
microbial community genetics and evolutionary processes
microbial symbioses, microbial interactions and interactions with plants, animals and abiotic factors
microbes in the tree of life, microbial diversification and evolution
population biology and clonal structure
microbial metabolic and structural diversity
microbial physiology, growth and survival
microbes and surfaces, adhesion and biofouling
responses to environmental signals and stress factors
modelling and theory development
pollution microbiology
extremophiles and life in extreme and unusual little-explored habitats
element cycles and biogeochemical processes, primary and secondary production
microbes in a changing world, microbially-influenced global changes
evolution and diversity of archaeal and bacterial viruses
new technological developments in microbial ecology and evolution, in particular for the study of activities of microbial communities, non-culturable microorganisms and emerging pathogens