Fotis A Baltoumas, Evangelos Karatzas, David Paez-Espino, Nefeli K Venetsianou, Eleni Aplakidou, Anastasis Oulas, Robert D Finn, Sergey Ovchinnikov, Evangelos Pafilis, Nikos C Kyrpides, Georgios A Pavlopoulos
Frontiers in Bioinformatics, volume 3, article 1157956. Published 2023-03-03. DOI: 10.3389/fbinf.2023.1157956 (https://doi.org/10.3389/fbinf.2023.1157956)
Exploring microbial functional biodiversity at the protein family level: from metagenomic sequence reads to annotated protein clusters
Metagenomics has opened access to the genetic repertoire of natural microbial communities, and shotgun metagenome sequencing has become the method of choice for studying and classifying microorganisms from various environments. Several methods have been developed to process and analyze the sequence data, from raw reads to end products such as predicted protein sequences or families. In this article, we provide a thorough review intended to simplify these processes and discuss the alternative methodologies that can be followed to explore biodiversity at the protein family level. We describe the analysis tools and comment on their scalability, advantages and disadvantages. Finally, we report the available data repositories and recommend approaches for protein family annotation related to phylogenetic distribution, structure prediction and metadata enrichment.