Identification of Key Features Pivotal to the Characteristics and Functions of Gut Bacteria Taxa through Machine Learning Methods
ZhanDong Li, QingLan Ma, Hao Li, Lin Lu, Lei Chen, Wei Guo, KaiYan Feng, Tao Huang, Yu-Dong Cai
Current Gene Therapy (Journal Article), published 2025-07-15. DOI: 10.2174/0115665232367064250630202337 (https://doi.org/10.2174/0115665232367064250630202337)
Impact factor: 3.3. JCR: Q2 (Genetics & Heredity).
Citations: 0
Abstract
Background: Gut bacteria critically influence digestion, facilitate the breakdown of complex food substances, aid in essential nutrient synthesis, and contribute to immune system balance. However, current knowledge regarding intestinal bacteria remains insufficient.
Objective: This study aims to identify the essential differences among intestinal bacterial taxa.
Methods: We analyzed 1478 gut bacterial samples (235 Actinobacteria, 447 Bacteroidetes, and 796 Firmicutes) from the dataset provided by Chen et al. using machine learning algorithms. Each sample was described by 993 features associated with gut bacteria, including 342 features annotated by the Antibiotic Resistance Genes Database, Comprehensive Antibiotic Resistance Database, Kyoto Encyclopedia of Genes and Genomes, and Virulence Factors of Pathogenic Bacteria. We applied incremental feature selection within a computational framework to identify the optimal features for classification.
Results: Eleven feature ranking algorithms selected several key features as pivotal to the characteristics and functions of gut bacteria. These features appear to facilitate the identification of specific gut bacterial species. Additionally, we established quantitative rules for identifying Actinobacteria, Bacteroidetes, and Firmicutes.
Conclusion: This research underscores the significant potential of machine learning in studying gut microbes and enhances our understanding of the multifaceted roles of gut bacteria.
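Illustrative sketch (not the authors' code): the incremental feature selection step named in the Methods can be pictured as ranking the features once, then evaluating a classifier on the top 1, 2, ..., k features and keeping the best-scoring prefix. The abstract does not specify the ranking algorithms, the classifier, or the evaluation protocol, so everything below is an assumption made for illustration: a variance-ratio ranker stands in for the eleven ranking algorithms, a nearest-centroid classifier stands in for the study's models, a single held-out split replaces cross-validation, and the data are synthetic.

```python
import random
from statistics import mean

def rank_features(X, y):
    """Rank features by a between-class / within-class variance ratio.
    Hypothetical stand-in for the paper's (unspecified) ranking algorithms."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        overall = mean(row[j] for row in X)
        between = within = 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            mu = mean(vals)
            between += len(vals) * (mu - overall) ** 2
            within += sum((v - mu) ** 2 for v in vals)
        scores.append(between / (within + 1e-12))
    # Highest score first.
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

def centroid_accuracy(X_train, y_train, X_test, y_test, feats):
    """Accuracy of a nearest-centroid classifier restricted to `feats`."""
    classes = sorted(set(y_train))
    centroids = {}
    for c in classes:
        rows = [r for r, label in zip(X_train, y_train) if label == c]
        centroids[c] = [mean(r[j] for r in rows) for j in feats]
    correct = 0
    for r, label in zip(X_test, y_test):
        pred = min(classes, key=lambda c: sum(
            (r[j] - m) ** 2 for j, m in zip(feats, centroids[c])))
        correct += pred == label
    return correct / len(y_test)

def incremental_feature_selection(X_train, y_train, X_test, y_test, step=1):
    """Evaluate growing prefixes of the ranked feature list; keep the best."""
    ranking = rank_features(X_train, y_train)
    best_k, best_acc = 0, -1.0
    for k in range(step, len(ranking) + 1, step):
        acc = centroid_accuracy(X_train, y_train, X_test, y_test, ranking[:k])
        if acc > best_acc:  # strict '>' prefers the smallest best subset
            best_k, best_acc = k, acc
    return ranking[:best_k], best_acc

# Synthetic three-class data (NOT the study's dataset): features 0 and 1
# carry class signal, features 2-5 are pure noise.
rng = random.Random(0)

def make_sample(label):
    informative = [rng.gauss(5.0 * label, 1.0), rng.gauss(-5.0 * label, 1.0)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
    return informative + noise

data = [(make_sample(c), c) for c in range(3) for _ in range(60)]
rng.shuffle(data)
train, test = data[:120], data[120:]
X_tr, y_tr = [x for x, _ in train], [c for _, c in train]
X_te, y_te = [x for x, _ in test], [c for _, c in test]

selected, acc = incremental_feature_selection(X_tr, y_tr, X_te, y_te)
print("selected feature indices:", selected)
print("held-out accuracy:", acc)
```

In a real analysis the ranking and evaluation would be repeated inside cross-validation, so that the feature subset is not tuned on the same samples used to report performance.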
About the journal:
Current Gene Therapy is a bi-monthly peer-reviewed journal aimed at academic and industrial scientists with an interest in major topics concerning basic research and clinical applications of gene and cell therapy of diseases. Cell therapy manuscripts can also include application in diseases when cells have been genetically modified. Current Gene Therapy publishes full-length/mini reviews and original research on the latest developments in gene transfer and gene expression analysis, vector development, cellular genetic engineering, animal models and human clinical applications of gene and cell therapy for the treatment of diseases.
Current Gene Therapy publishes reviews and original research containing experimental data on gene and cell therapy. The journal also includes manuscripts on technological advances and on ethical and regulatory considerations of gene and cell therapy. Reviews should provide the reader with a comprehensive assessment of any area of experimental biology applied to molecular medicine that is not only of significance within a particular field of gene therapy and cell therapy but also of interest to investigators in other fields. Authors are encouraged to provide their own assessment and vision for future advances. Reviews are also welcome on late-breaking discoveries on which substantial literature has not yet been amassed. Such reviews provide a forum for sharply focused topics of recent experimental investigations in gene therapy, primarily to make these results accessible to both clinical and basic researchers. Manuscripts containing experimental data should present original, previously unpublished data.