A case study of ChatGPT-assisted building of a microbiome-based machine learning model for biologists
Huan Yang, David Xie, Ping Wei, Jinzhan Ge, Yudong Li
Journal of Microbiology & Biology Education, e0008225 (Journal Article). Published 2025-09-11.
DOI: 10.1128/jmbe.00082-25 (https://doi.org/10.1128/jmbe.00082-25)
Citations: 0
Abstract
Machine learning is a widespread technology that is shaping how biologists interact with data. However, teaching machine learning to biology students, who often lack a strong programming background, poses many practical challenges. To address these challenges, we present an educational study that uses publicly available salivary microbiome data sets to develop a machine learning model in Python. With the assistance of ChatGPT, most students successfully built a simple random forest model. Evaluation metrics, such as accuracy and area under the curve (AUC), indicated that the model performed well overall and accurately predicted oral malodor. This work establishes a pedagogical framework for integrating machine learning into biology curricula, bridging the gap between data science and life science education.
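For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy and area under the ROC curve can be computed for a binary classifier. It is not the authors' code: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the scores stand in for the class probabilities a model such as a random forest might produce.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Uses the pairwise formulation: the probability that a randomly chosen
    positive sample receives a higher score than a randomly chosen negative
    sample, with ties counted as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = malodor, 0 = healthy (not taken from the study).
y_true = [1, 1, 1, 0, 0, 0]
# Hypothetical predicted probabilities of the positive class.
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
# Assumed decision threshold of 0.5 to turn probabilities into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUC      = {roc_auc(y_true, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes how well the scores rank positives above negatives across all thresholds, which is one reason the two are often reported together.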