Using ChatGPT as a tool for training nonprogrammers to generate genomic sequence analysis code.

Impact factor 1.2 | Zone 4 (Education) | JCR Q4 (Biochemistry & Molecular Biology)
Haley A Delcher, Enas S Alsatari, Adeyeye I Haastrup, Sayema Naaz, Lydia A Hayes-Guastella, Autumn M McDaniel, Olivia G Clark, Devin M Katerski, Francois O Prinsloo, Olivia R Roberts, Meredith A Shaddix, Bridgette N Sullivan, Isabella M Swan, Emily M Hartsell, Jeffrey D DeMeis, Sunita S Paudel, Glen M Borchert
DOI: 10.1002/bmb.21899 (https://doi.org/10.1002/bmb.21899)
Journal: Biochemistry and Molecular Biology Education
Publication date: 2025-05-05
Publication type: Journal Article
Citations: 0

Abstract

Using ChatGPT as a tool for training nonprogrammers to generate genomic sequence analysis code.

Given the size of many genomes and the growing size of sequencing files, independently analyzing sequencing data is largely out of reach for a biologist with little or no programming expertise. Biologists therefore typically face a dilemma: spend significant time and effort learning to program, or find (and rely on) an available computer scientist to analyze large sequence data sets. AI-powered programs like ChatGPT may offer a way around this disconnect between biologists and the analysis of genomic data that is critically important to their field. The work detailed herein demonstrates how integrating ChatGPT into an existing Course-based Undergraduate Research Experience curriculum can equip biology students with no programming expertise to generate their own programs and carry out a publishable, comprehensive analysis of real-world Next Generation Sequencing (NGS) datasets. Relying solely on the students' biology background as a prompt for directing ChatGPT to generate Python code, we found students could readily generate programs able to handle and analyze NGS datasets larger than 10 gigabytes. In summary, we believe that integrating ChatGPT into education can help bridge a critical gap between biology and computer science and may prove similarly beneficial in other disciplines. Additionally, ChatGPT can provide biological researchers with powerful new tools for NGS dataset analysis, helping to accelerate major advances in the field.
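To make the idea of a Python program that copes with a multi-gigabyte NGS file more concrete, here is a minimal sketch. It is not code from the study. It assumes the data are in standard four-line FASTQ records (plain or gzip-compressed), and the function names `iter_fastq`, `summarize`, and `summarize_fastq` are hypothetical. The point it illustrates is streaming: reads are processed one at a time, so memory use stays small regardless of file size.

```python
import gzip
from typing import Dict, Iterable, Iterator, Tuple


def iter_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records.

    Assumes each record is exactly four lines (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # skip blank lines, e.g. at the end of the file
        seq = next(it, "").strip()
        next(it, "")  # the '+' separator line is not needed
        qual = next(it, "").strip()
        yield header[1:], seq, qual  # drop the leading '@' from the ID


def summarize(records: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Count reads and bases and compute the overall GC fraction."""
    reads = bases = gc = 0
    for _read_id, seq, _qual in records:
        reads += 1
        bases += len(seq)
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
    gc_fraction = gc / bases if bases else 0.0
    return {"reads": reads, "bases": bases, "gc_fraction": gc_fraction}


def summarize_fastq(path: str) -> Dict[str, float]:
    """Stream a FASTQ file from disk without loading it into memory."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return summarize(iter_fastq(handle))
```

A call such as `summarize_fastq("sample.fastq.gz")` (the file name is a placeholder) would return a dictionary with the read count, total bases, and GC fraction. Because the file handle is consumed lazily, the same code should work for a 10 MB test file and a file larger than 10 gigabytes, only the running time changes.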

Source journal
Biochemistry and Molecular Biology Education (Biology: Biochemistry & Molecular Biology)
CiteScore: 2.60
Self-citation rate: 14.30%
Articles published: 99
Review time: 6-12 weeks
About the journal: The aim of BAMBED is to enhance teacher preparation and student learning in Biochemistry, Molecular Biology, and related sciences such as Biophysics and Cell Biology, by promoting the world-wide dissemination of educational materials. BAMBED seeks and communicates articles on many topics, including: Innovative techniques in teaching and learning. New pedagogical approaches. Research in biochemistry and molecular biology education. Reviews on emerging areas of Biochemistry and Molecular Biology to provide background for the preparation of lectures, seminars, student presentations, dissertations, etc. Historical Reviews describing "Paths to Discovery". Novel and proven laboratory experiments that have both skill-building and discovery-based characteristics. Reviews of relevant textbooks, software, and websites. Descriptions of software for educational use. Descriptions of multimedia materials such as tutorials on various aspects of biochemistry and molecular biology.