Using a decision tree to predict COVID case numbers: a tutorial for beginners

Lucy Moctezuma Tan, Lorena Benitez, Florentine van Nouhuijs, Faye Orcales, Allen Kim, Ross Campbell, Megumi Fuse, Pleuni S Pennings
DOI: 10.1101/2023.12.19.572463 (https://doi.org/10.1101/2023.12.19.572463)
Journal: bioRxiv - Scientific Communication and Education
Published: 2023-12-20

Abstract

Machine learning (ML) makes it possible to analyze large volumes of data and is an important tool in biomedical research. The use of ML methods can lead to improvements in diagnosis, treatment, and prevention of diseases. During the COVID pandemic, ML methods were used for predictions at the patient and community levels. Given the ubiquity of ML, it is important that future doctors, researchers and teachers get acquainted with ML and its contributions to research. Our goal is to make it easier for students and their professors to learn about ML. The learning module we present here is based on a small but relevant COVID dataset, videos, annotated code and the use of cloud computing platforms. The benefit of cloud computing platforms is that students do not have to set up a coding environment on their computer. This saves time and is also an important democratization factor, allowing students to use old or borrowed computers (e.g., from a library), tablets or Chromebooks. As a result, this will benefit colleges geared toward underserved populations with limited computing infrastructure. We developed a beginner-friendly module focused on learning the basics of decision trees by applying them to COVID tabular data. It introduces students to basic terminology used in supervised ML and its relevance to research. The module includes two Python notebooks with pre-written code, one with practice exercises and another with its solutions. Our experience with biology students at San Francisco State University suggests that the material increases interest in ML.
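
To make the idea of applying a decision tree to tabular data concrete, here is a minimal from-scratch sketch of a regression tree in plain Python. It is not the module's notebook code and does not use its dataset. The rows, the two feature meanings (vaccination rate and population density), and the function names `best_split`, `build_tree`, and `predict` are all hypothetical, chosen only for illustration. The sketch assumes the target is a number (cases per 100k), so each split is chosen to minimize the sum of squared errors.

```python
# Toy, made-up rows: (vaccination_rate_pct, population_density) -> cases per 100k.
# These values are invented for illustration and are NOT the module's dataset.
ROWS = [
    ((30.0, 1200.0), 950.0),
    ((35.0, 300.0), 800.0),
    ((40.0, 2500.0), 900.0),
    ((55.0, 400.0), 500.0),
    ((60.0, 1800.0), 450.0),
    ((65.0, 150.0), 300.0),
    ((70.0, 2200.0), 350.0),
    ((80.0, 600.0), 200.0),
]


def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared errors around the mean (the quantity each split reduces)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows):
    """Return (feature_index, threshold) with the lowest total SSE, or None."""
    best = None
    best_score = sse([y for _, y in rows])  # a split must beat "no split"
    n_features = len(rows[0][0])
    for f in range(n_features):
        values = sorted({x[f] for x, _ in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            score = sse(left) + sse(right)
            if score < best_score:
                best_score, best = score, (f, t)
    return best


def build_tree(rows, max_depth=2, min_samples=2):
    """Recursively grow a regression tree; each leaf stores the mean target."""
    can_split = max_depth > 0 and len(rows) >= min_samples
    split = best_split(rows) if can_split else None
    if split is None:
        return {"leaf": mean([y for _, y in rows])}
    f, t = split
    left = [r for r in rows if r[0][f] <= t]
    right = [r for r in rows if r[0][f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree(left, max_depth - 1, min_samples),
        "right": build_tree(right, max_depth - 1, min_samples),
    }


def predict(tree, x):
    """Follow the yes/no questions from the root down to a leaf."""
    while "leaf" not in tree:
        go_left = x[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]


tree = build_tree(ROWS)
print("root split: feature", tree["feature"], "at threshold", tree["threshold"])
print("prediction for (75.0, 500.0):", predict(tree, (75.0, 500.0)))
```

Each internal node asks one question about one column, and each leaf returns the average target of the training rows that reached it. The `max_depth` limit keeps the tree small enough to read by hand, at the cost of coarser predictions.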