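Using Response Surface Methodology To Enhance Limited Data Sets for Machine Learning: A Polyvinyl Butyral Synthesis-Based Chemistry Undergraduate Teaching Case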

IF 2.9 | Zone 3, Education | JCR Q2, Chemistry, Multidisciplinary
Yingying Liu, Liang Gao*, Zican Yang, Huatang Zhang*, Jizhong Chen*, Jiayuan Xu, and Kai Yin
Journal of Chemical Education, 102 (10), 4424–4434. Published 2025-09-11. DOI: 10.1021/acs.jchemed.5c00505. https://pubs.acs.org/doi/10.1021/acs.jchemed.5c00505
Citations: 0

Using Response Surface Methodology To Enhance Limited Data Sets for Machine Learning: A Polyvinyl Butyral Synthesis-Based Chemistry Undergraduate Teaching Case


This paper presents a novel Course-Based Undergraduate Research Experience (CURE) that addresses a critical challenge in chemical education: teaching students to generate appropriate data for machine learning (ML) applications rather than relying on precurated data sets. Through polyvinyl butyral (PVB) synthesis experiments, 22 third-year chemistry students learned to integrate Response Surface Methodology (RSM) with Support Vector Regression (SVR) algorithms to overcome the inherent data scarcity challenge in undergraduate laboratories. The course guided students through a progressive four-module framework: (1) designing experiments using Central Composite Design principles, (2) conducting PVB synthesis with varied parameters and characterizing products, (3) developing RSM models to establish mathematical relationships between synthesis parameters and material properties, and (4) using these models to generate augmented data sets for ML training. Exit survey results indicate that students self-report significant improvements in their technical competencies, particularly in experimental design (45%, n = 10), machine learning fundamentals (61%, n = 13), and data processing skills (55%, n = 12). Student-developed hybrid RSM-ML models achieved strong predictive performance (R² values range from 0.78 to 0.95 for testing data sets), significantly outperforming either methodology used independently. This integrated approach prepares students for modern research environments by developing both experimental and computational competencies. The framework is adaptable to other chemical systems.
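
The design, modeling, and augmentation modules described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' code. The abstract does not state the number of factors, the responses, or any measured values, so everything specific here is an assumption: two coded factors, a rotatable Central Composite Design with three center points, a second-order RSM fitted by ordinary least squares, and a made-up quadratic (`TRUE_BETA`) standing in for laboratory measurements.

```python
import itertools
import random


def central_composite_design(k, n_center=3):
    """Coded design points for a rotatable CCD with k factors."""
    alpha = (2 ** k) ** 0.25  # axial distance from the rotatability condition
    factorial = [list(p) for p in itertools.product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for level in (-alpha, alpha):
            point = [0.0] * k
            point[i] = level
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


def quadratic_terms(x):
    """Expand x into [1, x_i, x_i*x_j (i<j), x_i^2] for a second-order model."""
    k = len(x)
    cross = [x[i] * x[j] for i in range(k) for j in range(i + 1, k)]
    squares = [v * v for v in x]
    return [1.0] + list(x) + cross + squares


def solve_linear(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef


def fit_rsm(points, responses):
    """Least-squares RSM coefficients via the normal equations."""
    X = [quadratic_terms(p) for p in points]
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(X, responses)) for i in range(p)]
    return solve_linear(xtx, xty)


def predict(beta, x):
    """Evaluate the fitted second-order model at coded point x."""
    return sum(b * t for b, t in zip(beta, quadratic_terms(x)))


def augment(beta, k, n_samples, bound=1.0, seed=0):
    """Sample coded points inside [-bound, bound]^k and label them with the RSM."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        x = [rng.uniform(-bound, bound) for _ in range(k)]
        rows.append((x, predict(beta, x)))
    return rows


# Hypothetical stand-in for lab measurements: NOT data from the paper.
TRUE_BETA = [70.0, 3.0, -2.0, 1.5, -4.0, -2.5]
design = central_composite_design(k=2)          # module 1: plan the runs
measured = [predict(TRUE_BETA, p) for p in design]  # module 2 placeholder
beta = fit_rsm(design, measured)                # module 3: fit the RSM
augmented = augment(beta, k=2, n_samples=200)   # module 4: augmented data set
print(len(design), len(augmented))
```

The augmented rows are sampled only inside the factorial cube, so the RSM is interpolating rather than extrapolating beyond the designed region. In a full workflow these rows would then be used to train a regressor such as scikit-learn's `sklearn.svm.SVR`, which is left out here to keep the sketch standard-library only.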

Source journal: Journal of Chemical Education (Chemistry, Multidisciplinary)
CiteScore: 5.60
Self-citation rate: 50.00%
Articles published: 465
Review time: 6.5 months
About the journal: The Journal of Chemical Education is the official journal of the Division of Chemical Education of the American Chemical Society, co-published with the American Chemical Society Publications Division. Launched in 1924, the Journal of Chemical Education is the world’s premier chemical education journal. The Journal publishes peer-reviewed articles and related information as a resource to those in the field of chemical education and to those institutions that serve them. JCE typically addresses chemical content, activities, laboratory experiments, instructional methods, and pedagogies. The Journal serves as a means of communication among people across the world who are interested in the teaching and learning of chemistry. This includes instructors of chemistry from middle school through graduate school, professional staff who support these teaching activities, as well as some scientists in commerce, industry, and government.