Using Response Surface Methodology To Enhance Limited Data Sets for Machine Learning: A Polyvinyl Butyral Synthesis-Based Chemistry Undergraduate Teaching Case
Yingying Liu, Liang Gao*, Zican Yang, Huatang Zhang*, Jizhong Chen*, Jiayuan Xu, and Kai Yin
Journal of Chemical Education, 2025, 102 (10), 4424–4434. Published 2025-09-11.
DOI: 10.1021/acs.jchemed.5c00505 (https://pubs.acs.org/doi/10.1021/acs.jchemed.5c00505)
This paper presents a novel Course-Based Undergraduate Research Experience (CURE) that addresses a critical challenge in chemical education: teaching students to generate appropriate data for machine learning (ML) applications rather than relying on precurated data sets. Through polyvinyl butyral (PVB) synthesis experiments, 22 third-year chemistry students learned to integrate Response Surface Methodology (RSM) with Support Vector Regression (SVR) algorithms to overcome the inherent data scarcity challenge in undergraduate laboratories. The course guided students through a progressive four-module framework: (1) designing experiments using Central Composite Design principles, (2) conducting PVB synthesis with varied parameters and characterizing products, (3) developing RSM models to establish mathematical relationships between synthesis parameters and material properties, and (4) using these models to generate augmented data sets for ML training. Exit survey results indicate that students self-report significant improvements in their technical competencies, particularly in experimental design (45%, n = 10), machine learning fundamentals (61%, n = 13), and data processing skills (55%, n = 12). Student-developed hybrid RSM-ML models achieved strong predictive performance (R² values range from 0.78 to 0.95 for testing data sets), significantly outperforming either methodology used independently. This integrated approach prepares students for modern research environments by developing both experimental and computational competencies. The framework is adaptable to other chemical systems.
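The design, modeling, and augmentation modules described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' course code: the two coded factors, the noise-free `toy_response` function, the coefficient values, and all function names are hypothetical stand-ins for real PVB synthesis parameters and measured properties. It builds a rotatable central composite design, fits a full second-order RSM model by least squares, and samples that model to produce an augmented data set.

```python
import itertools
import random


def central_composite_design(k, alpha=None, n_center=3):
    """Coded CCD points: 2^k corners, 2k axial points, n_center center points."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25  # axial distance for a rotatable design
    corners = [list(p) for p in itertools.product([-1.0, 1.0], repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)
    centers = [[0.0] * k for _ in range(n_center)]
    return corners + axial + centers


def quadratic_terms(x):
    """Second-order model row: 1, x_i, x_i^2, then x_i * x_j for i < j."""
    k = len(x)
    row = [1.0] + list(x) + [v * v for v in x]
    row += [x[i] * x[j] for i in range(k) for j in range(i + 1, k)]
    return row


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_rsm(X, y):
    """Least-squares fit of the quadratic model via the normal equations."""
    rows = [quadratic_terms(x) for x in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(XtX, Xty)


def predict_rsm(coef, x):
    return sum(c * t for c, t in zip(coef, quadratic_terms(x)))


def augment(coef, k, n_points, bound=1.0, seed=0):
    """Sample the fitted surface inside the coded region [-bound, bound]^k."""
    rng = random.Random(seed)
    X = [[rng.uniform(-bound, bound) for _ in range(k)] for _ in range(n_points)]
    return X, [predict_rsm(coef, x) for x in X]


def toy_response(x):
    # Hypothetical noise-free response standing in for a measured property.
    return (50.0 + 4.0 * x[0] - 3.0 * x[1]
            - 2.0 * x[0] ** 2 - 1.5 * x[1] ** 2 + 1.0 * x[0] * x[1])


design = central_composite_design(k=2)           # module 1: plan the runs
measured = [toy_response(x) for x in design]     # module 2: stand-in for lab data
coef = fit_rsm(design, measured)                 # module 3: RSM model
X_aug, y_aug = augment(coef, k=2, n_points=200)  # module 4: augmented data set
```

The augmented pairs `(X_aug, y_aug)` could then be used to train a regressor such as scikit-learn's `sklearn.svm.SVR`; that step is left out here to keep the sketch dependency-free. Because every augmented point comes from the fitted surface, the downstream model can only be as accurate as the RSM fit, so evaluating it against held-out experimental runs rather than augmented points is a reasonable precaution.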
About the journal:
The Journal of Chemical Education is the official journal of the Division of Chemical Education of the American Chemical Society, co-published with the American Chemical Society Publications Division. Launched in 1924, the Journal of Chemical Education is the world’s premier chemical education journal. The Journal publishes peer-reviewed articles and related information as a resource to those in the field of chemical education and to those institutions that serve them. JCE typically addresses chemical content, activities, laboratory experiments, instructional methods, and pedagogies. The Journal serves as a means of communication among people across the world who are interested in the teaching and learning of chemistry. This includes instructors of chemistry from middle school through graduate school, professional staff who support these teaching activities, as well as some scientists in commerce, industry, and government.