Open-Source Visual Programming Software for Introducing Principal Component Analysis to the Analytical Curriculum
Tai-Sheng Yeh*
Journal of Chemical Education, volume 102, issue 4, pages 1428–1435. Published 2025-03-20.
DOI: 10.1021/acs.jchemed.4c00311
https://pubs.acs.org/doi/10.1021/acs.jchemed.4c00311
Impact factor: 2.5; JCR: Q2 (Chemistry, Multidisciplinary)
Citations: 0
Abstract
As analytical data grow more complex, scientists rely heavily on statistical and chemometric software. Powerful tools such as the open-source Python and R and the commercial MATLAB demand good coding skills, and writing original code can be challenging for students with no prior programming experience. Orange Data Mining is a Python-based visual programming package that has been used in many scientific publications. Principal component analysis (PCA) is one of the most common exploratory data analysis techniques, with applications in outlier detection, dimensionality reduction, graphical clustering, and classification. With a workflow built from widgets (the computational units within Orange), PCA can be carried out very quickly, and the same workflow can be reused for different types of analytical data without reprogramming. The application of Orange Data Mining to PCA exploratory analysis of sugar NIR spectral data from a portable NIR spectrometer is demonstrated. Further data sets, including multivariate coffee composition data, instant coffee FTIR spectra, vegetable oil fatty acid composition, and vegetable oil NMR spectra, are provided as Supporting Information so that students can reinforce their learning of the software through repetition. The demonstration shows how Orange Data Mining can be useful for introducing PCA to the analytical curriculum.
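For readers who want to see what a PCA step computes, the following is a minimal sketch in plain Python with NumPy. It is not the Orange widget workflow described in the article, and it does not use the article's sugar NIR data: the "spectra" are synthetic, and the function name `pca_scores`, the band shape, and the noise level are all assumptions made for illustration. The sketch mean-centers the data, takes a singular value decomposition, and returns scores, loadings, and the fraction of variance explained by each component.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Return scores, loadings, and explained-variance ratios for X.

    X has one sample per row and one variable (e.g. wavelength) per column.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # mean-center each variable (column)
    # SVD of the centered data; the rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates
    loadings = Vt[:n_components]                     # variable weights
    explained = (s ** 2) / np.sum(s ** 2)            # variance fractions
    return scores, loadings, explained[:n_components]


# Synthetic stand-in for spectra (an assumption, not the article's data):
# two groups of 10 samples that differ in the height of one broad band.
rng = np.random.default_rng(0)
wavelengths = np.linspace(0.0, 1.0, 50)
band = np.exp(-((wavelengths - 0.5) ** 2) / 0.01)
group_a = 1.0 * band + 0.05 * rng.standard_normal((10, 50))
group_b = 2.0 * band + 0.05 * rng.standard_normal((10, 50))
X = np.vstack([group_a, group_b])

scores, loadings, explained = pca_scores(X, n_components=2)

# Print the first-component score of each sample with its group label.
for label, pc1 in zip(["A"] * 10 + ["B"] * 10, scores[:, 0]):
    print(f"group {label}: PC1 score = {pc1:+.3f}")
print("explained variance ratios:", np.round(explained, 3))
```

In this synthetic case the two groups differ mainly along a single direction, so their first-component scores should fall on opposite sides of zero. A score plot of this kind is what makes PCA useful for graphical clustering and for spotting outliers.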
About the journal:
The Journal of Chemical Education is the official journal of the Division of Chemical Education of the American Chemical Society, co-published with the American Chemical Society Publications Division. Launched in 1924, the Journal of Chemical Education is the world’s premier chemical education journal. The Journal publishes peer-reviewed articles and related information as a resource to those in the field of chemical education and to those institutions that serve them. JCE typically addresses chemical content, activities, laboratory experiments, instructional methods, and pedagogies. The Journal serves as a means of communication among people across the world who are interested in the teaching and learning of chemistry. This includes instructors of chemistry from middle school through graduate school, professional staff who support these teaching activities, as well as some scientists in commerce, industry, and government.