Matheus Máximo-Canadas, Julio Cesar Duarte, Jakler Nichele, Leonardo Santos de Brito Alves, Luiz Octavio Vieira Pereira, Rogerio Ramos, and Itamar Borges Jr.*
ACS Engineering Au, vol. 5, no. 3, pp. 226–233. Published 2025-03-11. DOI: 10.1021/acsengineeringau.5c00001. https://pubs.acs.org/doi/10.1021/acsengineeringau.5c00001
A Systematic and General Machine Learning Approach to Build a Consistent Data Set from Different Experiments: Application to the Thermal Conductivity of Methane
Experimental data from different sources present challenges due to variability and noise from various experimental conditions, apparatuses, and environmental factors. In this work, we propose a general method to address these challenges and build a consistent data set. As a case study, we analyze experimental data sets of methane’s thermal conductivity across the liquid, vapor, and supercritical phases. The method is based on machine learning (ML) techniques, which consistently integrate data from various experimental sources. To this end, it feeds raw data compiled in the National Institute of Standards and Technology (NIST) database to different ML algorithms. Our findings indicate that ML models yield predictions closer to NIST’s processed data than to the original raw experimental data used to train the models. This demonstrates the models’ generalization from heterogeneous, noisy, and untreated data sets. While our approach does not eliminate preprocessing, it suggests that ML can autonomously handle noisy data, providing a faster and more cost-effective alternative to traditional pre- and postprocessing methods. By guiding the refinement of labor-intensive methods, ML proves adaptable for real-time data, enabling immediate adjustments and revolutionizing industrial and scientific optimizations. Therefore, the proposed ML approach is general and efficient in handling complex and heterogeneous data to deliver reliable predictions without extensive preprocessing.
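The central comparison in the abstract (predictions sitting closer to a processed reference than to the raw training data) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than methane measurements, the smooth `reference` curve merely stands in for a processed data set, the per-source offsets and noise level are invented, and a plain k-nearest-neighbours average replaces whichever ML algorithms the authors actually used.

```python
import random


def reference(x):
    """Smooth stand-in for a processed reference curve (hypothetical)."""
    return 1.0 + 0.5 * x + 0.3 * x * x


def make_raw_data(rng, n_per_source=60):
    """Simulate several 'experiments': each source has its own bias plus noise."""
    offsets = [-0.03, 0.02, 0.0, 0.025, -0.015]  # invented per-source biases
    data = []
    for offset in offsets:
        for _ in range(n_per_source):
            x = rng.random()  # normalized state variable in [0, 1)
            y = reference(x) + offset + rng.gauss(0.0, 0.05)
            data.append((x, y))
    return data


def knn_predict(train, x, k=15, skip=None):
    """Average the k nearest raw measurements, optionally leaving one out."""
    neighbours = sorted(
        (abs(xi - x), yi) for i, (xi, yi) in enumerate(train) if i != skip
    )[:k]
    return sum(y for _, y in neighbours) / len(neighbours)


def evaluate(seed=0):
    """Return (MAE vs raw measurements, MAE vs smooth reference)."""
    rng = random.Random(seed)
    raw = make_raw_data(rng)
    err_raw = 0.0
    err_ref = 0.0
    for i, (x, y) in enumerate(raw):
        # Leave-one-out prediction so the point cannot predict itself.
        pred = knn_predict(raw, x, skip=i)
        err_raw += abs(pred - y)
        err_ref += abs(pred - reference(x))
    n = len(raw)
    return err_raw / n, err_ref / n


mae_raw, mae_ref = evaluate()
print(f"MAE vs raw measurements: {mae_raw:.4f}")
print(f"MAE vs smooth reference: {mae_ref:.4f}")
```

In this toy setting the model is trained only on the noisy, biased points, yet averaging across sources cancels much of the scatter, so the error against the smooth curve should come out smaller than the error against the raw points. The real problem depends on more than one state variable (for example temperature and pressure), which this one-dimensional sketch does not attempt to capture.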
About the journal:
ACS Engineering Au is an open access journal that reports significant advances in chemical engineering, applied chemistry, and energy, covering fundamentals, processes, and products. The journal's broad scope includes experimental, theoretical, mathematical, computational, chemical, and physical research from academic and industrial settings. Short letters, comprehensive articles, reviews, and perspectives are welcome on topics that include:

Fundamental research in such areas as thermodynamics, transport phenomena (flow, mixing, mass & heat transfer), chemical reaction kinetics and engineering, catalysis, separations, interfacial phenomena, and materials

Process design, development, and intensification (e.g., process technologies for chemicals and materials synthesis and design methods, process intensification, multiphase reactors, scale-up, systems analysis, process control, data correlation schemes, modeling, machine learning, Artificial Intelligence)

Product research and development involving chemical and engineering aspects (e.g., catalysts, plastics, elastomers, fibers, adhesives, coatings, paper, membranes, lubricants, ceramics, aerosols, fluidic devices, intensified process equipment)

Energy and fuels (e.g., pre-treatment, processing, and utilization of renewable energy resources; processing and utilization of fuels; properties and structure or molecular composition of both raw fuels and refined products; fuel cells, hydrogen, batteries; photochemical fuel and energy production; decarbonization; electrification; microwave; cavitation)

Measurement techniques, computational models, and data on thermo-physical, thermodynamic, and transport properties of materials and phase equilibrium behavior

New methods, models, and tools (e.g., real-time data analytics, multi-scale models, physics informed machine learning models, machine learning enhanced physics-based models, soft sensors, high-performance computing)