Exploring how base model combination affects the results of a “stacking” ensemble machine learning model: An applied study on optimization of heteroatom doped carbon data

IF 5.9 · Zone 3, Materials Science · JCR Q2, Chemistry, Physical
Krittapong Deshsorn, Weekit Sirisaksoontorn, Wisit Hirunpinyopas, Pawin Iamprasertkun
FlatChem, volume 50, Article 100827. Published 2025-01-29. DOI: 10.1016/j.flatc.2025.100827
https://www.sciencedirect.com/science/article/pii/S2452262725000212
Citations: 0

Abstract

This study explores stacking models for electrochemical analysis, combining base models (decision trees, linear regression, and k-nearest neighbors) with a meta-model. It shows that the order in which the base models are stacked affects predictions, often yielding multiple solutions. To address this “uncertainty,” a novel “sorting” technique was applied during meta-model training. This approach significantly reduced model uncertainty, achieving the most accurate predictions and minimizing order deviations (mean absolute error of 37.92388; standard deviation reduced from 6.19 × 10⁻¹⁵ to 0). The refined model was then used to analyze synergies between electrochemical and material properties with feature importance tools such as SHAP, Feature Permutation Importance (FPI), and Partial Dependence Plots (PDP). Key insights for heteroatom-doped carbon supercapacitors suggest maximizing surface area and nitrogen, sulfur, and boron doping while minimizing current density and acidic electrolyte concentration. Optimal oxygen and phosphorus doping levels were ~15% and ~2.5%, respectively. FPI ranked the features as nitrogen > surface area > electrolyte concentration > oxygen > current density > defect ratio > sulfur > boron > phosphorus. PDP revealed that dual heteroatom doping (e.g., nitrogen and oxygen) may outperform doping with five heteroatoms. These findings enhance the reliability of machine learning in materials science, offering pathways for efficient synthesis and optimization of two-dimensional materials.
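The order effect described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn's `StackingRegressor`, a linear meta-model, and synthetic data from `make_regression`; none of these are confirmed as the authors' setup. The abstract does not say how the “sorting” technique works, so the name-based canonical ordering below is a hypothetical reading, not the paper's method. The helper names `make_base_models` and `stacked_mae` are introduced here for illustration only.

```python
from itertools import permutations

from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): the paper's dataset is not used here.
X, y = make_regression(n_samples=200, n_features=9, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)


def make_base_models():
    """Return (name, estimator) pairs for the three base-model families."""
    return [
        ("tree", DecisionTreeRegressor(random_state=0)),
        ("linear", LinearRegression()),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
    ]


def stacked_mae(estimators):
    """Fit a stack with the given base-model order and return its test MAE."""
    stack = StackingRegressor(
        estimators=list(estimators),
        final_estimator=LinearRegression(),  # meta-model choice is an assumption
        cv=5,
    )
    stack.fit(X_train, y_train)
    return mean_absolute_error(y_test, stack.predict(X_test))


# 1) Fit one stack per ordering of the base models (3! = 6 stacks).
raw_maes = [stacked_mae(order) for order in permutations(make_base_models())]

# 2) Hypothetical "sorting" step: put the base models into a canonical
#    (alphabetical) order before fitting, so every permutation collapses
#    to the same stack and the order dependence disappears.
sorted_maes = [
    stacked_mae(sorted(order, key=lambda pair: pair[0]))
    for order in permutations(make_base_models())
]

print("MAE spread without sorting:", max(raw_maes) - min(raw_maes))
print("MAE spread with sorting:   ", max(sorted_maes) - min(sorted_maes))
```

With a linear meta-model, any spread across orderings is likely to be at the level of floating-point rounding, and it may be larger for meta-models that are sensitive to column order. The canonical ordering removes the spread by construction, because every permutation is mapped to the same input layout for the meta-model.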

Source journal: FlatChem
CiteScore: 8.40
Self-citation rate: 6.50%
Articles published: 104
Review time: 26 days
About the journal: FlatChem - Chemistry of Flat Materials, a new voice in the community, publishes original and significant, cutting-edge research related to the chemistry of graphene and related 2D & layered materials. The overall aim of the journal is to combine the chemistry and applications of these materials; submitted communications, full papers, and concepts should contain chemistry in a materials context, which can be experimental and/or theoretical. In addition to original research articles, FlatChem also offers reviews, minireviews, highlights, and perspectives on the future of this research area with the scientific leaders in fields related to Flat Materials. Topics of interest include, but are not limited to, the following:
- Design, synthesis, applications, and investigation of graphene, graphene-related materials, and other 2D & layered materials (for example Silicene, Germanene, Phosphorene, MXenes, Boron nitride, Transition metal dichalcogenides)
- Characterization of these materials using all forms of spectroscopy and microscopy techniques
- Chemical modification or functionalization and dispersion of these materials, as well as interactions with other materials
- Exploring the surface chemistry of these materials for applications in: Sensors or detectors in electrochemical/Lab on a Chip devices, Composite materials, Membranes, Environment technology, Catalysis for energy storage and conversion (for example fuel cells, supercapacitors, batteries, hydrogen storage), Biomedical technology (drug delivery, biosensing, bioimaging)