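Accelerating the prediction of remanent polarization in multicomponent ferroelectrics by using variational autoencoder-based data augmentation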

IF 5.1 | Zone 2, Materials Science | JCR Q2, MATERIALS SCIENCE, MULTIDISCIPLINARY
Zixiong Sun, Ruyue Gao, Ping Wang, Xinying Liu, Yujie Bai, Jingyu Luo, Hongyu Yang and Wanbiao Hu
Journal of Materials Chemistry C, vol. 32, pp. 16551-16561. Published 2025-07-01. Journal Article.
DOI: 10.1039/D5TC01781E
https://pubs.rsc.org/en/content/articlelanding/2025/tc/d5tc01781e
Citations: 0

Abstract


Accelerating the prediction of remanent polarization in multicomponent ferroelectrics by using variational autoencoder-based data augmentation†

Ferroelectric capacitors have been widely studied as candidates for next-generation power systems, and artificial intelligence (AI) is becoming an efficient tool for discovering new material systems. Remanent polarization (P_r) is a key parameter that directly affects the energy storage density (W_rec) of a capacitor, so achieving a low P_r is important. To better handle high-dimensional, nonlinear data and to predict key parameters, this study combines data augmentation with feature selection. Based on the atomic structure, electronic configuration, and crystal structure of (K_{1−x−y−z}Na_xBa_yCa_z)(Nb_{1−u−v−w}Zr_uTi_v)O_3, we selected 46 initial features. Using a conditional variational autoencoder (CVAE), we then synthesized 20,000 new data points from 234 original samples to expand the dataset, and verified the credibility of the generated data. Finally, several machine learning models were built to train on and predict P_r. The XGBoost (XGB) model reached a coefficient of determination (R^2) of 0.94, and a series of feature selection steps identified four key descriptors that affect P_r: Martynov–Batsanov electronegativity, Shannon ionic radius, tolerance factor, and core electron distance (Schubert) of A-site elements. The model accurately predicted the properties of two ceramic systems, including samples containing elements outside the original input space, which indicates strong predictive ability. This study offers insights into enriching sparse materials-science datasets via data augmentation and demonstrates an effective strategy for accelerating the prediction of remanent polarization in complex ferroelectric systems.
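
One of the four descriptors named in the abstract, the tolerance factor, can be illustrated with a short calculation. The sketch below uses the standard Goldschmidt definition, t = (r_A + r_O) / (sqrt(2) (r_B + r_O)), and assumes composition-weighted mean radii for the mixed A and B sites. The abstract does not say how the authors computed this descriptor, so the averaging rule, the function names, and the radius values (approximate Shannon radii, to be checked against a primary table) are all assumptions made for illustration.

```python
import math

# Approximate Shannon ionic radii in angstroms (A-site: 12-fold coordination,
# B-site and oxygen: 6-fold). Illustrative values only; verify against a
# primary source before any real use.
R_A = {"K": 1.64, "Na": 1.39, "Ba": 1.61, "Ca": 1.34}
R_B = {"Nb": 0.64, "Zr": 0.72, "Ti": 0.605}
R_O = 1.40


def weighted_radius(fractions, radii):
    """Composition-weighted mean radius for one lattice site.

    The linear averaging rule is an assumption of this sketch.
    """
    total = sum(fractions.values())
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError(f"site fractions must sum to 1, got {total}")
    return sum(frac * radii[element] for element, frac in fractions.items())


def tolerance_factor(a_site, b_site):
    """Goldschmidt tolerance factor t = (r_A + r_O) / (sqrt(2) * (r_B + r_O))."""
    r_a = weighted_radius(a_site, R_A)
    r_b = weighted_radius(b_site, R_B)
    return (r_a + R_O) / (math.sqrt(2) * (r_b + R_O))


if __name__ == "__main__":
    # Hypothetical composition chosen only to exercise the function.
    a_site = {"K": 0.45, "Na": 0.45, "Ba": 0.05, "Ca": 0.05}
    b_site = {"Nb": 0.90, "Zr": 0.05, "Ti": 0.05}
    print(f"t = {tolerance_factor(a_site, b_site):.4f}")
```

A value of t close to 1 is conventionally associated with a near-ideal cubic perovskite. In a workflow like the one described, such a value would be one column in the feature table alongside the other descriptors.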

Source journal: Journal of Materials Chemistry C
Categories: MATERIALS SCIENCE, MULTIDISCIPLINARY; PHYSICS, APPLIED
CiteScore: 10.80
Self-citation rate: 6.20%
Articles published: 1468
About the journal: The Journal of Materials Chemistry is divided into three sections, A, B, and C, each covering specific applications of the materials under study. Journal of Materials Chemistry A focuses primarily on materials for applications in energy and sustainability. Journal of Materials Chemistry B specializes in materials for applications in biology and medicine. Journal of Materials Chemistry C is dedicated to materials for applications in optical, magnetic, and electronic devices. Example topic areas within the scope of Journal of Materials Chemistry C are listed below. This list is neither exhaustive nor exclusive: bioelectronics, conductors, detectors, dielectrics, displays, ferroelectrics, lasers, LEDs, lighting, liquid crystals, memory, metamaterials, multiferroics, photonics, photovoltaics, semiconductors, sensors, single molecule conductors, spintronics, superconductors, thermoelectrics, topological insulators, transistors.