Tanaporn Na Narong, Zoe N. Zachko, Steven B. Torrisi, Simon J. L. Billinge
Journal: npj Computational Materials (impact factor 9.4, JCR Q1). Published: 2025-04-11. DOI: 10.1038/s41524-025-01589-3 (https://doi.org/10.1038/s41524-025-01589-3)
Citations: 0
Interpretable multimodal machine learning analysis of X-ray absorption near-edge spectra and pair distribution functions
We used interpretable machine learning to combine information from multiple heterogeneous spectra: X-ray absorption near-edge spectra (XANES) and atomic pair distribution functions (PDFs) to extract local structural and chemical environments of transition metal cations in oxides. Random forest models were trained on simulated XANES, PDF, and both combined to extract oxidation state, coordination number, and mean nearest-neighbor bond length. XANES-only models generally outperformed PDF-only models, even for structural tasks, although using the metal’s differential-PDFs (dPDFs) instead of total-PDFs narrowed this gap. When combined with PDFs, information from XANES often dominates the prediction. Our results demonstrate that XANES contains rich structural information and highlight the utility of species-specificity. This interpretable, multimodal approach is quick to implement with suitable databases and offers valuable insights into the relative strengths of different modalities, guiding researchers in experiment design and identifying when combining complementary techniques adds meaningful information to a scientific investigation.
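One way to read the statement that XANES "often dominates the prediction" is in terms of per-feature importances from a model trained on the concatenated XANES and PDF input. The sketch below assumes that reading. It only illustrates the bookkeeping: it tags each feature with its modality and sums importances per modality. The function names and all numbers are hypothetical placeholders, not the authors' code or results, and no model is trained here.

```python
from typing import Dict, List, Sequence, Tuple


def concat_modalities(
    xanes: Sequence[float], pdf: Sequence[float]
) -> Tuple[List[float], List[str]]:
    """Join two spectra into one feature vector and tag each feature's origin."""
    features = list(xanes) + list(pdf)
    labels = ["XANES"] * len(xanes) + ["PDF"] * len(pdf)
    return features, labels


def importance_by_modality(
    importances: Sequence[float], labels: Sequence[str]
) -> Dict[str, float]:
    """Sum per-feature importances within each modality, normalized to sum to 1."""
    if len(importances) != len(labels):
        raise ValueError("importances and labels must have the same length")
    totals: Dict[str, float] = {}
    for value, label in zip(importances, labels):
        totals[label] = totals.get(label, 0.0) + value
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("importances sum to zero")
    return {label: total / grand_total for label, total in totals.items()}


# Made-up toy inputs (assumption): three XANES points and two PDF points.
xanes_spectrum = [0.1, 0.8, 0.5]
pdf_curve = [0.0, 0.3]
features, labels = concat_modalities(xanes_spectrum, pdf_curve)

# Made-up importances (assumption), standing in for what a fitted
# tree-ensemble model would report for each of the five features.
toy_importances = [0.2, 0.4, 0.1, 0.1, 0.2]
shares = importance_by_modality(toy_importances, labels)
print(shares)
```

In practice the importances would come from the fitted model rather than being typed in by hand. A per-modality share well above one half for XANES would correspond to the "dominates" case described in the abstract.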
About the journal:
npj Computational Materials is a high-quality open access journal from Nature Research that publishes research papers applying computational approaches for the design of new materials and enhancing our understanding of existing ones. The journal also welcomes papers on new computational techniques and the refinement of current approaches that support these aims, as well as experimental papers that complement computational findings.
Key figures for npj Computational Materials include a 2-year impact factor of 12.241 (2021), 1,138,590 article downloads (2021), and a turnaround time of 11 days from submission to the first editorial decision. The journal is indexed in databases and services including Chemical Abstracts Service (CAS), Astrophysics Data System (ADS), Current Contents/Physical, Chemical and Earth Sciences, Journal Citation Reports/Science Edition, SCOPUS, EI Compendex, INSPEC, Google Scholar, SCImago, DOAJ, CNKI, and Science Citation Index Expanded (SCIE), among others.