Qinyang Li, Rongzhi Dong, Nicholas Miklaucic, Jeffrey Hu, Sadman Sadeed Omee, Lai Wei, Sourin Dey, Ming Hu, Jianjun Hu
{"title":"In context learning foundation models for materials property prediction with small datasets","authors":"Qinyang Li, Rongzhi Dong, Nicholas Miklaucic, Jeffrey Hu, Sadman Sadeed Omee, Lai Wei, Sourin Dey, Ming Hu, Jianjun Hu","doi":"10.1038/s41524-026-02089-8","DOIUrl":"https://doi.org/10.1038/s41524-026-02089-8","url":null,"abstract":"Foundation models (FMs) have recently shown remarkable in-context learning (ICL) capabilities across diverse scientific domains. In this work, we introduce a unified in-context learning foundation model (ICL-FM) framework for materials property prediction that integrates both composition-based and structure-aware representations. The proposed approach couples the pretrained TabPFN transformer with graph neural network (GNN)-derived embeddings and our novel MagpieEX descriptors. MagpieEX augments traditional features with cation-anion interaction data to explicitly measure bond ionicity and charge-transfer asymmetry, capturing interatomic bonding characteristics that influence vibrational and thermal transport properties. Comprehensive experiments on the MatBench benchmark suite and a standalone lattice thermal conductivity (LTC) dataset demonstrate that ICL-FM achieves competitive or superior performance to state-of-the-art (SOTA) models with significantly reduced training costs. Remarkably, the training-free ICL-FM outperformed sophisticated SOTA GNN models in five out of six representative composition-based tasks, including a significant 9.93% improvement in phonon frequency prediction. On the LTC dataset, the FM effectively models complex phenomena such as phonon-phonon scattering and atomic mass contrast. t-SNE analysis reveals that the FM acts as a physics-aware feature refiner, transforming raw, disjoint feature clusters into continuous manifolds with gradual property transitions. 
This restructured latent space enhances interpolative prediction accuracy while aligning learned representations with underlying physical laws. This study establishes ICL-FM as a generalizable, data-efficient paradigm for materials informatics.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"85 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147751788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vision language models for scientific image analysis: an evaluation highlighting opportunities and challenges","authors":"Prateek Verma, Minh-Hao Van, Xintao Wu","doi":"10.1038/s41524-026-02069-y","DOIUrl":"https://doi.org/10.1038/s41524-026-02069-y","url":null,"abstract":"Recent advancements in vision language models (VLMs) have opened new avenues for analyzing complex visual data. Models such as ChatGPT, Gemini, Llama and LLaVA have gained prominence for their ability to process both visual and textual data, excelling in tasks like natural image captioning, visual question answering (VQA), and reasoning. Similarly, the Segment Anything Model (SAM) by Meta has demonstrated remarkable segmentation capabilities. Given the importance of microscopy images in fields like biology, medicine, and materials science—where visual data is often analyzed alongside textual information from captions, reports, or literature—it is critical to evaluate the effectiveness of these models on such data. This study assesses the capabilities of ChatGPT-5, Gemini-2.5Pro, Llama-3.2V, LLaVA-1.5 and SAM-2 on classification, segmentation, counting, and VQA tasks using microscopy images. ChatGPT and Gemini excelled in comprehending microscopy images, while SAM performed well in object isolation. Although their performance falls short of domain expert accuracy, particularly when faced with complexities such as impurities, overlaps, and irrelevant artifacts, these models show clear gains compared to prior versions. 
These findings highlight the promise of VLMs in scientific image analysis and the need for further advancements to meet the demands of expert-level tasks.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"18 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147734081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hongjian Yuan, Genmao Zhuang, Yang Ren, Hong Wang, Jian Hui
{"title":"Machine learning-assisted high-throughput screening of superlattice-like O-PCM thin films","authors":"Hongjian Yuan, Genmao Zhuang, Yang Ren, Hong Wang, Jian Hui","doi":"10.1038/s41524-026-02091-0","DOIUrl":"https://doi.org/10.1038/s41524-026-02091-0","url":null,"abstract":"Non-volatile optical phase change materials with superlattice-like structures (O-PCMs-SLL) offer transformative application potential for photonic integration with low optical loss and enhanced thermal stability. However, the optimization of O-PCMs-SLL is complicated by a high-dimensional design space in which key structural parameters (e.g., modulation period, layer thickness, deposition sequence) exhibit complex interdependencies. Here, by integrating machine learning with high-throughput methods, we establish a data-driven framework for rapid screening of O-PCMs-SLL materials, effectively navigating the vast combinatorial design space. A pre-trained model, built on AI-ready high-throughput experimental data, decodes the composition-structure-process-optical performance relationship for Ge-Sb-X (X = Te, Sn, Se) SLL thin films. Through a dual-phase learning architecture that strategically combines transfer learning and active learning, the pre-trained model can be adaptively navigated to extrapolated domains, accommodating variations in nanoscale periodicity of the SLL structure. This framework achieves an 85% reduction in data requirements for accurate optimization in the extrapolated space, demonstrating a 274-fold efficiency enhancement in discovery rate when benchmarked against conventional trial-and-error approaches. 
Our work highlights the pivotal role of AI-informed guidance within high-throughput experimental data-driven materials discovery and provides a generalizable blueprint for accelerated exploration of complex multi-component functional materials.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"145 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147734082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diffusion-Wetting: a universal molecular relation","authors":"Lorenzo Agosta, Mikhail Dzugutov","doi":"10.1038/s41524-026-02079-w","DOIUrl":"https://doi.org/10.1038/s41524-026-02079-w","url":null,"abstract":"Quantifying wettability at the nanoscale remains challenging because apparent macroscopic contact angles average over multiple surface-specific effects such as roughness, chemical heterogeneity, and defect/pinning, thereby obscuring the underlying microscopic hydrophilic or hydrophobic response. We derive an analytical relation linking the microscopic water contact angle to the lateral diffusion of interfacial molecules, establishing a quantitative connection between water dynamics and the wetting behavior. Molecular dynamics simulations confirm that the ratio of interfacial to bulk diffusion uniquely determines the contact angle across the full hydrophilic-hydrophobic spectrum. This diffusion-based formulation eliminates the need for droplet geometries or free-energy sampling, enabling quantitative assessment of wetting directly from equilibrium molecular dynamics trajectories, which for simple interfaces can be remarkably short. The approach provides a universal and efficient route to evaluate surface affinity in reactive, defective, or confined environments.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"25 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147734086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-efficient machine-learning of complex Fe–Mo intermetallics using domain knowledge of chemistry and crystallography","authors":"Mariano Forti, Alesya Malakhova, Yury Lysogorskiy, Wenhao Zhang, Jean-Claude Crivello, Jean-Marc Joubert, Ralf Drautz, Thomas Hammerschmidt","doi":"10.1038/s41524-026-02070-5","DOIUrl":"https://doi.org/10.1038/s41524-026-02070-5","url":null,"abstract":"Atomistic simulations of multi-component systems require accurate descriptions of interatomic interactions to resolve energy differences between competing phases. Particularly challenging are topologically close-packed (TCP) phases with structural similarities and nearly-degenerate different site occupations even in binary systems like Fe–Mo. In this work, data-efficient machine-learning (ML) models are presented that address this challenge by using features with domain knowledge of chemistry and crystallography, enabling accurate and robust predictions for the complex TCP phases R, M, P, and δ with 11–14 Wyckoff sites (WS) after training on simple TCP phases A15, σ, χ, μ, C14, C15, and C36 with 2–5 WS. Several ML models based on kernel-ridge regression, multilayer perceptrons, and random forests are trained on fewer than 300 DFT calculations for the simple TCP phases in the Fe–Mo system. Model performance is shown to improve systematically with increasing use of domain knowledge, reaching uncertainties below 25 meV/atom for the predicted convex hulls of the complex TCP phases and showing excellent agreement with DFT verification. Complementary X-ray diffraction experiments and Rietveld analysis are conducted for an Fe–Mo R-phase sample. 
The measured WS occupancies show excellent agreement with ML-model predictions obtained using the Bragg-Williams approximation at the same temperature.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"25 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147734083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spin-valley-mismatched altermagnet for giant tunneling magnetoresistance","authors":"Kun Yan, Yizhi Hu, Wei-Hua Xiao, Xiaolong Zou, Xiaobin Chen, Wenhui Duan","doi":"10.1038/s41524-026-02083-0","DOIUrl":"https://doi.org/10.1038/s41524-026-02083-0","url":null,"abstract":"Altermagnet-based heterojunctions have demonstrated magnetoresistive effects in experiments; however, a predictive theoretical model for non-ferromagnetic structures has remained elusive. In this work, we develop a tunneling-based spin-transport theory that explicitly incorporates the transverse-wavevector (k∥)-dependent spin polarization of an altermagnet’s transport channels, enabling the prediction of giant tunneling magnetoresistance (TMR). Based on the theory, we predict that the altermagnet KV2Se2O can reach the extreme limit of magnetoresistance. By performing first-principles transport calculations, we verify that magnetic tunnel junctions using the metallic KV2Se2O as the electrodes and few-layer MgO as the spacer exhibit zero-bias magnetoresistance larger than 7.57 × 10⁷%, which is robust against the bias and thickness of the spacer. Our research provides a quantitative design principle for next-generation spin-electronic devices and establishes KV2Se2O/MgO/KV2Se2O as a leading candidate material system for room-temperature ultra-high-density non-volatile memory.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"96 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147734085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Open electrolyte database generated via an automated molecular dynamics simulation framework","authors":"Kou Nakamura, Norio Takenaka, Masatoshi Hanai, Yuna Oikawa, Ryo Tamura, Koji Tsuda, Masanobu Nakayama, Junichiro Shiomi, Atsuo Yamada","doi":"10.1038/s41524-026-02093-y","DOIUrl":"https://doi.org/10.1038/s41524-026-02093-y","url":null,"abstract":"Modern battery technologies demand electrolytes that simultaneously deliver multiple functions tailored to diverse applications. Achieving such multi-objective optimization remains fundamentally challenging, as many key electrolyte properties are intrinsically in competition. Data-driven approaches provide a systematic route to navigating the vast compositional space of electrolyte systems; however, their effectiveness critically depends on the availability of comprehensive datasets that capture not only descriptors derived from isolated molecular structures and properties but also structural and physicochemical characteristics of electrolytes as integrated solutions. Here, we report an open electrolyte database comprising approximately 5600 electrolyte formulations, generated using a fully automated, high-throughput molecular dynamics simulation framework. The dataset spans diverse combinations of solvents, salts, and concentrations, and provides unified descriptions of electrolyte structures and physicochemical properties. To facilitate data exploration and utilization, we implement a web-based graphical user interface (https://oedb.jp) that enables interactive browsing and comparison of electrolyte compositions together with their associated descriptors, while also making the database accessible to LLM-based agents for data reference. 
This Open Electrolyte Database for Batteries (OEDB) establishes a foundation for data-driven electrolyte design grounded in structure–property relationships at the electrolyte level.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"65 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147734084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MOFBuilder: automated end-to-end modeling of MOF dynamics for high-throughput screening","authors":"Chenxi Li, Mårten S. G. Ahlquist","doi":"10.1038/s41524-026-02086-x","DOIUrl":"https://doi.org/10.1038/s41524-026-02086-x","url":null,"abstract":"The vast chemical design space of Metal-Organic Frameworks (MOFs) offers unparalleled opportunities for targeted materials design, yet computational screening remains largely restricted to static structures derived from the CIF file. We introduce MOFBuilder, a modular end-to-end pipeline that leverages molecular-level identities to automatically generate chemically consistent, molecular dynamics (MD) ready MOF models, flexibly supporting periodic, defective, cluster, and slab representations. By eliminating the manual effort typically required for model preparation, the pipeline enables seamless construction of complex systems ranging from large-scale bio-hybrid interfaces to functionalized high-throughput libraries. As proof of the need for high-throughput dynamic modeling, we show that dynamic screening is necessary to circumvent the “Porosity Paradox”. Several functionalized UiO-66 variants classified as non-porous by static geometric analysis exhibit significant CO2 uptake through gate-opening mechanisms captured only via MD. 
By enabling the high-throughput generation of consistent, dynamic datasets, MOFBuilder addresses a critical gap in discovery pipelines and provides the foundation for more predictive, data-driven materials design.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"27 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147709303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable GNN framework guided by local chemical features to predict binding energies in bimetallic alloys","authors":"A. F. Usuga, C. S. Praveen, A. Comas-Vives","doi":"10.1038/s41524-026-02045-6","DOIUrl":"https://doi.org/10.1038/s41524-026-02045-6","url":null,"abstract":"Adsorption energies are key catalytic descriptors that reveal adsorbate-site interactions on heterogeneous catalysts. However, their computation via DFT is time-consuming, limiting high-throughput screening. This work presents a machine learning (ML) methodology based on graph representations of local adsorption sites, using a Graph Neural Network (GNN) with per-atom local descriptors derived from accessible physicochemical properties. The approach is evaluated on two bimetallic datasets. The first includes AB-type bimetallic flat surfaces with varying A:B ratios, predicting binding energies for small monodentate adsorbates (C, N, O, S, H) with MSEs of 0.073/0.181 eV² (train/test). The second dataset comprises reaction energies of key intermediates for CO2 hydrogenation on Ni-Ga-based surfaces. The GNN model achieves an impressive performance (MSE: 0.001/0.002 eV², train/test) on complex atomic configurations, even bidentate ones. 
Beyond predictive performance, clustering analysis provides an explainable framework, showing how structural and electronic descriptors can rationally guide catalyst design and deepen understanding of adsorbate-metal interactions.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"85 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147709302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Basita Das, William E. Heymann, Yueming Wang, Uwe Rau, Thomas Kirchartz, Tonio Buonassisi
{"title":"High-throughput parameter estimation from experimental data using Bayesian Inference with accelerated sampling","authors":"Basita Das, William E. Heymann, Yueming Wang, Uwe Rau, Thomas Kirchartz, Tonio Buonassisi","doi":"10.1038/s41524-026-01995-1","DOIUrl":"https://doi.org/10.1038/s41524-026-01995-1","url":null,"abstract":"BIAS (Bayesian Inference with Accelerated Sampling) is a high-throughput parameter estimation framework designed to rapidly infer the root causes of device underperformance in real time. It integrates a deep neural network surrogate model with accelerated Markov Chain Monte Carlo sampling to efficiently explore high-dimensional parameter spaces and identify needle-like regions corresponding to the ground truth values of key physical parameters. BIAS is scalable to complex systems and has been used to infer eight underlying parameters in perovskite solar cell stacks with a combined speedup of 4800× from the surrogate model and accelerated sampling, compared to conventional Bayesian inference methods. Its rapid and robust inference capabilities render it suitable for integration into high-throughput fabrication workflows, enabling real-time feedback that links process variations to changes in material properties and their impact on device performance. 
By embedding BIAS in high-throughput fabrication cycles, researchers can accelerate the transition from novel materials to devices and obtain real-time insight into how novel materials' properties translate in the context of a device and the root cause of performance limitations.","PeriodicalId":19342,"journal":{"name":"npj Computational Materials","volume":"13 1","pages":""},"PeriodicalIF":9.7,"publicationDate":"2026-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147685156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}