Prediction of the space group and cell volume by training a convolutional neural network with primitive 'ideal' diffraction profiles and its application to 'real' experimental data.
Impact factor 2.8; CAS region 3 (Materials Science); JCR Q1 (Biochemistry, Genetics and Molecular Biology)
Authors: Hiroyuki Ozaki, Naoya Ishida, Tetsu Kiyobayashi
Journal of Applied Crystallography, vol. 58, Pt 3, pp. 718-730. Publication date: 2025-04-25 (eCollection 2025-06-01).
DOI: https://doi.org/10.1107/S1600576725002419
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12135985/pdf/
Abstract
This study describes a deep learning approach to predict the space group and unit-cell volume of inorganic crystals from their powder X-ray diffraction profiles. Using an inorganic crystallographic database, convolutional neural network (CNN) models were successfully constructed with the δ-function-like 'ideal' X-ray diffraction profiles derived solely from the intrinsic properties of the crystal structure, which are dependent on neither the incident X-ray wavelength nor the line shape of the profiles. We examined how the statistical metrics (e.g. the prediction accuracy, precision and recall) are influenced by the ensemble averaging technique and the multi-task learning approach; six CNN models were created from an identical data set for the former, and the space group classification was coupled with the unit-cell volume prediction in a CNN architecture for the latter. The CNN models trained in the 'ideal' world were tested with 'real' X-ray profiles for eleven materials such as TiO2, LiNiO2 and LiMnO2. While the models mostly fared well in the 'real' world, the cases at odds were scrutinized to elucidate the causes of the mismatch. Specifically for Li2MnO3, detailed crystallographic considerations revealed that the mismatch can stem from the state of the specific material and/or from the quality of the experimental data, and not from the CNN models. The present study demonstrates that we can obviate the need for emulating experimental diffraction profiles in training CNN models to elicit structural information, thereby focusing efforts on further improvements.
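The ensemble averaging mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the six models' outputs are combined, so this example assumes soft voting: each model's class probabilities are averaged and the most probable space group is taken. The function names, the random stand-in logits and the choice of 230 classes (the number of space groups) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def ensemble_predict(logits_per_model):
    """Average class probabilities over models (soft voting, an assumption).

    logits_per_model: array-like of shape (n_models, n_samples, n_classes)
    Returns (predicted class index per sample, averaged probabilities).
    """
    probs = softmax(np.asarray(logits_per_model, dtype=float), axis=-1)
    mean_probs = probs.mean(axis=0)  # average over the model axis
    return mean_probs.argmax(axis=-1), mean_probs


# Hypothetical stand-in for the outputs of six independently trained CNNs:
# random logits for 4 diffraction profiles over 230 space-group classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 4, 230))
labels, mean_probs = ensemble_predict(logits)
print(labels.shape, mean_probs.shape)
```

Averaging probabilities rather than taking a majority vote keeps each model's confidence in the result; other combination rules, such as averaging logits or hard voting, would be equally plausible readings of "ensemble averaging".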
About the journal:
Many research topics in condensed matter research, materials science and the life sciences make use of crystallographic methods to study crystalline and non-crystalline matter with neutrons, X-rays and electrons. Articles published in the Journal of Applied Crystallography focus on these methods and their use in identifying structural and diffusion-controlled phase transformations, structure-property relationships, structural changes of defects, interfaces and surfaces, etc. Developments of instrumentation and crystallographic apparatus, theory and interpretation, numerical analysis and other related subjects are also covered. The journal is the primary place where crystallographic computer program information is published.