Gabe Guo, Judah Goldfeder, Ling Lan, Aniv Ray, Albert Hanming Yang, Boyuan Chen, Simon J. L. Billinge, Hod Lipson
Towards end-to-end structure determination from x-ray diffraction data using deep learning
npj Computational Materials, published 2024-09-07. DOI: 10.1038/s41524-024-01401-8 (https://doi.org/10.1038/s41524-024-01401-8)
Abstract
Powder crystallography is the experimental science of determining the structure of molecules provided in crystalline-powder form by analyzing their x-ray diffraction (XRD) patterns. Since many materials are readily available as crystalline powder, powder crystallography is increasingly useful in many fields. However, powder crystallography does not have an analytically known solution, so structural inference typically involves a laborious process of iterative design and structural refinement that depends on the domain knowledge of skilled experts. A key obstacle to fully automating the inference process computationally has been formulating the problem in an end-to-end quantitative form that is suitable for machine learning while capturing the ambiguities around molecule orientation, symmetries, and reconstruction resolution. Here we present a machine learning (ML) approach for structure determination from powder diffraction data. It works by estimating the electron density in a unit cell using a variational coordinate-based deep neural network. We demonstrate the approach using computed powder x-ray diffraction (PXRD) patterns, along with partial chemical composition information, as input. When evaluated on theoretically simulated data for the cubic and trigonal crystal systems, the system achieves up to 93.4% average similarity (as measured by the structural similarity index) with the ground truth on unseen materials, with both known and partially known chemical composition information, showing great promise for successful structure solution even from degraded and incomplete input data. The approach does not presuppose a crystalline structure and is readily extended to other situations such as nanomaterials and textured samples, paving the way to reconstruction of as-yet-unresolved nanostructures.
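The abstract reports accuracy as a structural similarity index between the predicted and ground-truth electron density, but does not say how that index is computed. As a rough illustration only, the sketch below computes a single-window (global) variant of the index over two flattened density grids. The function name `global_ssim`, the default constants `k1` and `k2`, the `data_range` of 1.0, and the sample values are all assumptions made for this example, not details taken from the paper, which may well use a windowed or 3D formulation.

```python
from statistics import fmean


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window structural similarity between two flattened grids.

    Hypothetical helper for illustration; not the authors' implementation.
    x, y: equal-length sequences of density values (e.g. a flattened voxel grid).
    data_range: assumed span between the smallest and largest possible value.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Small stabilising constants from the standard SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    # Means, (population) variances, and covariance of the two grids.
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = fmean((a - mean_x) ** 2 for a in x)
    var_y = fmean((b - mean_y) ** 2 for b in y)
    cov_xy = fmean((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Made-up density values, normalised to [0, 1], purely for demonstration.
truth = [0.0, 0.2, 0.9, 0.1]
pred = [0.1, 0.2, 0.7, 0.0]
score = global_ssim(truth, pred)
print(f"global SSIM: {score:.3f}")
```

A score of 1 means the two grids are identical under this measure; averaging such scores over a set of unseen materials would give a figure comparable in spirit, though not necessarily in value, to the percentage quoted above.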
About the journal:
npj Computational Materials is a high-quality open access journal from Nature Research that publishes research papers applying computational approaches for the design of new materials and enhancing our understanding of existing ones. The journal also welcomes papers on new computational techniques and the refinement of current approaches that support these aims, as well as experimental papers that complement computational findings.
Some key features of npj Computational Materials include a 2-year impact factor of 12.241 (2021), article downloads of 1,138,590 (2021), and a fast turnaround time of 11 days from submission to the first editorial decision. The journal is indexed in various databases and services, including Chemical Abstracts Service (ACS), Astrophysics Data System (ADS), Current Contents/Physical, Chemical and Earth Sciences, Journal Citation Reports/Science Edition, SCOPUS, EI Compendex, INSPEC, Google Scholar, SCImago, DOAJ, CNKI, and Science Citation Index Expanded (SCIE), among others.