A representation of crystal systems and space groups based on the variance of atomic positions (VAP): Case of 2D materials

Impact factor 3.3 · Zone 3 (Chemistry) · JCR Q2 (Chemistry, Inorganic & Nuclear)
R. Botella
Solid State Sciences, volume 169, Article 108061. Published 2025-09-02. Journal article; not open access.
DOI: 10.1016/j.solidstatesciences.2025.108061
https://www.sciencedirect.com/science/article/pii/S1293255825002390
Citations: 0

Abstract

Material representation is an active topic in computational materials science. It is especially useful for predicting crystal structures and related properties such as formation energy and band structure. Material representations aim to encode the structure of a material in a format that machine learning (ML) algorithms can readily use. Herein, a new material representation is proposed that is both compact and easily usable by ML algorithms. In the proposed representation, the connectivity and symmetry of a material are made implicit by treating its unit cell as a 3D cloud of points. One way to describe such a distribution of values is through its variance; applied to atomic positions, this gives the variance of atomic positions (VAP). The VAP representations of 6176 2D materials from the 2DMatPedia database are computed and studied. After visual inspection, the VAP values are sorted by crystal system and space group and classified through ML. K-nearest neighbors (KNN) and random forest (RF) algorithms are trained and tested for crystal system and space group classification. Although visual inspection shows no VAP-crystal system or VAP-space group correlation, pairwise classification accuracies range from ca. 92 % to ca. 97 % for crystal systems and from 96 % to 98 % for space groups. Multi-class classification accuracies are also high, ranging from 79 % to 88 % for crystal systems and from 73 % to 82 % for space groups. To deepen the understanding of the classification process, accuracy maps are analyzed, uncovering a bias in the dataset. The influence of this bias on the classification accuracies is discussed.
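
The abstract does not give the exact formula for the VAP, so the following is only a minimal sketch of the general idea: treat the atoms of a unit cell as a cloud of points and summarize it by the variance of their coordinates. The per-axis population variance, the use of fractional coordinates, the function name `variance_of_positions`, and the two-atom toy cell are all assumptions made for illustration, not the author's definition or data.

```python
from statistics import pvariance


def variance_of_positions(frac_coords):
    """Return the per-axis population variance of (x, y, z) positions.

    Assumption: positions are fractional coordinates and the descriptor
    is the plain per-axis variance; the paper may define VAP differently.
    """
    if not frac_coords:
        raise ValueError("at least one atomic position is required")
    xs, ys, zs = zip(*frac_coords)
    return (pvariance(xs), pvariance(ys), pvariance(zs))


# Hypothetical two-atom cell, not taken from 2DMatPedia.
toy_cell = [(0.0, 0.0, 0.5), (0.5, 0.5, 0.5)]
vap = variance_of_positions(toy_cell)
print(vap)
```

A fixed-length vector of this kind could then be passed as a feature row to a KNN or RF classifier, which is the role the abstract assigns to the VAP representation.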


Source journal: Solid State Sciences (Chemistry: Inorganic & Nuclear Chemistry)
CiteScore: 6.60
Self-citation rate: 2.90%
Articles published: 214
Review time: 27 days
Journal description: Solid State Sciences is the journal for researchers from the broad solid state chemistry and physics community. It publishes key articles on all aspects of solid state synthesis, structure-property relationships, theory and functionalities, in relation with experiments. Key topics for stand-alone papers and special issues:
- Novel ways of synthesis, inorganic functional materials, including porous and glassy materials, hybrid organic-inorganic compounds and nanomaterials
- Physical properties, emphasizing but not limited to the electrical, magnetic and optical features
- Materials related to information technology and energy and environmental sciences
The journal publishes feature articles from experts in the field upon invitation.