CrysAtom: Distributed Representation of Atoms for Crystal Property Prediction

Shrimon Mukherjee, Madhusudan Ghosh, Partha Basuchowdhuri
arXiv - PHYS - Materials Science. Published 2024-09-07. DOI: arxiv-2409.04737 (https://doi.org/arxiv-2409.04737)

Abstract

Artificial intelligence (AI) has become ubiquitous in basic-science research. Over the last decade, frequent use of machine learning (ML) and deep learning (DL) methods has led to significant advances, with notable performance gains on tasks such as protein structure prediction, drug-target binding affinity prediction, and molecular property prediction. In the materials science literature, it is well known that crystalline materials exhibit topological structures. Such structures may be represented as graphs, and graph neural network (GNN) based approaches could help encode them into an augmented representation space. These frameworks primarily adopt supervised learning targeted at downstream property prediction tasks, on the basis of electronic properties (formation energy, bandgap, total energy, etc.) and crystalline structures. They generally rely heavily on handcrafted atom feature representations alongside the structural representations. In this paper, we propose an unsupervised framework, CrysAtom, that uses untagged crystal data to generate dense vector representations of atoms, which can be used in existing GNN-based property predictor models to accurately predict important properties of crystals. Empirical results show that our dense representation embeds chemical properties of atoms and significantly enhances the performance of the baseline property predictor models.
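
To make the idea of plugging dense atom vectors into a graph-based predictor concrete, the following is a minimal sketch and not the CrysAtom method itself. It assumes a hypothetical embedding table keyed by atomic number (filled with random values here, standing in for pretrained vectors), a toy four-atom graph, and a single round of mean-neighbor aggregation in place of a real GNN layer. All function names are illustrative.

```python
import random


def make_embedding_table(atomic_numbers, dim, seed=0):
    """Hypothetical stand-in for pretrained dense atom vectors (random here)."""
    rng = random.Random(seed)
    return {
        z: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for z in sorted(set(atomic_numbers))
    }


def node_features(atomic_numbers, table):
    """Look up one dense vector per atom (node) of the crystal graph."""
    return [list(table[z]) for z in atomic_numbers]


def message_pass(features, edges):
    """One round of mean-neighbor aggregation over undirected edges."""
    n, dim = len(features), len(features[0])
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    updated = []
    for i in range(n):
        if not neighbors[i]:
            # Isolated node: keep its own vector unchanged.
            updated.append(list(features[i]))
            continue
        mean = [
            sum(features[j][k] for j in neighbors[i]) / len(neighbors[i])
            for k in range(dim)
        ]
        # Average the node's own vector with its neighborhood mean.
        updated.append([0.5 * (a + b) for a, b in zip(features[i], mean)])
    return updated


def readout(features):
    """Mean-pool node vectors into a single crystal-level vector."""
    n = len(features)
    return [sum(f[k] for f in features) / n for k in range(len(features[0]))]


# Toy graph (assumed): two Na atoms (Z=11) and two Cl atoms (Z=17) in a ring.
atoms = [11, 17, 11, 17]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

table = make_embedding_table(atoms, dim=4)
x = node_features(atoms, table)
h = message_pass(x, edges)
crystal_vec = readout(h)
```

In a real pipeline, `crystal_vec` would feed a regression head trained on a target such as formation energy, and the random table would be replaced by vectors learned from unlabeled crystal data.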