CrysAtom: Distributed Representation of Atoms for Crystal Property Prediction
Shrimon Mukherjee, Madhusudan Ghosh, Partha Basuchowdhuri
arXiv - PHYS - Materials Science, 2024-09-07
DOI: arxiv-2409.04737 (https://doi.org/arxiv-2409.04737)
Abstract
Artificial intelligence (AI) has become ubiquitous in basic-science
research. Frequent use of machine learning (ML) and deep learning (DL)
based methodologies by researchers has resulted in significant advancements
in the last decade. These techniques have led to notable
performance enhancements in different tasks such as protein structure
prediction, drug-target binding affinity prediction, and molecular property
prediction. In the materials science literature, it is well known that crystalline
materials exhibit topological structures. Such structures may be
represented as graphs, and graph neural network (GNN) based
approaches can help encode them into an augmented representation space.
Primarily, such frameworks adopt supervised learning techniques targeted
towards downstream property prediction tasks on the basis of electronic
properties (formation energy, bandgap, total energy, etc.) and crystalline
structures. Generally, such frameworks rely heavily on handcrafted
atom feature representations along with the structural representations. In this
paper, we propose an unsupervised framework, CrysAtom, that uses unlabeled
crystal data to generate dense vector representations of atoms, which can be
used in existing GNN-based property predictor models to accurately predict
important properties of crystals. Empirical results show that our dense
representation embeds chemical properties of atoms and significantly enhances
the performance of the baseline property predictor models.
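
As a rough illustration of how dense atom vectors could stand in for handcrafted atom features at the input of a GNN-based predictor, the sketch below looks up one vector per atom and pools them. It is not the CrysAtom method: the embedding table is filled with random placeholder values rather than learned ones, and `EMBED_DIM`, `make_placeholder_embeddings`, `node_features`, and `mean_pool` are hypothetical names introduced only for this example.

```python
import random

EMBED_DIM = 8  # hypothetical embedding size, chosen only for illustration


def make_placeholder_embeddings(max_z=100, dim=EMBED_DIM, seed=0):
    """Stand-in for a pretrained atom embedding table.

    Values are random placeholders keyed by atomic number; a real table
    would come from an unsupervised pretraining step.
    """
    rng = random.Random(seed)
    return {z: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for z in range(1, max_z + 1)}


def node_features(atomic_numbers, table):
    """Look up one dense vector per atom (initial node features of a graph)."""
    return [table[z] for z in atomic_numbers]


def mean_pool(features):
    """Average the per-atom vectors into a single crystal-level vector."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


table = make_placeholder_embeddings()
nacl = [11, 17]                    # atomic numbers of Na and Cl
feats = node_features(nacl, table)
crystal_vec = mean_pool(feats)     # a downstream predictor would consume this
```

In a real pipeline the lookup would feed a message-passing network over the crystal graph rather than a plain mean; the pooling step here only shows where the atom vectors enter.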