SemiH: DFT Hamiltonian neural network training with semi-supervised learning

IF 4.6; Zone 2, Physics and Astrophysics; Q1, Computer Science, Artificial Intelligence

Machine Learning Science and Technology Pub Date : 2024-09-02 DOI:10.1088/2632-2153/ad7227

Yucheol Cho, Guenseok Choi, Gyeongdo Ham, Mincheol Shin, Daeshik Kim


Citations: 0

Abstract

Over the past decades, density functional theory (DFT) calculations have been used in fields such as materials science and semiconductor device research. However, because DFT calculations rigorously account for interactions between atoms, they are computationally expensive. To address this, extensive recent research has focused on training neural networks to replace DFT calculations. Previous training methods, however, required a large number of DFT simulations to obtain the ground truth (Hamiltonians). When training data are limited, deep learning models often show increased errors in predicting Hamiltonians and band structures for test data. This risks inaccurate physical interpretations, including the emergence of unphysical branches within band structures. To tackle this challenge, we propose a novel deep learning-based method for calculating DFT Hamiltonians, specifically tailored to produce accurate results with limited training data. Our framework not only employs supervised learning with the calculated Hamiltonians but also generates pseudo Hamiltonians (targets for unlabeled data) and trains the neural networks on unlabeled data. Notably, this use of unlabeled data marks the first such attempt in the field of neural network Hamiltonians. Our framework outperforms the state-of-the-art approach across various datasets, such as MoS₂, Bi₂Te₃, HfO₂, and InGaAs. Moreover, by effectively utilizing unlabeled data, it generalizes better, achieving noteworthy results when evaluated on data more complex than the training set, such as configurations with more atoms and temperatures outside the training range.
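The abstract describes combining supervised learning on calculated Hamiltonians with training on pseudo Hamiltonians generated for unlabeled data. It does not say how the pseudo Hamiltonians are produced or how the two terms are weighted, so the sketch below only illustrates the generic pseudo-labeling idea and is not the authors' implementation. The names `semi_supervised_loss`, `student`, `teacher`, and `unlabeled_weight`, the use of a separate teacher model as the source of pseudo targets, and the flattening of Hamiltonian matrix elements into plain lists are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Model = Callable[[Sequence[float]], Vector]  # structure features -> flattened Hamiltonian


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally long vectors."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def semi_supervised_loss(
    student: Model,
    teacher: Model,
    labeled: Sequence[Tuple[Vector, Vector]],
    unlabeled: Sequence[Vector],
    unlabeled_weight: float = 0.5,  # hypothetical weighting, not taken from the paper
) -> float:
    """Supervised loss on DFT-labeled data plus a weighted pseudo-label loss.

    labeled:   (features, DFT Hamiltonian) pairs
    unlabeled: features only; the teacher supplies the pseudo Hamiltonian
    """
    # Supervised term: student prediction vs. the calculated Hamiltonian.
    supervised = sum(mse(student(x), h) for x, h in labeled) / len(labeled)
    if not unlabeled:
        return supervised
    # Unsupervised term: student prediction vs. the teacher's pseudo Hamiltonian.
    pseudo = sum(mse(student(x), teacher(x)) for x in unlabeled) / len(unlabeled)
    return supervised + unlabeled_weight * pseudo


# Toy usage with linear stand-ins for the networks (assumed, for illustration only).
toy_student: Model = lambda x: [0.5 * v for v in x]
toy_teacher: Model = lambda x: [1.0 * v for v in x]
toy_labeled = [([1.0, 2.0], [1.0, 2.0])]
toy_unlabeled = [[2.0, 4.0]]

print(semi_supervised_loss(toy_student, toy_teacher, toy_labeled, toy_unlabeled))  # 1.875
```

In a real training loop the student's parameters would be updated to minimize this combined loss, and the quality of the pseudo targets would determine how much the unlabeled term helps.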


Source journal

Machine Learning Science and Technology Computer Science-Artificial Intelligence

CiteScore

9.10

Self-citation rate

4.40%

Articles published

Review time

5 weeks

About the journal: Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory motivated by physical insights. Specifically, articles must fall into one of the following categories: they advance the state of machine learning-driven applications in the sciences, or they make conceptual, methodological, or theoretical advances in machine learning with applications to, inspiration from, or motivation by scientific problems.