Atom typing using graph representation learning: How do models learn chemistry?

Jun Zhang
{"title":"Atom typing using graph representation learning: How do models learn chemistry?","authors":"Jun Zhang","doi":"10.1063/5.0095008","DOIUrl":null,"url":null,"abstract":"Atom typing is the first step for simulating molecules using a force field. Automatic atom typing for an arbitrary molecule is often realized by rule-based algorithms, which have to manually encode rules for all types defined in this force field. These are time-consuming and force field-specific. In this study, a method that is independent of a specific force field based on graph representation learning is established for automatic atom typing. The topology adaptive graph convolution network (TAGCN) is found to be an optimal model. The model does not need manual enumeration of rules but can learn the rules just through training using typed molecules prepared during the development of a force field. The test on the CHARMM general force field gives a typing correctness of 91%. A systematic error of typing by TAGCN is its inability of distinguishing types in rings or acyclic chains. It originates from the fundamental structure of graph neural networks and can be fixed in a trivial way. More importantly, analysis of the rationalization processes of these models using layer-wise relation propagation reveals how TAGCN encodes rules learned during training. Our model is found to be able to type using the local chemical environments, in a way highly in accordance with chemists' intuition.","PeriodicalId":446961,"journal":{"name":"The Journal of chemical physics","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Journal of chemical physics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1063/5.0095008","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

Atom typing is the first step in simulating molecules with a force field. Automatic atom typing for an arbitrary molecule is usually realized by rule-based algorithms, which must manually encode rules for every type defined in the force field; this is time-consuming and force-field-specific. In this study, a force-field-independent method for automatic atom typing is established based on graph representation learning. The topology adaptive graph convolutional network (TAGCN) is found to be the optimal model. The model needs no manual enumeration of rules; it learns them solely by training on typed molecules prepared during the development of a force field. A test on the CHARMM General Force Field gives a typing correctness of 91%. A systematic error of TAGCN typing is its inability to distinguish types in rings from those in acyclic chains; this originates from the fundamental structure of graph neural networks and can be fixed in a trivial way. More importantly, analysis of the rationalization processes of these models using layer-wise relevance propagation reveals how TAGCN encodes the rules learned during training. Our model is found to assign types based on local chemical environments, in a way that closely matches chemists' intuition.
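The abstract frames atom typing as node classification on the molecular graph, solved with a topology adaptive graph convolutional network. The sketch below illustrates that setup, assuming PyTorch Geometric's TAGConv layer is available; the one-hot element features, layer widths, placeholder number of atom types, and the toy ethanol graph are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal sketch of TAGCN-based atom typing as node classification,
# assuming PyTorch Geometric (TAGConv). All sizes below are placeholders.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import TAGConv


class AtomTyper(torch.nn.Module):
    """Per-atom (node-level) classifier over force-field atom types."""

    def __init__(self, num_node_features: int, num_atom_types: int, k_hops: int = 3):
        super().__init__()
        # TAGConv aggregates information from up to k_hops bond-connected neighbors.
        self.conv1 = TAGConv(num_node_features, 64, K=k_hops)
        self.conv2 = TAGConv(64, num_atom_types, K=k_hops)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)  # one logit vector per atom


# Toy molecular graph: ethanol (CH3-CH2-OH), heavy atoms plus hydrogens.
# Node features are a crude one-hot over {H, C, O}; a real pipeline would
# add hybridization, formal charge, ring membership, etc.
elements = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
one_hot = {"H": [1, 0, 0], "C": [0, 1, 0], "O": [0, 0, 1]}
x = torch.tensor([one_hot[e] for e in elements], dtype=torch.float)

bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]
# Undirected bonds -> both directions in edge_index.
edge_index = torch.tensor(
    [[i for i, j in bonds] + [j for i, j in bonds],
     [j for i, j in bonds] + [i for i, j in bonds]],
    dtype=torch.long,
)

data = Data(x=x, edge_index=edge_index)
model = AtomTyper(num_node_features=3, num_atom_types=50)  # 50 is a placeholder count
logits = model(data.x, data.edge_index)                    # shape: [num_atoms, num_atom_types]
predicted_types = logits.argmax(dim=1)

# Training would minimize cross-entropy between these logits and the reference
# atom types already assigned during force-field development:
# loss = F.cross_entropy(logits, reference_type_labels)
```

In this framing, the training labels come from molecules typed during force-field development, and attribution methods such as layer-wise relevance propagation can then trace each predicted type back to the neighboring atoms that contributed to it.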