Beyond theory driven discovery: introducing hot random search and datum derived structures

Impact factor 3.3 · Zone 3 (Chemistry) · JCR Q2 (Chemistry, Physical)
Chris J. Pickard
Journal: Faraday Discussions
Publication date: 2024-08-06
Publication type: Journal Article
DOI: 10.1039/d4fd00134f (https://doi.org/10.1039/d4fd00134f)
Citations: 0

Abstract

Data driven methods have transformed the prospects of the computational chemical sciences, with machine learned interatomic potentials (MLIPs) speeding up calculations by several orders of magnitude. I reflect on theory driven, as opposed to data driven, discovery based on ab initio random structure searching (AIRSS), and then introduce two new methods which exploit machine learning acceleration. First, I show how long, high throughput anneals, performed between direct structural relaxations and enabled by ephemeral data derived potentials (EDDPs), can be incorporated into AIRSS to bias the sampling of challenging systems towards low energy configurations. Hot AIRSS (hot-AIRSS) preserves the parallel advantage of random search while allowing much more complex systems to be tackled. This is demonstrated through searches for complex boron structures in large unit cells. Second, I show how low energy carbon structures can be generated directly from a single, experimentally determined diamond structure. In an extension of the generation of random sensible structures, candidates are stochastically generated and then optimised to minimise the difference between their EDDP environment vectors and that of the reference diamond structure. The distance-based cost function is captured in an actively learned EDDP. Graphite, small nanotubes and caged, fullerene-like structures emerge from searches using this potential, along with a rich variety of tetrahedral framework structures. Using the same approach, the garnet structure of pyrope, Mg3Al2(SiO4)3, is recovered from a low energy AIRSS structure generated in a smaller unit cell with a different chemical composition. The relationship of this approach to modern diffusion model based generative methods is discussed.
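The relax, anneal, relax cycle described for hot-AIRSS can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper or from the AIRSS code: a Lennard-Jones cluster stands in for an EDDP and a periodic cell, Metropolis Monte Carlo stands in for the anneal, steepest descent stands in for the structural relaxation, and all function names and parameter values (`random_structure`, `relax`, `anneal`, `hot_airss_trial`, the temperature, the minimum separation) are hypothetical.

```python
import math
import random


def lj_energy(pos):
    """Lennard-Jones energy (epsilon = sigma = 1) of a free cluster.
    Toy stand-in for an EDDP; not the potential used in the paper."""
    energy = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            inv6 = 1.0 / math.dist(pos[i], pos[j]) ** 6
            energy += 4.0 * (inv6 * inv6 - inv6)
    return energy


def lj_gradient(pos):
    """Analytic gradient of lj_energy with respect to every coordinate."""
    grad = [[0.0, 0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = [a - b for a, b in zip(pos[i], pos[j])]
            inv2 = 1.0 / sum(x * x for x in d)
            inv6 = inv2 ** 3
            coef = -24.0 * (2.0 * inv6 * inv6 - inv6) * inv2
            for k in range(3):
                grad[i][k] += coef * d[k]
                grad[j][k] -= coef * d[k]
    return grad


def random_structure(n_atoms, rng, box=1.5, min_sep=0.9):
    """Random 'sensible' start: uniform positions with a minimum separation."""
    pos = []
    while len(pos) < n_atoms:
        trial = [rng.uniform(-box, box) for _ in range(3)]
        if all(math.dist(trial, p) >= min_sep for p in pos):
            pos.append(trial)
    return pos


def relax(pos, max_steps=400, tol=1e-4, max_disp=0.1):
    """Steepest descent with backtracking; only downhill steps are accepted."""
    pos = [p[:] for p in pos]
    energy = lj_energy(pos)
    step = 0.01
    for _ in range(max_steps):
        grad = lj_gradient(pos)
        gmax = max(abs(g) for row in grad for g in row)
        if gmax < tol:
            break
        step = min(step, max_disp / gmax)  # cap the largest displacement
        while step > 1e-10:
            trial = [[x - step * g for x, g in zip(p, row)]
                     for p, row in zip(pos, grad)]
            e_trial = lj_energy(trial)
            if e_trial < energy:
                pos, energy = trial, e_trial
                step *= 1.5
                break
            step *= 0.5
        else:
            break  # no downhill step found
    return pos, energy


def anneal(pos, rng, temperature=0.3, moves=2000, max_move=0.15, box=3.0):
    """Fixed-temperature Metropolis Monte Carlo (assumed stand-in anneal)."""
    pos = [p[:] for p in pos]
    energy = lj_energy(pos)
    for _ in range(moves):
        i = rng.randrange(len(pos))
        old = pos[i]
        pos[i] = [x + rng.uniform(-max_move, max_move) for x in old]
        if any(abs(x) > box for x in pos[i]):
            pos[i] = old  # keep the cluster confined
            continue
        e_new = lj_energy(pos)
        if e_new <= energy or rng.random() < math.exp((energy - e_new) / temperature):
            energy = e_new
        else:
            pos[i] = old
    return pos


def hot_airss_trial(n_atoms, rng):
    """One independent trial: random structure -> relax -> anneal -> relax."""
    _, e_start = (start := random_structure(n_atoms, rng)), None
    cold_pos, e_direct = relax(start)
    _, e_hot = relax(anneal(cold_pos, rng))
    return e_direct, e_hot


if __name__ == "__main__":
    rng = random.Random(0)
    for trial in range(5):
        e_direct, e_hot = hot_airss_trial(6, rng)
        print(f"trial {trial}: direct relax {e_direct:.4f}, after anneal {e_hot:.4f}")
```

Each trial is independent of the others, which is the property that lets random search run in parallel. Comparing the two energies per trial shows whether the anneal found a lower minimum in this toy setting; no particular outcome is claimed here, and the behaviour with a real EDDP on bulk systems may differ.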
Source journal: Faraday Discussions (Chemistry, Physical Chemistry)
Self-citation rate: 0.00%
Articles published: 259
About the journal: Discussion summary and research papers from discussion meetings that focus on rapidly developing areas of physical chemistry and its interfaces.