Accelerating metal–organic framework discovery via synthesisability prediction: the MFD evaluation method for one-class classification models†

Impact factor: 6.2 · JCR Q1 (Chemistry, Multidisciplinary)
Chi Zhang, Dmytro Antypov, Matthew J. Rosseinsky and Matthew S. Dyer
Journal: Digital Discovery, volume 12, pages 2509-2522
Published: 2024-10-22
DOI: 10.1039/D4DD00161C
Article: https://pubs.rsc.org/en/content/articlelanding/2024/dd/d4dd00161c
PDF: https://pubs.rsc.org/en/content/articlepdf/2024/dd/d4dd00161c?page=search

Abstract

Machine learning has found wide application in the materials field, particularly in discovering structure–property relationships. However, its potential in predicting synthetic accessibility of materials remains relatively unexplored due to the lack of negative data. In this study, we employ several one-class classification (OCC) approaches to accelerate the development of novel metal–organic framework materials by predicting their synthesisability. The evaluation of OCC model performance poses challenges, as traditional evaluation metrics are not applicable when dealing with a single type of data. To overcome this limitation, we introduce a quantitative approach, the maximum fraction difference (MFD) method, to assess and compare model performance, as well as determine optimal thresholds for effectively distinguishing between positives and negatives. A DeepSVDD model with superior predictive capability is proposed. By combining assessment of synthetic viability with porosity prediction models, a list of 3453 unreported combinations is generated and characterised by predictions of high synthesisability and large pore size. The MFD methodology proposed in this study is intended to provide an effective complementary assessment method for addressing the inherent challenges in evaluating OCC models. The research process, developed models, and predicted results of this study are aimed at helping prioritisation of materials for synthesis.
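The abstract names the maximum fraction difference (MFD) method but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: for each candidate threshold, compare the fraction of known (positive) samples that a model accepts with the fraction of unlabelled candidates it accepts, then take the largest gap and the threshold where it occurs. The function names, the "higher score means more synthesisable" convention, and the toy scores are all assumptions made for this example and are not taken from the paper. For a distance-based model such as DeepSVDD, where smaller distances indicate more typical samples, the scores would need to be negated first.

```python
# Hypothetical sketch of a "maximum fraction difference" style metric.
# Assumption: higher score = more likely synthesisable. The paper's exact
# definition may differ; this is not the authors' implementation.

def fraction_at_or_above(scores, threshold):
    """Fraction of scores that would be accepted at this threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def max_fraction_difference(pos_scores, unl_scores):
    """Return (largest gap, threshold at which it occurs).

    pos_scores: model scores for known, already synthesised samples.
    unl_scores: model scores for unlabelled (hypothetical) candidates.
    """
    best_gap, best_threshold = float("-inf"), None
    # Every observed score is a candidate threshold.
    for t in sorted(set(pos_scores) | set(unl_scores)):
        gap = fraction_at_or_above(pos_scores, t) - fraction_at_or_above(unl_scores, t)
        if gap > best_gap:
            best_gap, best_threshold = gap, t
    return best_gap, best_threshold


# Toy scores, invented purely for illustration.
known = [0.9, 0.8, 0.7, 0.4]
candidates = [0.85, 0.5, 0.3, 0.2, 0.1]
gap, threshold = max_fraction_difference(known, candidates)
print(f"largest gap {gap:.2f} at threshold {threshold}")
```

Under this reading, a larger gap means the model accepts most known materials while rejecting more of the unlabelled pool, and the threshold at the maximum is a natural cut-off for separating predicted positives from predicted negatives. Two models can then be compared by their largest gaps without needing labelled negative data.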

