PREVAIL: Pre-trained Variational Adversarial Active Learning for Molecular Property Prediction

Linjie Li, Yi Xiao, Dewei Ma, Kai Zheng
Published in: 2022 IEEE 8th International Conference on Cloud Computing and Intelligent Systems (CCIS)
Publication date: 2022-11-26
DOI: https://doi.org/10.1109/CCIS57298.2022.10016422
Citations: 0

Abstract

Molecular property prediction is a fundamental task in drug discovery. Most current high-performing methods for this task are built on deep learning and therefore rely on large amounts of labeled data. However, accurate annotation of molecular properties is time-consuming and expensive. Because different samples usually contribute unequally to model training, we propose pre-trained variational adversarial active learning (PREVAIL), which queries the most informative samples for annotation and thereby reduces annotation cost. Unlike previous active learning methods, whose initial set is sampled randomly, PREVAIL selects the most informative initial dataset using an autoencoder and the K-Center greedy algorithm, which can avoid biases that affect the accuracy of early decision-making. Furthermore, PREVAIL adapts simultaneously to the distribution of molecules and to the prediction task by incorporating the loss information of the molecular property prediction task into the latent space through task-aware variational adversarial active learning. Our benchmark experiments demonstrate that PREVAIL outperforms state-of-the-art active learning methods on molecular property prediction tasks.
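To make the initial-set selection step more concrete, the following is a minimal sketch of K-Center greedy (farthest-first) selection over latent vectors. It is not the authors' implementation: the abstract does not state the distance metric, the starting point, or the latent dimensionality, so the Euclidean distance, the `first` parameter, the function name `k_center_greedy`, and the toy 2-D points (standing in for autoencoder embeddings of molecules) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def k_center_greedy(points: Sequence[Sequence[float]], k: int, first: int = 0) -> List[int]:
    """Return k indices chosen by farthest-first traversal.

    Assumption: Euclidean distance and a fixed starting index; the paper's
    actual choices are not given in the abstract.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [first]
    # min_dist[i] is the distance from point i to its nearest selected center.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        # Pick the point that is currently farthest from every selected center.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-center distances with the newly added center.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Hypothetical 2-D latent codes: two tight clusters and one outlier.
latent = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 9.0)]
initial_set = k_center_greedy(latent, k=3)
print(initial_set)  # [0, 4, 3]
```

With these toy points the selection takes one point from each region of the latent space rather than two near-duplicates from the same cluster, which is the kind of coverage a random initial sample does not guarantee.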