{"title":"Yoked learning in molecular data science","authors":"Zhixiong Li, Yan Xiang, Yujing Wen, Daniel Reker","doi":"10.1016/j.ailsci.2023.100089","DOIUrl":null,"url":null,"abstract":"<div><p>Active machine learning is an established and increasingly popular experimental design technique where the machine learning model can request additional data to improve the model's predictive performance. It is generally assumed that this data is optimal for the machine learning model since it relies on the model's predictions or model architecture and therefore cannot be transferred to other models. Inspired by research in pedagogy, we here introduce the concept of yoked machine learning where a second machine learning model learns from the data selected by another model. We found that in 48% of the benchmarked combinations, yoked learning performed similar or better than active learning. We analyze distinct cases in which yoked learning can improve active learning performance. In particular, we prototype yoked deep learning (YoDeL) where a classic machine learning model provides data to a deep neural network, thereby mitigating challenges of active deep learning such as slow refitting time per learning iteration and poor performance on small datasets. In summary, we expect the new concept of yoked (deep) learning to provide a competitive option to boost the performance of active learning and benefit from distinct capabilities of multiple machine learning models during data acquisition, training, and deployment.</p></div>","PeriodicalId":72304,"journal":{"name":"Artificial intelligence in the life sciences","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667318523000338/pdfft?md5=798e4cffb7539da96cce07297e51e3de&pid=1-s2.0-S2667318523000338-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial intelligence in the life sciences","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2667318523000338","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Active machine learning is an established and increasingly popular experimental design technique in which a machine learning model can request additional data to improve its predictive performance. It is generally assumed that this data is optimal only for the requesting model, since the selection relies on that model's predictions or architecture, and therefore cannot be transferred to other models. Inspired by research in pedagogy, here we introduce the concept of yoked machine learning, in which a second machine learning model learns from the data selected by another model. We found that in 48% of the benchmarked combinations, yoked learning performed similarly to or better than active learning. We analyze distinct cases in which yoked learning can improve active learning performance. In particular, we prototype yoked deep learning (YoDeL), in which a classic machine learning model provides data to a deep neural network, thereby mitigating challenges of active deep learning such as slow refitting per learning iteration and poor performance on small datasets. In summary, we expect the new concept of yoked (deep) learning to provide a competitive option to boost the performance of active learning and to benefit from the distinct capabilities of multiple machine learning models during data acquisition, training, and deployment.
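The yoked learning protocol described in the abstract can be summarized compactly. Below is a minimal sketch, assuming a scikit-learn-style binary classification task with an uncertainty-based query strategy; the model choices (a random forest as the active selector, a small neural network as the yoked learner), the synthetic dataset, and all hyperparameters are illustrative assumptions, not the authors' benchmark setup.

```python
# Minimal sketch of yoked learning, assuming a scikit-learn-style workflow.
# Model choices, the dataset, and the uncertainty-based query strategy are
# illustrative assumptions, not the authors' exact benchmark protocol.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
rng = np.random.default_rng(0)

# Seed set: a few labeled examples from each class so predict_proba
# returns probabilities for both classes.
labeled = [int(i) for c in (0, 1)
           for i in rng.choice(np.where(y == c)[0], size=5, replace=False)]
pool = [i for i in range(len(X)) if i not in labeled]

active = RandomForestClassifier(random_state=0)      # selects the data
yoked = MLPClassifier(max_iter=500, random_state=0)  # learns from the same data

for _ in range(20):  # data acquisition iterations
    active.fit(X[labeled], y[labeled])
    # Query the pool point the active model is least certain about.
    proba = active.predict_proba(X[pool])
    query = pool[int(np.argmin(np.abs(proba[:, 1] - 0.5)))]
    labeled.append(query)
    pool.remove(query)

# The yoked model never selected any data itself; it trains only on the
# active model's selections (cf. YoDeL: a classic model feeds a neural net).
yoked.fit(X[labeled], y[labeled])
```

Note how this pattern reflects the YoDeL motivation stated above: the neural network is refit once on the accumulated selections rather than at every acquisition step, avoiding the slow per-iteration refitting that burdens active deep learning.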
Journal: Artificial Intelligence in the Life Sciences. Subject areas: Pharmacology, Biochemistry, Genetics and Molecular Biology (General); Computer Science Applications; Health Informatics; Drug Discovery; Veterinary Science and Veterinary Medicine (General)