Machine learning in the analysis of biomolecular simulations
Authors: Shreyas S. Kaptan, I. Vattulainen
Journal: Advances in Physics: X (Journal Article)
Published: 2022-01-10
DOI: https://doi.org/10.1080/23746149.2021.2006080
Citations: 10
Abstract
Machine learning has rapidly become a key method for the analysis and organization of large-scale data in all scientific disciplines. In life sciences, the use of machine learning techniques is a particularly appealing idea since the enormous capacity of computational infrastructures generates terabytes of data through millisecond simulations of atomistic and molecular-scale biomolecular systems. Due to this explosion of data, the automation, reproducibility, and objectivity provided by machine learning methods are highly desirable features in the analysis of complex systems. In this review, we focus on the use of machine learning in biomolecular simulations. We discuss the main categories of machine learning tasks, such as dimensionality reduction, clustering, regression, and classification used in the analysis of simulation data. We then introduce the most popular classes of techniques involved in these tasks for the purpose of enhanced sampling, coordinate discovery, and structure prediction. Whenever possible, we explain the scope and limitations of machine learning approaches, and we discuss examples of applications of these techniques.
About the journal:
Advances in Physics: X is a fully open-access journal that promotes the centrality of physics and physical measurement to modern science and technology. Advances in Physics: X aims to demonstrate the interconnectivity of physics, meaning the intellectual relationships that exist between one branch of physics and another, as well as the influence of physics across (hence the “X”) traditional boundaries into other disciplines including:
Chemistry
Materials Science
Engineering
Biology
Medicine