Big data and deep learning for RNA biology
Hyeonseo Hwang, Hyeonseong Jeon, Nagyeong Yeo, Daehyun Baek
Experimental and Molecular Medicine 56(6), pages 1293-1321. Published 2024-06-14.
DOI: 10.1038/s12276-024-01243-w
https://www.nature.com/articles/s12276-024-01243-w
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11263376/pdf/
Abstract
The exponential growth of big data in RNA biology (RB) has led to the development of deep learning (DL) models that have driven crucial discoveries. As constantly evidenced by DL studies in other fields, the successful implementation of DL in RB depends heavily on the effective utilization of large-scale datasets from public databases. In achieving this goal, data encoding methods, learning algorithms, and techniques that align well with biological domain knowledge have played pivotal roles. In this review, we provide guiding principles for applying these DL concepts to various problems in RB by demonstrating successful examples and associated methodologies. We also discuss the remaining challenges in developing DL models for RB and suggest strategies to overcome these challenges. Overall, this review aims to illuminate the compelling potential of DL for RB and ways to apply this powerful technology to investigate the intriguing biology of RNA more effectively.

This review spotlights the revolutionary role of deep learning (DL) in expanding the understanding of RNA. RNA is a fundamental biomolecule that shapes and regulates diverse phenotypes including human diseases. Understanding the principles governing the functions of RNA is a key objective of current biology. Recently, big data produced via high-throughput experiments have been utilized to develop DL models aimed at analyzing and predicting RNA-related biological processes. This review emphasizes the role of public databases in providing these big data for training DL models. The authors introduce core DL concepts necessary for training models from the biological data. By extensively examining DL studies in various fields of RNA biology, the authors suggest how to better leverage DL for revealing novel biological knowledge and demonstrate the potential of DL in deciphering the complex biology of RNA. This summary was initially drafted using artificial intelligence, then revised and fact-checked by the author.
About the journal:
Experimental & Molecular Medicine (EMM) stands as Korea's pioneering biochemistry journal, established in 1964 and rejuvenated in 1996 as an Open Access, fully peer-reviewed international journal. Dedicated to advancing translational research and showcasing recent breakthroughs in the biomedical realm, EMM invites submissions encompassing genetic, molecular, and cellular studies of human physiology and diseases. Emphasizing the correlation between experimental and translational research and enhanced clinical benefits, the journal actively encourages contributions employing specific molecular tools. Welcoming studies that bridge basic discoveries with clinical relevance, alongside articles demonstrating clear in vivo significance and novelty, Experimental & Molecular Medicine proudly serves as an open-access, online-only repository of cutting-edge medical research.