Pre-trained Language Models for Protein and Molecular Design
Authors: Erdong Zhang, Calvin Yu-Chian Chen, Zilin Pan, Zequan Yao, Tiejun Dong, Guan-Xing Chen, Tingwen Deng, Shiwei Chen
Journal: Physical Chemistry Chemical Physics
Publication date: 2025-06-16
DOI: 10.1039/d5cp00785b (https://doi.org/10.1039/d5cp00785b)
Pre-trained Language Models (PLMs) have recently emerged as a powerful tool, showcasing exceptional performance not just in natural language understanding but also in the realm of biological research. The advantage of PLMs lies in their ability to leverage the structural similarity between biological sequences and natural language. PLMs offer novel solutions for protein research and drug design applications. By pre-training on extensive unlabeled biological sequences and then fine-tuning for specific tasks, PLMs have delivered remarkable results. To summarize the growing landscape of PLMs in biological research, this paper integrates exemplary PLMs and common datasets, demonstrating the potential and application prospects of PLMs in prediction and generation tasks.
About the journal:
Physical Chemistry Chemical Physics (PCCP) is an international journal co-owned by 19 physical chemistry and physics societies from around the world. This journal publishes original, cutting-edge research in physical chemistry, chemical physics and biophysical chemistry. To be suitable for publication in PCCP, articles must include significant innovation and/or insight into physical chemistry; this is the most important criterion that reviewers and Editors will judge against when evaluating submissions.
The journal has a broad scope and welcomes contributions spanning experiment, theory, computation and data science. Topical coverage includes spectroscopy, dynamics, kinetics, statistical mechanics, thermodynamics, electrochemistry, catalysis, surface science, quantum mechanics, quantum computing and machine learning. Interdisciplinary research areas such as polymers and soft matter, materials, nanoscience, energy, surfaces/interfaces, and biophysical chemistry are welcomed if they demonstrate significant innovation and/or insight into physical chemistry. Joint experimental/theoretical studies are particularly appreciated when complementary and based on up-to-date approaches.