Toward high-efficiency, low-resource, and explainable neuropeptide prediction with MSKDNP
Authors: Peilin Xie, Jiahui Guan, Zhihao Zhao, Yulan Liu, Zhang Cheng, Xuxin He, Xingchen Liu, Yun Tang, Zhenglong Sun, Tzong-Yi Lee, Lantian Yao, Ying-Chih Chiang
Journal: Briefings in Bioinformatics, 26(5), published 2025-08-31
DOI: https://doi.org/10.1093/bib/bbaf466
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12423397/pdf/
Impact factor: 7.7; JCR: Q1 (Biochemical Research Methods)
Citation count: 0
Abstract
Neuropeptides are essential signaling molecules produced in the nervous system that regulate diverse physiological processes and are closely implicated in the pathogenesis of neurodegenerative and neuropsychiatric disorders. Investigating neuropeptides contributes to a better understanding of their regulatory mechanisms and offers new insights into therapeutic strategies for related diseases. Therefore, accurate identification of neuropeptides is crucial for advancing biomedical research and drug development. Due to the high cost of experimental validation, various artificial intelligence methods have been developed for rapid neuropeptide identification. However, existing approaches often suffer from high computational resource consumption, slow processing speed, and poor deployability. Moreover, a user-friendly web server for practical application is still lacking. To this end, we propose MSKDNP, a neuropeptide prediction model based on a multi-stage knowledge distillation framework. Using only 1.2% of the parameters of a fully fine-tuned protein language model, MSKDNP attains comparable performance while achieving state-of-the-art results in neuropeptide recognition. Moreover, MSKDNP provides favorable interpretability, facilitating biological understanding. A freely accessible web server is available at https://awi.cuhk.edu.cn/~biosequence/MSKDNP/index.php.
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.