Empirical comparison and analysis of machine learning-based approaches for druggable protein identification.

IF 3.8 | Region 3 (Biology) | Q1 BIOLOGY

EXCLI Journal Pub Date : 2023-08-29 eCollection Date: 2023-01-01 DOI:10.17179/excli2023-6410

Watshara Shoombuatong, Nalini Schaduangrat, Jaru Nikom

Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10539545/pdf/

Citations: 0

Abstract

Efficiently and precisely identifying drug targets is crucial for developing and discovering potential medications. While conventional experimental approaches can accurately pinpoint these targets, they suffer from time constraints and are not easily adaptable to high-throughput processes. On the other hand, computational approaches, particularly those utilizing machine learning (ML), offer an efficient means to accelerate the prediction of druggable proteins based solely on their primary sequences. Recently, several state-of-the-art computational methods have been developed for predicting and analyzing druggable proteins. These computational methods showed high diversity in terms of benchmark datasets, feature extraction schemes, ML algorithms, evaluation strategies and webserver/software usability. Thus, our objective is to reexamine these computational approaches and conduct a comprehensive assessment of their strengths and weaknesses across multiple aspects. In this study, we deliver the first comprehensive survey regarding the state-of-the-art computational approaches for in silico prediction of druggable proteins. First, we provided information regarding the existing benchmark datasets and the types of ML methods employed. Second, we investigated the effectiveness of these computational methods in druggable protein identification for each benchmark dataset. Third, we summarized the important features used in this field and the existing webserver/software. Finally, we addressed the present constraints of the existing methods and offer valuable guidance to the scientific community in designing and developing novel prediction models. We anticipate that this comprehensive review will provide crucial information for the development of more accurate and efficient druggable protein predictors.
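
To make "feature extraction from primary sequences" concrete, here is a minimal sketch of one common sequence-derived descriptor, amino acid composition (AAC), which maps a protein of any length to a fixed 20-dimensional vector that an ML classifier can consume. It is an illustration only and is not taken from the reviewed paper; the function name `amino_acid_composition` and the example sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every protein
# maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper for illustration only. Non-standard residue
    codes (for example X, B, Z) are ignored.
    """
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # One fraction per amino acid, in the order given by AMINO_ACIDS.
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Made-up 10-residue sequence, not a real druggable protein.
    features = amino_acid_composition("MKTAYIAKQR")
    print(len(features))
```

Vectors like this could then be passed, together with druggable/non-druggable labels from a benchmark dataset, to any standard classifier; which descriptors and algorithms the surveyed methods actually use is covered in the paper itself.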

Source journal: EXCLI Journal (BIOLOGY)

CiteScore: 8.00
Self-citation rate: 2.20%
Publication volume: not listed
Review time: 6-12 weeks

Journal description: EXCLI Journal publishes original research reports, authoritative reviews and case reports of experimental and clinical sciences. The journal is particularly keen to keep a broad view of science and technology, and therefore welcomes papers which bridge disciplines and may not suit the narrow specialism of other journals. Although the general emphasis is on biological sciences, studies from the following fields are explicitly encouraged (alphabetical order): aging research, behavioral sciences, biochemistry, cell biology, chemistry including analytical chemistry, clinical and preclinical studies, drug development, environmental health, ergonomics, forensic medicine, genetics, hepatology and gastroenterology, immunology, neurosciences, occupational medicine, oncology and cancer research, pharmacology, proteomics, psychiatric research, psychology, systems biology, toxicology.