cidalsDB: an AI-empowered platform for anti-pathogen therapeutics research

IF 7.1 | Zone 2 (Chemistry) | JCR Q1, CHEMISTRY, MULTIDISCIPLINARY
Emna Harigua-Souiai, Ons Masmoudi, Samer Makni, Rafeh Oualha, Yosser Z. Abdelkrim, Sara Hamdi, Oussama Souiai, Ikram Guizani
Journal of Cheminformatics, volume 16 (1). Published 2024-11-28. DOI: 10.1186/s13321-024-00929-7
Article: https://link.springer.com/article/10.1186/s13321-024-00929-7
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00929-7
Citations: 0

Abstract

Computer-aided drug discovery (CADD) benefits from recent advances in big data analytics and Artificial Intelligence (AI) that aim to improve drug discovery (DD) outcomes. In this context, reliable datasets are of utmost importance. We herein present CidalsDB, a novel web server for AI-assisted DD against infectious pathogens, namely Leishmania parasites and Coronaviruses. We performed a literature search on molecules with validated anti-pathogen effects. Then, we consolidated these data with bioassays from PubChem. Finally, we constructed a database to store these datasets and make them accessible and ready-to-use for the scientific community through CidalsDB, a web-based interface. In a second step, we implemented and optimized four machine learning (ML) and three deep learning (DL) algorithms to predict the biological activity of molecules. Random Forests (RF), Multi-Layer Perceptron (MLP) and ChemBERTa were the best classifiers of anti-Leishmania molecules, while Gradient Boosting (GB), Graph-Convolutional Network (GCN) and ChemBERTa achieved the best performance on the Coronaviruses dataset. All six models were optimized and deployed through CidalsDB as anti-pathogen activity prediction models.

Scientific contribution

CidalsDB is an open access web-based tool that allows browsing and access to ready-to-use datasets of anti-pathogen molecules, alongside best performing AI models for biological activity prediction. It offers a democratized no-code platform for AI-based CADD, which shall foster innovation and collaboration within the DD community. CidalsDB is accessible through https://cidalsdb.streamlit.app/.
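
The abstract mentions consolidating literature-curated molecules with PubChem bioassay records but does not say how overlapping or conflicting entries were handled. The sketch below illustrates one possible policy, purely as an assumption: molecules are keyed by a SMILES string that is assumed to be already canonical, literature labels override PubChem labels, and PubChem molecules with contradictory labels and no literature entry are dropped. The `Record` type, the `consolidate` function, and the toy labels are hypothetical and are not taken from CidalsDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    smiles: str   # molecule key; assumed to be an already-canonical SMILES
    active: bool  # binary activity label
    source: str   # "literature" or "pubchem"


def consolidate(literature, pubchem):
    """Merge two record lists into one labeled record per molecule.

    Assumed policy (not from the paper): literature labels win over
    PubChem labels; PubChem molecules with conflicting labels are
    dropped unless the literature provides a label for them.
    """
    merged = {}
    conflicts = set()
    for rec in pubchem:
        prev = merged.get(rec.smiles)
        if prev is not None and prev.active != rec.active:
            conflicts.add(rec.smiles)
        merged[rec.smiles] = rec
    # Discard PubChem molecules whose assay labels disagree.
    for smiles in conflicts:
        del merged[smiles]
    # Literature entries are applied last so they override PubChem.
    for rec in literature:
        merged[rec.smiles] = rec
    return sorted(merged.values(), key=lambda r: r.smiles)


# Toy records; the activity labels are invented for illustration only.
literature = [Record("CCO", True, "literature")]
pubchem = [
    Record("CCO", False, "pubchem"),
    Record("CCN", True, "pubchem"),
    Record("CCC", True, "pubchem"),
    Record("CCC", False, "pubchem"),
]

for rec in consolidate(literature, pubchem):
    print(rec.smiles, rec.active, rec.source)
```

With the toy input, "CCC" is dropped because its two PubChem labels disagree, and "CCO" keeps its literature label. A real pipeline would also need to canonicalize structures with a cheminformatics toolkit before using them as keys, which this sketch leaves out.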

Source journal
Journal of Cheminformatics (CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: 82
Review time: 3 months
About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.