DDINet: Drug-drug interaction prediction network based on multi-molecular fingerprint features and multi-head attention centered weighted autoencoder.

IF 0.7 | Zone 4, Biology | JCR Q4, MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology, Vol. 23, No. 1, Article 2550003 | Pub Date: 2025-02-01 | DOI: 10.1142/S0219720025500039

K Soni Sharmila, Thanga Revathi S, Pokkuluri Kiran Sree

Citations: 0

Abstract

Drug-drug interactions (DDIs) are a major concern in polypharmacy because they can cause unexpected side effects that harm a patient's health. It is therefore crucial to identify DDIs effectively during the early stages of drug discovery and development. In this paper, a novel DDI prediction network (DDINet) is proposed to improve predictive performance over conventional DDI methods. Drugs from the DrugBank dataset are represented as Simplified Molecular Input Line-Entry System (SMILES) strings, which the RDKit software pre-processes into their canonical forms. Five molecular fingerprinting techniques, namely Extended Connectivity Fingerprints (ECFPs), Molecular ACCess System keys (MACCSkeys), PubChem fingerprints, 3D molecular fingerprints (3D-FP), and molecular dynamics fingerprints (MDFPs), are employed to encode drug chemical structures into feature vectors. Drug similarities are computed using the Tanimoto coefficient (TC), and the final Structural Similarity Profile (SSP) is obtained by averaging the similarities over the five fingerprint types. The novelty of the approach lies in the use of a Multi-head Attention centered Weighted Autoencoder (Mul_WAE) as the interaction prediction module, whose Multi-head Attention (MHA) layer focuses on the most significant input features. Furthermore, we introduce the Upgraded Bald Eagle Search Optimization (UBesO) algorithm, which selects the learnable parameters of the Mul_WAE based on cross-entropy loss, improving the model's convergence and performance. The proposed DDINet model achieves an accuracy of 99.77%, an AUC of 99.66%, an average precision of 99.5%, a precision of 99.4%, and a recall of 99.49%; these metrics are reported together to give a comprehensive evaluation of the model's robustness. Beyond high accuracy, DDINet offers advantages in scalability: its efficient feature extraction and optimization processes make it well suited to large datasets. The combination of multiple molecular fingerprinting methods with the MHA layer and the UBesO algorithm is the innovative aspect of our model and significantly improves prediction performance compared to existing approaches.
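The similarity step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The abstract states that fingerprints are derived from canonical SMILES with RDKit and that five fingerprint types are averaged; the code below instead uses hand-written toy fingerprints (sets of on-bit indices) for two hypothetical fingerprint types and three hypothetical drugs, and it assumes that the SSP is the element-wise mean of the per-fingerprint Tanimoto similarities. The names `tanimoto`, `structural_similarity_profile`, and `TOY_FPS` are illustrative only.

```python
from typing import Dict, FrozenSet, List

Fingerprint = FrozenSet[int]  # indices of the bits that are set to 1


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient |A & B| / |A | B| of two binary fingerprints."""
    union = len(a | b)
    # Assumption: two empty fingerprints get similarity 0.0 to avoid 0/0.
    return len(a & b) / union if union else 0.0


def structural_similarity_profile(
    fps_by_type: Dict[str, Dict[str, Fingerprint]], drugs: List[str]
) -> Dict[str, Dict[str, float]]:
    """Average the pairwise Tanimoto similarity over all fingerprint types."""
    n_types = len(fps_by_type)
    ssp: Dict[str, Dict[str, float]] = {d: {} for d in drugs}
    for i in drugs:
        for j in drugs:
            total = sum(tanimoto(fps[i], fps[j]) for fps in fps_by_type.values())
            ssp[i][j] = total / n_types
    return ssp


# Hypothetical toy data: two fingerprint types instead of the paper's five,
# with made-up on-bit indices rather than real RDKit output.
TOY_FPS: Dict[str, Dict[str, Fingerprint]] = {
    "ecfp_like": {
        "drug_a": frozenset({1, 2}),
        "drug_b": frozenset({1, 2, 3, 4}),
        "drug_c": frozenset({5, 6}),
    },
    "maccs_like": {
        "drug_a": frozenset({1}),
        "drug_b": frozenset({1, 2, 3, 4}),
        "drug_c": frozenset({1, 9}),
    },
}

if __name__ == "__main__":
    drugs = ["drug_a", "drug_b", "drug_c"]
    ssp = structural_similarity_profile(TOY_FPS, drugs)
    for d in drugs:
        # One row of the SSP: this drug's averaged similarity to every drug.
        print(d, [round(ssp[d][e], 3) for e in drugs])
```

Each row of the resulting matrix is one drug's similarity profile. How the profiles of a drug pair are combined before they reach the prediction module is not specified in the abstract, so the sketch stops at the averaged matrix.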

Source journal

Journal of Bioinformatics and Computational Biology | MATHEMATICAL & COMPUTATIONAL BIOLOGY

CiteScore

2.10

Self-citation rate

0.00%

Journal description: The Journal of Bioinformatics and Computational Biology aims to publish high-quality, original research articles, expository tutorial papers, and review papers, as well as short, critical comments on technical issues associated with the analysis of cellular information. The research papers will be technical presentations of new assertions, discoveries, and tools, intended for a narrower specialist community. The tutorials, reviews, and critical commentary will be targeted at a broader readership of biologists who are interested in using computers but are not knowledgeable about scientific computing, and equally at computer scientists who have an interest in biology but are familiar with neither its current thrusts nor its language. Such carefully chosen tutorials and articles should greatly accelerate the rate of entry of these new creative scientists into the field.