Interpretable Multimodal Deep Ensemble Framework Dissecting Blood-Brain Barrier Permeability with Molecular Features.

IF 4.8; Region 2 (Chemistry); JCR Q2 (CHEMISTRY, PHYSICAL)

The Journal of Physical Chemistry Letters, pp. 5806-5819. Pub Date: 2025-06-12. Epub Date: 2025-06-03. DOI: 10.1021/acs.jpclett.5c01077

Dushuo Feng, Lulu Guan, Yunxiang Sun, Bote Qi, Yu Zou


Citations: 0

Abstract

Blood-brain barrier permeability (BBBP) prediction plays a critical role in the drug discovery process, particularly for compounds targeting the central nervous system. While machine learning (ML) has significantly advanced the prediction of BBBP, there remains an urgent need for interpretable ML models that can reveal the physicochemical principles governing BBB permeability. In this study, we propose a multimodal ML framework that integrates molecular fingerprints (Morgan, MACCS, RDK) and image features to improve BBBP prediction. The classification task (BBB-permeable vs nonpermeable) is addressed with a stacking ensemble model combining multiple base classifiers. The proposed framework demonstrates competitive predictive stability, generalization ability, and feature interpretability compared with recent approaches, under comparable evaluation settings. Beyond predictive performance, our framework incorporates Principal Component Analysis (PCA) and Shapley Additive Explanations (SHAP) analysis to highlight key fingerprint features contributing to predictions. The regression task (logBB value prediction) is tackled by a multi-input deep learning framework, incorporating a Transformer encoder for fingerprint processing, a convolutional neural network (CNN) for image feature extraction, and a Multi-Head Attention fusion mechanism to enhance feature interactions. Attention maps derived from the multimodal features reveal token-level relationships within molecular representations. This work provides an interpretable framework for modeling BBBP with enhanced transparency and mechanistic insight and lays the foundation for future studies incorporating transparent descriptors and physics-informed features.
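The SHAP analysis mentioned in the abstract rests on Shapley values, which split a prediction into additive per-feature contributions. The sketch below is only an illustration of that idea, not the authors' pipeline and not the `shap` library: it computes exact Shapley values by enumerating every feature coalition, which is feasible only for a handful of features. The scoring function `toy_score`, its coefficients, the 4-bit "fingerprint", and the all-zero baseline are hypothetical and chosen purely for demonstration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample, by enumerating all coalitions.

    predict  -- callable taking a list of feature values, returning a float
    x        -- feature values of the sample being explained
    baseline -- reference values used for features outside a coalition
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Features in the coalition keep the sample's value;
        # all others are reset to the baseline.
        masked = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(masked)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(bits):
    # Hypothetical scorer over 4 fingerprint bits (NOT a trained BBBP model):
    # two independent bits plus an interaction between bits 2 and 3.
    return 0.5 * bits[0] - 0.3 * bits[1] + 0.4 * bits[2] * bits[3]


sample = [1, 1, 1, 1]      # all four bits set
baseline = [0, 0, 0, 0]    # all bits unset as the reference point
phi = shapley_values(toy_score, sample, baseline)
```

For this toy scorer, the independent bits receive their own coefficients (about 0.5 and -0.3), and the interaction term is shared equally between bits 2 and 3 (about 0.2 each). The contributions sum to the difference between the prediction for the sample and for the baseline, which is the additivity property that makes such attributions easy to read. With real fingerprints of hundreds or thousands of bits, exact enumeration is intractable, so approximate estimators are used instead.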

Source journal: The Journal of Physical Chemistry Letters (CHEMISTRY, PHYSICAL-NANOSCIENCE & NANOTECHNOLOGY)

CiteScore: 9.60

Self-citation rate: 7.00%

Articles published: 1519

Review time: 1.6 months

Journal description: The Journal of Physical Chemistry (JPC) Letters is devoted to reporting new and original experimental and theoretical basic research of interest to physical chemists, biophysical chemists, chemical physicists, physicists, material scientists, and engineers. An important criterion for acceptance is that the paper reports a significant scientific advance and/or physical insight such that rapid publication is essential. Two issues of JPC Letters are published each month.