Muhammad Shahab, Jiazhuo Xiao, Jiaojiao Wang, Zunnan Huang
Molecular basis of BACE1 modulation revealed by machine learning, molecular simulations, and experimental validation.
International Journal of Biological Macromolecules, pages 152409. Publication date: 2026-05-06. DOI: 10.1016/j.ijbiomac.2026.152409 (https://doi.org/10.1016/j.ijbiomac.2026.152409)
Citation count: 0
Abstract
Alzheimer's disease (AD) remains one of the most prevalent and debilitating neurodegenerative disorders worldwide, with no currently available disease-modifying treatments. β-site amyloid precursor protein cleaving enzyme 1 (BACE1) catalyzes the rate-limiting step in amyloid-β (Aβ) production and represents a validated therapeutic target for AD intervention. In this study, we developed an integrated computational framework that combines machine learning-based virtual screening, molecular docking, and molecular dynamics simulations with experimental validation by CCK-8 assays and Western blot analysis to identify novel BACE1 inhibitors from a natural product library. A curated dataset of experimentally validated BACE1 inhibitors retrieved from the ChEMBL database was used to construct 36 classification models based on three molecular fingerprint representations (MACCS Keys, ECFP4, and Topological Torsion) in combination with four machine learning algorithms: Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and Extreme Gradient Boosting (XGBoost). Among all developed models, the SVM model using ECFP4 fingerprints achieved the best predictive performance, with an external test set accuracy of 0.91 and a Matthews Correlation Coefficient (MCC) of 0.78. The optimized models were subsequently applied to screen 4779 natural product compounds from the MedChemExpress library using a consensus prediction strategy. Promising hits were evaluated by molecular docking against the BACE1 crystal structure (PDB: 6PZ4), and top-ranked candidates were subjected to 200 ns molecular dynamics simulations followed by MM/GBSA binding free energy calculations. Among the identified candidates, HY-N7141 exhibited the most favorable docking score (-9.04 kcal/mol) and binding free energy (ΔG = -67.30 kcal/mol), driven predominantly by strong van der Waals interactions with key catalytic residues. Structural stability analysis confirmed that the majority of protein-ligand complexes maintained stable conformations throughout the simulations. Experimental validation in SH-SY5Y human neuroblastoma cells further assessed the effects of selected compounds on BACE1 protein expression. Collectively, these findings demonstrate the utility of integrating machine learning with structure-based approaches for accelerating the discovery of potent BACE1 inhibitors, and present several promising candidates warranting further preclinical investigation.
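The abstract reports accuracy and MCC for the classifiers and mentions a consensus prediction strategy without giving its rule. As a minimal sketch, the two metrics can be computed from confusion-matrix counts as shown below, alongside one possible consensus rule. The counts, the function names, and the "at least `min_agree` models vote active" threshold are all illustrative assumptions, not the authors' data or method.

```python
from math import sqrt
from typing import Sequence


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any row or column total is zero (the usual convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def consensus_active(votes: Sequence[int], min_agree: int) -> bool:
    """Assumed consensus rule: a compound is kept as a hit when at least
    `min_agree` models predict the active class (1)."""
    return sum(votes) >= min_agree


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not the paper's results.
    tp, tn, fp, fn = 45, 40, 5, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.2f}")

    # Hypothetical votes from three models for one compound.
    print("hit" if consensus_active([1, 1, 0], min_agree=2) else "not a hit")
```

MCC is reported alongside accuracy because it uses all four confusion-matrix cells, so it stays informative when active and inactive compounds are imbalanced in the dataset.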
About the journal:
The International Journal of Biological Macromolecules is a well-established international journal dedicated to research on the chemical and biological aspects of natural macromolecules. Focusing on proteins, macromolecular carbohydrates, glycoproteins, proteoglycans, lignins, biological poly-acids, and nucleic acids, the journal presents the latest findings in molecular structure, properties, biological activities, interactions, modifications, and functional properties. Papers must offer new and novel insights, encompassing related model systems, structural conformational studies, theoretical developments, and analytical techniques. Each paper is required to primarily focus on at least one named biological macromolecule, reflected in the title, abstract, and text.