Molecular Diversity: Latest Articles

iDCNNPred: an interpretable deep learning model for virtual screening and identification of PI3Ka inhibitors against triple-negative breast cancer.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2024-12-08 DOI: 10.1007/s11030-024-11055-9
Ravishankar Jaiswal, Girdhar Bhati, Shakil Ahmed, Mohammad Imran Siddiqi
Abstract: Triple-negative breast cancer (TNBC) lacks estrogen, progesterone, and HER2 expression, accounting for 15-20% of breast cancer cases. It is challenging due to low therapeutic response, heterogeneity, and aggressiveness. The PI3Ka isoform is a promising therapeutic target, often hyperactivated in TNBC, contributing to uncontrolled growth and cancer cell formation. We have proposed an interpretable deep convolutional neural network prediction (iDCNNPred) system using 2D molecular images to classify bioactivity and identify potential PI3Ka inhibitors. We built Custom-DCNN models and pre-trained models such as AlexNet, SqueezeNet, and VGG19 by using the Bayesian optimization algorithm, and found that our Custom-DCNN model performed better than a pre-trained model with lower complexity and memory usage. All top-performing models were screened with the Maybridge Chemical library to find predictive hit molecules. The screened molecules were further evaluated for protein-ligand interaction with molecular docking and finally 12 promising hits were shortlisted for biological validation using in-vitro PI3K inhibition studies. After biological evaluation, 4 potent molecules with different structural moieties were identified, and these molecules present new starting scaffolds for further improvement in terms of their potency and selectivity as PI3K inhibitors with the help of medicinal chemistry efforts. Furthermore, we also showed the significance of the interpretation and visualization of the model's predictions by the Grad-CAM technique, enhancing the robustness, transparency, and interpretability of the model's predictions.
The data and script files and prediction run of models used for this study to reproduce the experiment are available in the GitHub repository at https://github.com/ravishankar1307/iDCNNPred.git .
Pages: 3077-3100
Citations: 0
Explainable AI-driven prediction of APE1 inhibitors: enhancing cancer therapy with machine learning models and feature importance analysis.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2025-02-21 DOI: 10.1007/s11030-025-11133-6
Aga Basit Iqbal, Tariq Ahmad Masoodi, Ajaz A Bhat, Muzafar A Macha, Assif Assad, Syed Zubair Ahmad Shah
Abstract: The viability of cells and the integrity of the genome depend on the detection and repair of damaged DNA through intricate mechanisms. Cancer treatment employs chemotherapy or radiation therapy to eliminate neoplastic cells by causing substantial damage to their DNA. In many cases, improved DNA repair mechanisms lead to resistance to these medicines; therefore, it is essential to expand efforts to develop drugs that can sensitise cells to these treatments by inhibiting the DNA repair process. Multiple studies have demonstrated a correlation between the overexpression of Apurinic/Apyrimidinic Endonuclease (APE1), the primary mammalian enzyme responsible for excising apurinic or apyrimidinic sites in DNA, and the resistance of cells to cancer therapies; in contrast, APE1 downregulation increases cellular susceptibility to DNA-damaging agents. Thus, the effectiveness of existing therapies can be improved by promoting the targeted sensitization of cancer cells while protecting healthy cells. The current study aims to employ explainable artificial intelligence (XAI) to enhance the accuracy and reliability of machine learning models for the prediction of APE1 inhibitors. Various ML-based regression models are employed to predict the pIC50 value of different medicines. Bayesian optimization and the Permutation Feature Importance (PFI) approach are employed to determine the best hyperparameters of machine learning models and to discover the most significant features for recognizing drug candidates that target APE1 enzymes, respectively.
To acquire comprehensive elucidations for the predictive models in our research, two XAI methodologies, namely SHAP and LIME, are used. The SHAP analysis reveals that the features 'C1SP2' and 'ASP-2' are essential in influencing the model's predictions. The SHAP values demonstrate variability for features such as 'maxHBint2' and 'GATS1s,' signifying that their impact is dependent on specific instances within the dataset. The LIME study corroborates these findings, demonstrating that 'C1SP2' and 'ASP-2' are the most significant positive contributors, whereas features like 'SHCHnX,' 'nHdCH2,' and 'GATS1s' result in a decrease in the predicted values. Due to the limited sample size of the APE1 dataset, direct training on this dataset posed challenges in model generalization and reliability. To overcome this limitation, the BACE-1 dataset is leveraged for model training, enabling the ML models to learn from a more extensive and diverse chemical space. Among the tested algorithms, XGBoost demonstrated superior predictive performance, achieving R² = 0.890, MAE = 0.186, and RMSE = 0.245, significantly surpassing state-of-the-art methods, such as LightGBM and QSAR-ML, which attained R² scores of 0.798 and 0.630, respectively.
These results highlight the robustness of our approach, demonstrating its enhanced generalization capability and superior predictive accuracy compared to existing methodologies.
Pages: 3371-3390
Citations: 0
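The abstract above reports R², MAE, and RMSE for regression on pIC50 values. As a reading aid, here is a minimal sketch of how these three metrics are conventionally computed. The function names and the toy numbers are illustrative assumptions and are not taken from the paper; the pIC50 helper assumes the usual definition, the negative base-10 logarithm of IC50 in molar units.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in molar)."""
    return -math.log10(ic50_nm * 1e-9)


# Toy values invented for illustration only (not data from the study).
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} MAE={mae:.3f} RMSE={rmse:.3f}")
```

Because RMSE squares each residual before averaging, it is never smaller than MAE on the same predictions, which is consistent with the reported pair (MAE = 0.186, RMSE = 0.245).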
Integrating traditional QSAR and read-across-based regression models for predicting potential anti-leishmanial azole compounds.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2024-12-10 DOI: 10.1007/s11030-024-11070-w
Rajat Nandi, Anupama Sharma, Ananya Priya, Diwakar Kumar
Abstract: Leishmaniasis, a neglected tropical disease caused by various Leishmania species, poses a significant global health challenge, especially in resource-limited regions. Visceral Leishmaniasis (VL) stands out among its severe manifestations, and current drug therapies have limitations, necessitating the exploration of new, cost-effective treatments. This study utilized a comprehensive computational workflow, integrating traditional 2D-QSAR, q-RASAR, and molecular docking to identify novel anti-leishmanial compounds, with a focus on Glycyl-tRNA Synthetase (LdGlyRS) as a promising drug target. A feature selection process combining Genetic Function Approximation (GFA)-Lasso with Multiple Linear Regression (MLR) was used to characterize 99 azole compounds across ten structural classes. The baseline MLR model (MOD1), containing seven simple and interpretable 2D features, exhibited robust predictive capabilities, achieving an R²(train) value of 0.82 and an R²(test) value of 0.87. To further enhance prediction accuracy, three qualified single models (two MLR and one q-RASAR) were used to construct three consensus models (CMs), with CM2 (MAE(test) = 0.127) demonstrating significantly higher prediction accuracy for test compounds than the MOD1. Subsequently, Support Vector Regression (SVR) and Boosting yielded 0.88 (R²(train)), 0.86 (R²(test)), 0.92 (R²(train)), and 0.82 (R²(test)), respectively.
Molecular docking highlighted interactions of potent azoles within the QSAR dataset with critical residues in the LdGlyRS active site (Arg226 and Glu350), emphasizing their inhibitory potential. Furthermore, the pIC50 values of an accurate external set of 2000 azole compounds from the ZINC20 database were simultaneously predicted by CM2 + SVR + Boosting models and docked against the LdGlyRS, which identified Bazedoxifene, Talmetacin, Pyrvinium, Enzastaurin as leading FDA candidates, whereas three novel compounds with the database code ZINC000001153734, ZINC000011934652, and ZINC000009942262 displayed stable docked interactions and favourable ADMET assessments. Subsequently, Molecular Dynamics (MD) simulations for 100 ns were conducted to validate the findings further, offering enhanced insights into the stability and dynamic behaviour of the ligand-protein complexes. The integrated approach of this study underscores the efficacy of 2D-QSAR modelling. It identifies LdGlyRS as a promising leishmaniasis target, offering a robust strategy for discovering and optimizing anti-leishmanial compounds to address the critical need for improved treatments.
Pages: 3207-3231
Citations: 0
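The abstract above describes combining several single models into consensus models and comparing them by test-set MAE. The sketch below illustrates the general idea with a plain unweighted average of per-compound predictions. This averaging rule is an assumption chosen for simplicity; the abstract does not say how CM2 combines its models, so the paper's scheme may differ. All names and numbers are hypothetical.

```python
def consensus_prediction(model_predictions):
    """Average per-compound predictions across models (unweighted mean).

    model_predictions is a list of equal-length lists, one list per model.
    """
    n_models = len(model_predictions)
    if n_models == 0:
        raise ValueError("need at least one model")
    return [sum(values) / n_models for values in zip(*model_predictions)]


def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented toy pIC50 values (not from the study): two models that err in
# opposite directions, so that averaging cancels part of the error.
observed = [5.0, 6.0, 7.0]
model_a = [5.4, 6.2, 6.6]
model_b = [4.8, 5.6, 7.2]

consensus = consensus_prediction([model_a, model_b])
for name, preds in [("A", model_a), ("B", model_b), ("consensus", consensus)]:
    print(name, round(mean_absolute_error(observed, preds), 3))
```

In this toy case the consensus has a lower MAE than either single model, but that only happens when the individual errors partly offset each other; averaging models that share the same bias brings no such gain.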
A bibliometric analysis of the Cheminformatics/QSAR literature (2000-2023) for predictive modeling in data science using the SCOPUS database.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2024-12-05 DOI: 10.1007/s11030-024-11056-8
Arkaprava Banerjee, Kunal Roy, Paola Gramatica
Abstract: A bibliometric analysis of the Cheminformatics/QSAR articles published in the present century (2000-2023) is presented based on a SCOPUS search made in October 2024 using a given set of search criteria. The obtained results of 52,415 documents against the specific query are analyzed based on the number of documents per year, contributions of different countries and Institutes in Cheminformatics/QSAR publications, the contributions of researchers based on the number of documents, appearance in the top-cited articles, h-index, composite c-score (ns), and the newly introduced q-score. Finally, a list of the top 50 Cheminformatics/QSAR researchers is presented. An analysis is also made for the content of the top-cited articles during the period 2000-2023 in comparison to those before 2000 to capture the trend of changes in the Cheminformatics/QSAR research. The limiting factors of any bibliometric analysis are also briefly presented.
Pages: 3703-3715
Citations: 0
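Among the indicators listed in the abstract above, the h-index has a simple standard definition: the largest h such that an author has at least h papers with at least h citations each. A minimal sketch follows; the function name and the citation counts are illustrative, and the paper's c-score and q-score are not reproduced here because the abstract does not define them.

```python
def h_index(citation_counts):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank
        else:
            # Counts are sorted in descending order, so no later paper can qualify.
            break
    return h


# Invented citation counts for one hypothetical author.
print(h_index([10, 8, 5, 4, 3]))
```

Sorting once and scanning in rank order keeps the computation at O(n log n), which is ample for per-author publication lists.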
AI-DPAPT: a machine learning framework for predicting PROTAC activity.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2024-10-19 DOI: 10.1007/s11030-024-11011-7
Amr S Abouzied, Bahaa Alshammari, Hayam Kari, Bader Huwaimel, Saad Alqarni, Shaymaa E Kassab
Abstract: Proteolysis Targeting Chimeras are part of targeted protein degradation (TPD) techniques, which are significant for pharmacological and therapy development. Small-molecule interaction with the targeted protein is a complicated endeavor and a challenge to predict the proteins accurately. This study used machine learning algorithms and molecular fingerprinting techniques to build an AI-powered PROTAC Activity Prediction Tool that could predict PROTAC activity by examining chemical structures. The chemical structures of a diverse set of PROTAC drugs and their corresponding activities are selected as a dataset for training the tool. The processes used in this study included data preparation, feature extraction, and model training. Further, evaluation was done for the performance of the various classifiers, such as AdaBoost, Support Vector Machine, Random Forest, Gradient Boosting, and Multi-Layer Perceptron. The findings show that the methods selected here depict accurate PROTAC activities. All the models in this study showed an ROC curve better than 0.9, while the random forest on the test set of the AI-DPAPT had an area under the curve score of 0.97, thus showing accurate results. Furthermore, the study revealed significant insights into the molecular features that can influence the functions of the PROTAC. These findings can potentially increase the understanding of the structure-activity correlations involved in the TPD. Overall, the investigation contributes to computational drug development by introducing this platform powered by artificial intelligence that predicts the function of PROTAC.
In addition, it sped up the processes of identifying and improving previously unknown medications. The AI-DPAPT platform can be accessed online using a web server at https://ai-protac.streamlit.app/ .
Pages: 2995-3007
Citations: 0
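The abstract above summarizes classifier quality with the area under the ROC curve (AUC). One standard way to read that number is as the probability that a randomly chosen active compound receives a higher score than a randomly chosen inactive one. The sketch below computes AUC directly from that pairwise definition; the function name, labels, and scores are hypothetical and unrelated to the AI-DPAPT data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (active) or 0 (inactive); ties in score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented toy predictions: one active (score 0.4) is ranked below one inactive (0.5).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(round(roc_auc(labels, scores), 3))
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; rank-based formulations give the same value more efficiently on large test sets.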
A 4D tensor-enhanced multi-dimensional convolutional neural network for accurate prediction of protein-ligand binding affinity.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2024-12-23 DOI: 10.1007/s11030-024-11044-y
Dingfang Huang, Yu Wang, Yiming Sun, Wenhao Ji, Qing Zhang, Yunya Jiang, Haodi Qiu, Haichun Liu, Tao Lu, Xian Wei, Yadong Chen, Yanmin Zhang
Abstract: Protein-ligand interactions are the molecular basis of many important cellular activities, such as gene regulation, cell metabolism, and signal transduction. Protein-ligand binding affinity is a crucial metric of the strength of the interaction between the two, and accurate prediction of its binding affinity is essential for discovering drugs' new uses. So far, although many predictive models based on machine learning and deep learning have been reported, most of the models mainly focus on one-dimensional sequence and two-dimensional structural characteristics of proteins and ligands, but fail to deeply explore the detailed interaction information between proteins and ligand atoms in the binding pocket region of three-dimensional space. In this study, we introduced a novel 4D tensor feature to capture key interactions within the binding pocket and developed a three-dimensional convolutional neural network (CNN) model based on this feature. Using ten-fold cross-validation, we identified the optimal parameter combination and pocket size. Additionally, we employed feature engineering to extract features across multiple dimensions, including one-dimensional sequences, two-dimensional structures of the ligand and protein, and three-dimensional interaction features between them. We proposed an efficient protein-ligand binding affinity prediction model MCDTA (multi-dimensional convolutional drug-target affinity), built on a multi-dimensional convolutional neural network framework.
Feature ablation experiments revealed that the 4D tensor feature had the most significant impact on model performance. MCDTA performed exceptionally well on the PDBbind v.2020 dataset, achieving an RMSE of 1.231 and a PCC of 0.823. In comparative experiments, it outperformed five other mainstream binding affinity prediction models, with an RMSE of 1.349 and a PCC of 0.795. Moreover, MCDTA demonstrated strong generalization ability and practical screening performance across multiple benchmark datasets, highlighting its reliability and accuracy in predicting protein-ligand binding affinity. The code for MCDTA is available at https://github.com/dfhuang-AI/MCDTA .
Pages: 3041-3058
Citations: 0
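The abstract above reports a Pearson correlation coefficient (PCC) alongside RMSE. The two capture different things: PCC measures how well predictions track the trend of the measured affinities, while RMSE measures absolute error. A minimal sketch of the textbook PCC formula follows; the function name and toy values are illustrative and are not taken from MCDTA or PDBbind.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented affinities: predictions shifted by a constant +1.0 keep a perfect
# correlation even though every individual prediction is off by one unit.
measured = [5.0, 6.0, 7.0]
shifted = [6.0, 7.0, 8.0]
print(round(pearson_r(measured, shifted), 3))
```

The shifted example shows why a high PCC alone does not imply accurate predictions, and why the two metrics are usually reported together.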
Generative adversarial network (GAN) model-based design of potent SARS-CoV-2 Mpro inhibitors using the electron density of ligands and 3D binding pockets: insights from molecular docking, dynamics simulation, and MM-GBSA analysis.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2024-11-30 DOI: 10.1007/s11030-024-11047-9
Annesha Chakraborty, Vignesh Krishnan, Subbiah Thamotharan
Abstract: Deep learning-based generative adversarial network (GAN) frameworks have recently been developed to expedite the drug discovery process. These models generate novel molecules from scratch and validate them through molecular docking simulation to identify the most promising candidates for a given drug target. In this study, the SARS-CoV-2 main protease (Mpro) was selected as the drug target. Two distinct GAN algorithms were employed to generate novel small molecules. One approach utilized experimental electron density (ED-based) data of ligands for training to generate drug-like molecules, while the second approach leveraged the target binding pocket to capture spatial and bonding relationships between atoms within the binding pockets. The ED-based approach generated approximately 26,000 molecules, whereas the binding pocket-based method produced around 100 molecules. These generated molecules were subsequently ranked based on molecular docking results using the glide XP score (both flexible and rigid docking) and AutoDock Vina. To identify the most potent GAN-derived molecules, molecular docking was also performed on co-crystallized inhibitor molecules of Mpro. The six most promising molecules from these GAN approaches were further evaluated for stability, interactions, and MM-GBSA binding free energy through molecular dynamics simulations. This analysis led to the identification of four potent Mpro inhibitor molecules, all featuring a 2-benzyl-6-bromophenol scaffold.
The binding free energies of these compounds were compared with those of other Mpro inhibitors, revealing that our compounds demonstrated better affinity for Mpro than some broad-spectrum protease inhibitors. The dynamic cross-correlation matrix plot indicated strongly correlated and anti-correlated regions, potentially linked to ligand binding.
Pages: 3059-3075
Citations: 0
Titania: an integrated tool for in silico molecular property prediction and NAM-based modeling.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2025-04-23 DOI: 10.1007/s11030-025-11196-5
Nikoletta-Maria Koutroumpa, Maria Antoniou, Dimitra-Danai Varsou, Konstantinos D Papavasileiou, Nikolaos K Sidiropoulos, Christoforos Kyprianou, Andreas Tsoumanis, Haralambos Sarimveis, Iseult Lynch, Georgia Melagraki, Antreas Afantitis
Abstract: Advances in drug discovery and material design rely heavily on in silico analysis of extensive compound datasets and accurate assessment of their properties and activities through computational methods. Efficient and reliable prediction of molecular properties is crucial for rational compound design in the chemical industry. To address this need, we have developed predictive models for nine key properties, including the octanol/water partition coefficient, water solubility, experimental hydration free energy in water, vapor pressure, boiling point, cytotoxicity, mutagenicity, blood-brain barrier permeability, and bioconcentration factor. These models have demonstrated high predictive accuracy and have undergone thorough validation in accordance with OECD test guidelines. The models are seamlessly integrated into the Enalos Cloud Platform through Titania ( https://enaloscloud.novamechanics.com/EnalosWebApps/titania/ ), a comprehensive web-based application designed to democratize access to advanced computational tools. Titania features an intuitive, user-friendly interface, allowing researchers, regardless of computational expertise, to easily employ models for property prediction of novel compounds. The platform enables informed decision-making and supports innovation in drug discovery and material design.
We aspire for this tool to become a valuable resource for the scientific community, enhancing both the efficiency and accuracy of property and toxicity predictions.
Pages: 3555-3573
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12245999/pdf/
Citations: 0
Identification and validation of oxidative stress-related diagnostic markers for recurrent pregnancy loss: insights from machine learning and molecular analysis.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2024-09-03 DOI: 10.1007/s11030-024-10947-0
Hui Hu, Li Yu, Yating Cheng, Yao Xiong, Daoxi Qi, Boyu Li, Xiaokang Zhang, Fang Zheng
Abstract: It has been recognized that oxidative stress (OS) is implicated in the etiology of recurrent pregnancy loss (RPL), yet the biomarkers reflecting oxidative stress in association with RPL remain scarce. The dataset GSE165004 was retrieved from the Gene Expression Omnibus (GEO) database. From the GeneCards database, a compendium of 789 oxidative stress-related genes (OSRGs) was compiled. By intersecting differentially expressed genes (DEGs) in normal and RPL samples with OSRGs, differentially expressed OSRGs (DE-OSRGs) were identified. In addition, four machine learning algorithms were employed for the selection of diagnostic markers for RPL. The Receiver Operating Characteristic (ROC) curves for these genes were generated and a predictive nomogram for the diagnostic markers was established. The functions and pathways associated with the diagnostic markers were elucidated, and the correlations between immune cells and diagnostic markers were examined. Potential therapeutics targeting the diagnostic markers were proposed based on data from the Comparative Toxicogenomics Database and ClinicalTrials.gov. The candidate biomarker genes from the four models were further validated in RPL tissue samples using RT-PCR and immunohistochemistry. A set of 20 DE-OSRGs was identified, with 4 genes (KRAS, C2orf69, CYP17A1, and UCP3) being recognized by machine learning algorithms as diagnostic markers exhibiting robust diagnostic capabilities. The nomogram constructed demonstrated favorable predictive accuracy.
Pathways including ribosome, peroxisome, Parkinson's disease, oxidative phosphorylation, Huntington's disease, and Alzheimer's disease were co-enriched by KRAS, C2orf69, and CYP17A1. Cell chemotaxis terms were commonly enriched by all four diagnostic markers. Significant differences in the abundance of five cell types, namely eosinophils, monocytes, natural killer cells, regulatory T cells, and T follicular helper cells, were observed between normal and RPL samples. A total of 180 drugs were predicted to target the diagnostic markers, including C544151, D014635, and CYP17A1. In the validation cohort of RPL patients, the LASSO model demonstrated superiority over other models. The expression levels of KRAS, C2orf69, and CYP17A1 were significantly reduced in RPL, while UCP3 levels were elevated, indicating their suitability as molecular markers for RPL. Four oxidative stress-related diagnostic markers (KRAS, C2orf69, CYP17A1, and UCP3) have been proposed to diagnose and potentially treat RPL.
Pages: 2881-2897
Citations: 0
Dual inhibition of AChE and MAO-B in Alzheimer's disease: machine learning approaches and model interpretations.
IF 3.9, Zone 2, Chemistry
Molecular Diversity Pub Date : 2025-08-01 Epub Date: 2025-01-21 DOI: 10.1007/s11030-024-11061-x
Qinghe Hou, Yan Li
Abstract: Alzheimer's disease (AD) is one of the most prevalent neurodegenerative diseases. Given the multifactorial pathophysiology of AD, monotargeted agents can only alleviate symptoms but not cure AD. Acetylcholinesterase (AChE) and monoamine oxidase B (MAO-B) are two key targets in the treatment of AD, and molecules that inhibit both targets are considered a promising avenue for developing more effective AD therapies. In the present work, a dual-inhibition dataset containing 449 molecules was established. On this basis, five machine learning algorithms (KNN, SVM, RF, GBDT, and LGBM) were combined with four fingerprints (MACCS, ECFP4, RDKitFP, PubChemFP) and DRAGON descriptors to develop 25 classification models, among which GBDT paired with ECFP4 and RF paired with PubChemFP achieved the same best performance across multiple metrics (Accuracy = 0.92, F1 Score = 0.94, MCC = 0.81). Moreover, based on the curated bioactivity datasets of AChE and MAO-B, regression models were developed to predict pIC50 values. For the AChE inhibition task, GBDT demonstrated the best performance (RMSE = 0.683, MAE = 0.500, R² = 0.721). The SVM algorithm emerged as the most effective for MAO-B inhibition (RMSE = 0.668, MAE = 0.507, R² = 0.675). The SHAP algorithm was used to interpret the optimal models, identifying and analyzing the key substructures and properties for both dual-target and single-target inhibitors.
Moreover, molecular docking provided further insight into the potential mechanism and structure-activity relationships (SAR) of dual-target inhibition.
Pages: 3113-3130
Citations: 0
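The abstract above reports accuracy, F1 score, and the Matthews correlation coefficient (MCC) for 25 classification models. The sketch below shows the standard confusion-matrix formulas for the three metrics. It also shows how the count of 25 presumably arises, as five algorithms crossed with five feature sets (four fingerprints plus DRAGON descriptors). That reading of the count is an inference from the abstract, and the function name and confusion-matrix counts are hypothetical.

```python
import math
from itertools import product


def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, F1, MCC) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return accuracy, f1, mcc


# Names taken from the abstract; treating DRAGON descriptors as a fifth
# feature set is an assumption made to reproduce the stated count of 25.
algorithms = ["KNN", "SVM", "RF", "GBDT", "LGBM"]
feature_sets = ["MACCS", "ECFP4", "RDKitFP", "PubChemFP", "DRAGON"]
model_grid = list(product(algorithms, feature_sets))
print(len(model_grid))  # 25

# Invented confusion-matrix counts for one hypothetical model.
acc, f1, mcc = classification_metrics(tp=45, fp=5, tn=40, fn=10)
print(f"accuracy={acc:.2f} F1={f1:.2f} MCC={mcc:.2f}")
```

MCC uses all four cells of the confusion matrix, so it is less flattering than accuracy or F1 on imbalanced data, which may be why the reported MCC (0.81) sits below the accuracy (0.92) and F1 (0.94) of the same models.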