{"title":"NRGCNMDA: Microbe-Drug Association Prediction Based on Residual Graph Convolutional Networks and Conditional Random Fields.","authors":"Xiaoxin Du, Jingwei Li, Bo Wang, Jianfei Zhang, Tongxuan Wang, Junqi Wang","doi":"10.1007/s12539-024-00678-z","DOIUrl":"https://doi.org/10.1007/s12539-024-00678-z","url":null,"abstract":"<p><p>The process of discovering new drugs related to microbes through traditional biological methods is lengthy and costly. In response to these issues, a new computational model (NRGCNMDA) is proposed to predict microbe-drug associations. First, Node2vec is used to extract potential associations between microorganisms and drugs, and a heterogeneous network of microbes and drugs is constructed. Then, a Graph Convolutional Network incorporating a fusion residual network mechanism (REGCN) is utilized to learn meaningful high-order similarity features. In addition, conditional random fields (CRF) are applied to ensure that microbes and drugs have similar feature embeddings. Finally, unobserved microbe-drug associations are scored based on combined embeddings. The experimental findings demonstrate that the NRGCNMDA approach outperforms several existing deep learning methods, and its AUC and AUPR values are 95.16% and 93.02%, respectively. 
The case study demonstrates that NRGCNMDA accurately predicts drugs associated with Enterococcus faecalis and Listeria monocytogenes, as well as microbes associated with ibuprofen and tetracycline.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142948365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Domain Adaptive Interpretable Substructure-Aware Graph Attention Network for Drug-Drug Interaction Prediction.","authors":"Qi Zhang, Yuxiao Wei, Liwei Liu","doi":"10.1007/s12539-024-00680-5","DOIUrl":"https://doi.org/10.1007/s12539-024-00680-5","url":null,"abstract":"<p><p>Accurate prediction of drug-drug interaction (DDI) is essential to improve clinical efficacy, avoid adverse effects of drug combination therapy, and enhance drug safety. Recently researchers have developed several computer-aided methods for DDI prediction. However, these methods lack the substructural features that are critical to drug interactions and are not effective in generalizing across domains and different distribution data. In this work, we present SAGAN, a domain adaptive interpretable substructure-aware graph attention network for DDI prediction. Based on attention mechanism and unsupervised clustering algorithm, we propose a new substructure segmentation method, which segments the drug molecule into multiple substructures, learns the mechanism of drug interaction from the perspective of interaction, and identifies important interaction regions between drugs. To enhance the generalization ability of the model, we improve and apply a conditional domain adversarial network to achieve cross-domain generalization by alternately optimizing the cross-entropy loss on the source domain and the adversarial loss of the domain discriminator. We evaluate and compare SAGAN with the state-of-the-art DDI prediction model on four real-world datasets for both in-domain and cross-domain scenarios, and show that SAGAN achieves the best overall performance. Moreover, the visualization results of the model show that SAGAN has achieved pharmacologically significant substructure extraction, which can help drug developers screen for some undiscovered local interaction sites, and provide important information for further drug structure optimization. 
The codes and datasets are available online at https://github.com/wyx2012/SAGAN .</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142948364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MTGGF: A Metabolism Type-Aware Graph Generative Model for Molecular Metabolite Prediction.","authors":"Peng-Cheng Zhao, Xue-Xin Wei, Qiong Wang, Hao-Yang Wang, Bing-Xue Du, Jia-Ning Li, Bei Zhu, Hui Yu, Jian-Yu Shi","doi":"10.1007/s12539-024-00681-4","DOIUrl":"https://doi.org/10.1007/s12539-024-00681-4","url":null,"abstract":"<p><p>Metabolism in vivo turns small molecules (e.g., drugs) into metabolites (new molecules), which can raise unexpected safety issues in drug development. However, it is costly to determine metabolites by biological assays. Recent computational methods offer a promising alternative by predicting possible metabolites. Rule-based methods utilize predefined reaction-derived rules to infer metabolites. However, they are powerless against new metabolic reaction patterns. In contrast, rule-free methods leverage sequence-to-sequence machine translation to generate metabolites. Nevertheless, they characterize molecular structures insufficiently and offer weak interpretability. To address these issues in rule-free methods, this manuscript proposes a novel metabolism type-aware graph generative framework (MTGGF) for molecular metabolite prediction. It contains a two-stage learning process, including pre-training on a large general chemical reaction dataset and fine-tuning on three smaller type-specific metabolic reaction datasets. Its core, an elaborate graph-to-graph generative model, treats both atoms and bonds as bipartite vertices, and molecules as bipartite graphs, such that it can embed rich information about molecular structures and ensure the integrity of generated metabolite structures. The comparison with state-of-the-art methods demonstrates its superiority. Furthermore, the ablation study validates the contributions of its two graph encoding components and its reaction-type-specific fine-tuning models. 
More importantly, based on interactive attention between a molecule and its metabolites, the case studies on five approved drugs reveal that there exist crucial substructures specific to metabolism types. It is anticipated that this framework can boost the risk evaluation of drug metabolites. The codes are available at https://github.com/zpczaizheli/Metabolite .</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142931780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UltraNet: Unleashing the Power of Simplicity for Accurate Medical Image Segmentation.","authors":"Ziyi Han, Yuanyuan Zhang, Lin Liu, Yulin Zhang","doi":"10.1007/s12539-024-00682-3","DOIUrl":"https://doi.org/10.1007/s12539-024-00682-3","url":null,"abstract":"<p><p>The need for accurate and rapid medical image segmentation in point-of-care diagnosis has become increasingly urgent in recent years. Although some pioneering work has applied complex modules to improve segmentation performance, the resulting models are often heavy, which is not practical for the modern clinical setting of point-of-care diagnosis. To address these challenges, we propose UltraNet, a state-of-the-art lightweight model that achieves competitive performance in segmenting multiple parts of medical images with the fewest parameters and the lowest computational complexity. To extract a sufficient amount of feature information and replace cumbersome modules, the Shallow Focus Float Block (ShalFoFo) and the Dual-stream Synergy Feature Extraction (DuSem) are proposed at the shallow and deep levels, respectively. ShalFoFo is designed to capture finer-grained features containing more pixels, while DuSem is capable of extracting distinct deep semantic features from two different perspectives. By jointly utilizing them, the accuracy and stability of UltraNet segmentation results are enhanced. To evaluate performance, UltraNet's generalization ability was assessed on five datasets with different tasks. Compared to UNet, UltraNet reduces the parameters and computational complexity by factors of 46 and 26, respectively. Experimental results demonstrate that UltraNet achieves a state-of-the-art balance among parameters, computational complexity, and segmentation performance. 
Codes are available at https://github.com/Ziii1/UltraNet .</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142894251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BiGM-lncLoc: Bi-level Multi-Graph Meta-Learning for Predicting Cell-Specific Long Noncoding RNAs Subcellular Localization.","authors":"Xi Deng, Lin Liu","doi":"10.1007/s12539-024-00679-y","DOIUrl":"https://doi.org/10.1007/s12539-024-00679-y","url":null,"abstract":"<p><p>The precise spatiotemporal expression of long noncoding RNAs (lncRNAs) plays a pivotal role in biological regulation, and aberrant expression of lncRNAs in different subcellular localizations has been intricately linked to the onset and progression of a variety of cancers. Computational methods provide effective means for predicting lncRNA subcellular localization, but current studies either ignore cell line and tissue specificity or the correlation and shared information among cell lines. In this study, we propose a novel approach, BiGM-lncLoc, treating the prediction of lncRNA subcellular localization across cell lines as a multi-graph meta-learning task. Our investigation involves two categories of data: the localization data of nucleotide sequences in different cell lines and cell line expression data. BiGM-lncLoc comprises a cell line-specific optimization network learning specific knowledge from cell line expression data and a graph neural network optimized across cell lines. Subsequently, the specific and shared knowledge acquired through bi-level optimization is applied to a new cell-line prediction task without the need for re-training or fine-tuning. Additionally, through key feature analysis of the impact of different nucleotide combinations on the model, we confirm the necessity of cell line-specific studies based on correlation analysis. Finally, experiments conducted on various cell lines with different data sizes indicate that BiGM-lncLoc outperforms other methods in terms of prediction accuracy, with an average accuracy of 97.7%. 
After removing overlapping samples to ensure data independence for each cell line, the accuracy ranged from 82.4% to 94.7%, still surpassing existing models. Our code can be found at https://github.com/BioCL1/BiGM-lncLoc .</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142894249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EnDM-CPP: A Multi-view Explainable Framework Based on Deep Learning and Machine Learning for Identifying Cell-Penetrating Peptides with Transformers and Analyzing Sequence Information.","authors":"Lun Zhu, Zehua Chen, Sen Yang","doi":"10.1007/s12539-024-00673-4","DOIUrl":"https://doi.org/10.1007/s12539-024-00673-4","url":null,"abstract":"<p><p>Cell-Penetrating Peptides (CPPs) are a crucial carrier for drug delivery. Since the process of synthesizing new CPPs in the laboratory is both time- and resource-consuming, computational methods to predict potential CPPs can be used to find CPPs to enhance the development of CPPs in therapy. In this study, EnDM-CPP is proposed, which combines machine learning algorithms (SVM and CatBoost) with convolutional neural networks (CNN and TextCNN). For dataset construction, three previous CPP benchmark datasets, including CPPsite 2.0, MLCPP 2.0, and CPP924, are merged to improve the diversity and reduce homology. For feature generation, two language model-based features obtained from the Transformer architecture, including ProtT5 and ESM-2, are employed in CNN and TextCNN. Additionally, sequence features, such as CPRS, Hybrid PseAAC, KSC, etc., are input to SVM and CatBoost. Based on the result of each predictor, Logistic Regression (LR) is built to predict the final decision. The experiment results indicate that ProtT5 and ESM-2 fusion features significantly contribute to predicting CPP and that combining employed features and models demonstrates better association. On an independent test dataset comparison, EnDM-CPP achieved an accuracy of 0.9495 and a Matthews correlation coefficient of 0.9008 with an improvement of 2.23%-9.48% and 4.32%-19.02%, respectively, compared with other state-of-the-art methods. 
Code and data are available at https://github.com/tudou1231/EnDM-CPP.git .</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142876973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HiSVision: A Method for Detecting Large-Scale Structural Variations Based on Hi-C Data and Detection Transformer.","authors":"Haixia Zhai, Chengyao Dong, Tao Wang, Junwei Luo","doi":"10.1007/s12539-024-00677-0","DOIUrl":"https://doi.org/10.1007/s12539-024-00677-0","url":null,"abstract":"<p><p>Structural variation (SV) is an important component of the diversity of the human genome. Many studies have shown that SV has a significant impact on human disease and is strongly associated with the development of cancer. In recent years, the Hi-C sequencing technique has been shown to be useful for detecting large-scale SVs, and several methods have been proposed for identifying SVs from Hi-C data. However, due to the complexity of the 3D genome structure, accurately identifying SVs from the Hi-C contact matrix remains a challenging task. Here, we present HiSVision, a method for identifying large-scale SVs from Hi-C data using a detection transformer framework. Inspired by object detection networks, we transform the Hi-C contact matrix into images, then identify candidate SV regions in the images with a detection transformer, and finally filter SVs based on features around the breakpoints. Experimental results show that HiSVision outperforms existing methods in terms of precision and F1 score on cancer cell lines and simulated datasets. 
The source code and data are available from https://github.com/dcy99/HiSVision .</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142876974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of Multi-functional Therapeutic Peptides Based on Prototypical Supervised Contrastive Learning.","authors":"Sitong Niu, Henghui Fan, Fei Wang, Xiaomei Yang, Junfeng Xia","doi":"10.1007/s12539-024-00674-3","DOIUrl":"https://doi.org/10.1007/s12539-024-00674-3","url":null,"abstract":"<p><p>High-throughput sequencing has exponentially increased peptide sequences, necessitating a computational method to identify multi-functional therapeutic peptides (MFTP) from their sequences. However, existing computational methods are challenged by class imbalance, particularly in learning effective sequence representations. To address this, we propose PSCFA, a prototypical supervised contrastive learning with a feature augmentation method for MFTP prediction. We employ a two-stage training scheme to train the feature extractor and the classifier respectively, underpinned by the principle that better feature representation boosts classification accuracy. In the first stage, we utilize a prototypical supervised contrastive learning strategy to enhance the uniformity of feature space distribution, ensuring that the characteristics of samples within the same category are tightly clustered while those from different categories are more dispersed. In the second stage, a feature augmentation strategy that focuses on infrequent labels (tail labels) is used to refine the learning process of the classifier. We use a prototype-based variational autoencoder to capture semantic links among common labels (head labels) and their prototypes. This knowledge is then transferred to tail labels, generating enhanced features for classifier training. 
The experiments prove that the PSCFA method significantly outperforms existing methods for MFTP prediction, making a significant advancement in therapeutic peptide identification.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142876975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SpatialCVGAE: Consensus Clustering Improves Spatial Domain Identification of Spatial Transcriptomics Using VGAE.","authors":"Jinyun Niu, Fangfang Zhu, Donghai Fang, Wenwen Min","doi":"10.1007/s12539-024-00676-1","DOIUrl":"https://doi.org/10.1007/s12539-024-00676-1","url":null,"abstract":"<p><p>The advent of spatially resolved transcriptomics (SRT) has provided critical insights into the spatial context of tissue microenvironments. Spatial clustering is a fundamental aspect of analyzing spatial transcriptomics data. However, spatial clustering methods often suffer from instability caused by the sparsity and high noise in the SRT data. To address this challenge, we propose SpatialCVGAE, a consensus clustering framework designed for SRT data analysis. SpatialCVGAE adopts the expression of high-variable genes from different dimensions along with multiple spatial graphs as inputs to variational graph autoencoders (VGAEs), learning multiple latent representations for clustering. These clustering results are then integrated using a consensus clustering approach, which enhances the model's stability and robustness by combining multiple clustering outcomes. Experiments demonstrate that SpatialCVGAE effectively mitigates the instability typically associated with non-ensemble deep learning methods, significantly improving both the stability and accuracy of the results. Compared to previous non-ensemble methods in representation learning and post-processing, our method fully leverages the diversity of multiple representations to accurately identify spatial domains, showing superior robustness and adaptability. 
All code and public datasets used in this paper are available at https://github.com/wenwenmin/SpatialCVGAE .</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142828461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Artificial Intelligence-Based Classification of CT Images Using a Hybrid SpinalZFNet.","authors":"Faiqa Maqsood, Wang Zhenfei, Muhammad Mumtaz Ali, Baozhi Qiu, Naveed Ur Rehman, Fahad Sabah, Tahir Mahmood, Irfanud Din, Raheem Sarwar","doi":"10.1007/s12539-024-00649-4","DOIUrl":"10.1007/s12539-024-00649-4","url":null,"abstract":"<p><p>The kidney is an abdominal organ in the human body that filters excess water and waste from the blood. Kidney diseases generally occur due to changes in certain supplements, medical conditions, obesity, and diet, which impair kidney function and ultimately lead to complications such as chronic kidney disease, kidney failure, and other renal disorders. Combining patient metadata with computed tomography (CT) images is essential for the accurate and timely diagnosis of such complications. Deep Neural Networks (DNNs) have transformed medical fields by providing high accuracy in complex tasks. However, the high computational cost of these models is a significant challenge, particularly in real-time applications. This paper proposes SpinalZFNet, a hybrid deep learning approach that integrates the architectural strengths of Spinal Network (SpinalNet) with the feature extraction capabilities of Zeiler and Fergus Network (ZFNet) to classify kidney disease accurately using CT images. This unique combination enhances feature analysis, significantly improving classification accuracy while reducing the computational overhead. First, the acquired CT images are pre-processed using a median filter, and the pre-processed images are segmented using Efficient Neural Network (ENet). Next, the images are augmented, and different features are extracted from the augmented CT images. Finally, the proposed SpinalZFNet model uses the extracted features to classify each image as normal, tumor, cyst, or stone. 
SpinalZFNet outperformed other models, with 99.9% sensitivity, 99.5% specificity, 99.6% precision, 99.8% accuracy, and a 99.7% F1-score in classifying kidney disease.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"907-925"},"PeriodicalIF":3.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11512893/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142017327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}