Jinyun Niu, Fangfang Zhu, Donghai Fang, Wenwen Min
{"title":"SpatialCVGAE: Consensus Clustering Improves Spatial Domain Identification of Spatial Transcriptomics Using VGAE.","authors":"Jinyun Niu, Fangfang Zhu, Donghai Fang, Wenwen Min","doi":"10.1007/s12539-024-00676-1","DOIUrl":"https://doi.org/10.1007/s12539-024-00676-1","url":null,"abstract":"<p><p>The advent of spatially resolved transcriptomics (SRT) has provided critical insights into the spatial context of tissue microenvironments. Spatial clustering is a fundamental aspect of analyzing spatial transcriptomics data. However, spatial clustering methods often suffer from instability caused by the sparsity and high noise in the SRT data. To address this challenge, we propose SpatialCVGAE, a consensus clustering framework designed for SRT data analysis. SpatialCVGAE adopts the expression of high-variable genes from different dimensions along with multiple spatial graphs as inputs to variational graph autoencoders (VGAEs), learning multiple latent representations for clustering. These clustering results are then integrated using a consensus clustering approach, which enhances the model's stability and robustness by combining multiple clustering outcomes. Experiments demonstrate that SpatialCVGAE effectively mitigates the instability typically associated with non-ensemble deep learning methods, significantly improving both the stability and accuracy of the results. Compared to previous non-ensemble methods in representation learning and post-processing, our method fully leverages the diversity of multiple representations to accurately identify spatial domains, showing superior robustness and adaptability. All code and public datasets used in this paper are available at https://github.com/wenwenmin/SpatialCVGAE .</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142828461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepPD: A Deep Learning Method for Predicting Peptide Detectability Based on Multi-feature Representation and Information Bottleneck.","authors":"Fenglin Li, Yannan Bin, Jianping Zhao, Chunhou Zheng","doi":"10.1007/s12539-024-00665-4","DOIUrl":"https://doi.org/10.1007/s12539-024-00665-4","url":null,"abstract":"<p><p>Peptide detectability measures the relationship between the protein composition and abundance in the sample and the peptides identified during the analytical procedure. This relationship has significant implications for the fundamental tasks of proteomics. Existing methods primarily rely on a single type of feature representation, which limits their ability to capture the intricate and diverse characteristics of peptides. In response to this limitation, we introduce DeepPD, an innovative deep learning framework incorporating multi-feature representation and the information bottleneck principle (IBP) to predict peptide detectability. DeepPD extracts semantic information from peptides using evolutionary scale modeling 2 (ESM-2) and integrates sequence and evolutionary information to construct the feature space collaboratively. The IBP effectively guides the feature learning process, minimizing redundancy in the feature space. Experimental results across various datasets demonstrate that DeepPD outperforms state-of-the-art methods. Furthermore, we demonstrate that DeepPD exhibits competitive generalization and transfer learning capabilities across diverse datasets and species. In conclusion, DeepPD emerges as the most effective method for predicting peptide detectability, showcasing its potential applicability to other protein sequence prediction tasks.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142806788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Repurposing Drugs for Infectious Diseases by Graph Convolutional Network with Sensitivity-Based Graph Reduction.","authors":"Rongting Yue, Abhishek Dutta","doi":"10.1007/s12539-024-00672-5","DOIUrl":"https://doi.org/10.1007/s12539-024-00672-5","url":null,"abstract":"<p><p>Computational systems biology employs computational algorithms and integrates diverse data sources, such as gene expression profiles, molecular interactions, and network modeling, to identify promising drug candidates through repurposing existing compounds in response to urgent healthcare needs. This study tackles the urgent need for rapid therapeutic development against emerging infectious diseases. We introduce a novel analytic expression for sensitivity analysis based on the Kronecker product and enhance model prediction performance using Graph Convolutional Networks (GCNs) with sensitivity-based graph reduction. Our algorithm refines prediction performance by leveraging sensitivity-based graph reduction. By integrating RNA-seq data, molecular interactions, and GCNs, we identify disease-related genes and pathways, construct heterogeneous graph models, and predict potential drugs. This approach involves novel analytical expressions that assess sensitivity to model loss, employing the Kronecker product approach. Subgraph analysis identifies nodes for removal, leading to a refined graph used for model retraining. This cost-effective pipeline focuses on computational methods for drug repurposing, targeting infectious diseases such as Zika virus and COVID-19 infection. Applied to these infections, our methodology integrates 659 proteins and 703 drugs for Zika virus, and 495 proteins and 468 drugs for COVID-19, along with their interactions derived from gene expression profiles. Top candidate drugs, such as Betamethasone phosphate and Bizelesin for Zika virus, and Chloroquine, Heparin Disaccharide, and Resveratrol for COVID-19, were validated through literature review or docking analysis. This scalable approach demonstrates promise in repurposing drugs for urgent healthcare challenges.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":""},"PeriodicalIF":3.9,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142768515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dapeng Xiong, Kaicheng U, Jianfeng Sun, Adam P Cribbs
{"title":"PLMC: Language Model of Protein Sequences Enhances Protein Crystallization Prediction.","authors":"Dapeng Xiong, Kaicheng U, Jianfeng Sun, Adam P Cribbs","doi":"10.1007/s12539-024-00639-6","DOIUrl":"10.1007/s12539-024-00639-6","url":null,"abstract":"<p><p>X-ray diffraction crystallography has been most widely used for protein three-dimensional (3D) structure determination for which whether proteins are crystallizable is a central prerequisite. Yet, there are a number of procedures during protein crystallization, including protein material production, purification, and crystal production, which take turns affecting the crystallization outcome. Due to the expensive and laborious nature of this multi-stage process, various computational tools have been developed to predict protein crystallization propensity, which is then used to guide the experimental determination. In this study, we presented a novel deep learning framework, PLMC, to improve multi-stage protein crystallization propensity prediction by leveraging a pre-trained protein language model. To effectively train PLMC, two groups of features of each protein were integrated into a more comprehensive representation, including protein language embeddings from the large-scale protein sequence database and a handcrafted feature set consisting of physicochemical, sequence-based and disordered-related information. These features were further separately embedded for refinement, and then concatenated for the final prediction. Notably, our extensive benchmarking tests demonstrate that PLMC greatly outperforms other state-of-the-art methods by achieving AUC scores of 0.773, 0.893, and 0.913, respectively, at the aforementioned individual stages, and 0.982 at the final crystallization stage. Furthermore, PLMC is shown to be superior for predicting the crystallization of both globular and membrane proteins, as demonstrated by an AUC score of 0.991 for the latter. These results suggest the significant potential of PLMC in assisting researchers with the experimental design of crystallizable protein variants.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"802-813"},"PeriodicalIF":3.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141999874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Faiqa Maqsood, Wang Zhenfei, Muhammad Mumtaz Ali, Baozhi Qiu, Naveed Ur Rehman, Fahad Sabah, Tahir Mahmood, Irfanud Din, Raheem Sarwar
{"title":"Artificial Intelligence-Based Classification of CT Images Using a Hybrid SpinalZFNet.","authors":"Faiqa Maqsood, Wang Zhenfei, Muhammad Mumtaz Ali, Baozhi Qiu, Naveed Ur Rehman, Fahad Sabah, Tahir Mahmood, Irfanud Din, Raheem Sarwar","doi":"10.1007/s12539-024-00649-4","DOIUrl":"10.1007/s12539-024-00649-4","url":null,"abstract":"<p><p>The kidney is an abdominal organ in the human body that supports filtering excess water and waste from the blood. Kidney diseases generally occur due to changes in certain supplements, medical conditions, obesity, and diet, which causes kidney function and ultimately leads to complications such as chronic kidney disease, kidney failure, and other renal disorders. Combining patient metadata with computed tomography (CT) images is essential to accurately and timely diagnosing such complications. Deep Neural Networks (DNNs) have transformed medical fields by providing high accuracy in complex tasks. However, the high computational cost of these models is a significant challenge, particularly in real-time applications. This paper proposed SpinalZFNet, a hybrid deep learning approach that integrates the architectural strengths of Spinal Network (SpinalNet) with the feature extraction capabilities of Zeiler and Fergus Network (ZFNet) to classify kidney disease accurately using CT images. This unique combination enhanced feature analysis, significantly improving classification accuracy while reducing the computational overhead. At first, the acquired CT images are pre-processed using a median filter, and the pre-processed image is segmented using Efficient Neural Network (ENet). Later, the images are augmented, and different features are extracted from the augmented CT images. The extracted features finally classify the kidney disease into normal, tumor, cyst, and stone using the proposed SpinalZFNet model. The SpinalZFNet outperformed other models, with 99.9% sensitivity, 99.5% specificity, precision 99.6%, 99.8% accuracy, and 99.7% F1-Score in classifying kidney disease.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"907-925"},"PeriodicalIF":3.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11512893/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142017327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Disease-Metabolite Associations Based on the Metapath Aggregation of Tripartite Heterogeneous Networks.","authors":"Wenzhi Liu, Pengli Lu","doi":"10.1007/s12539-024-00645-8","DOIUrl":"10.1007/s12539-024-00645-8","url":null,"abstract":"<p><p>The exploration of the interactions between diseases and metabolites holds significant implications for the diagnosis and treatment of diseases. However, traditional experimental methods are time-consuming and costly, and current computational methods often overlook the influence of other biological entities on both. In light of these limitations, we proposed a novel deep learning model based on metapath aggregation of tripartite heterogeneous networks (MAHN) to explore disease-related metabolites. Specifically, we introduced microbes to construct a tripartite heterogeneous network and employed graph convolutional network and enhanced GraphSAGE to learn node features with metapath length 3. Additionally, we utilized node-level and semantic-level attention mechanisms, a more granular approach, to aggregate node features with metapath length 2. Finally, the reconstructed association probability is obtained by fusing features from different metapaths into the bilinear decoder. The experiments demonstrate that the proposed MAHN model achieved superior performance in five-fold cross-validation with Acc (91.85%), Pre (90.48%), Recall (93.53%), F1 (91.94%), AUC (97.39%), and AUPR (97.47%), outperforming four state-of-the-art algorithms. Case studies on two complex diseases, irritable bowel syndrome and obesity, further validate the predictive results, and the MAHN model is a trustworthy prediction tool for discovering potential metabolites. Moreover, deep learning models integrating multi-omics data represent the future mainstream direction for predicting disease-related biological entities.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"829-843"},"PeriodicalIF":3.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141901633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Viral Rebound After Antiviral Treatment: A Mathematical Modeling Study of the Role of Antiviral Mechanism of Action.","authors":"Aubrey Chiarelli, Hana Dobrovolny","doi":"10.1007/s12539-024-00643-w","DOIUrl":"10.1007/s12539-024-00643-w","url":null,"abstract":"<p><p>The development of antiviral treatments for SARS-CoV-2 was an important turning point for the pandemic. Availability of safe and effective antivirals has allowed people to return back to normal life. While SARS-CoV-2 antivirals are highly effective at preventing severe disease, there have been concerning reports of viral rebound in some patients after cessation of antiviral treatment. In this study, we use a mathematical model of viral infection to study the potential of different antivirals to prevent viral rebound. We find that antivirals that block production are most likely to result in viral rebound if the treatment time course is not sufficiently long. Since these antivirals do not prevent infection of cells, cells continue to be infected during treatment. When treatment is stopped, the infected cells will begin producing virus at the usual rate. Antivirals that prevent infection of cells are less likely to result in viral rebound since cells are not being infected during treatment. This study highlights the role of antiviral mechanism of action in increasing or reducing the probability of viral rebound.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"844-853"},"PeriodicalIF":3.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141734033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPJA-Net: A Lightweight End-to-End Network for Sleep Stage Prediction Based on Feature Pyramid and Joint Attention.","authors":"Zhi Liu, Qinhan Zhang, Sixin Luo, Meiqiao Qin","doi":"10.1007/s12539-024-00636-9","DOIUrl":"10.1007/s12539-024-00636-9","url":null,"abstract":"<p><p>Sleep staging is the most crucial work before diagnosing and treating sleep disorders. Traditional manual sleep staging is time-consuming and depends on the skill of experts. Nowadays, automatic sleep staging based on deep learning attracts more and more scientific researchers. As we know, the salient waves in sleep signals contain the most important information for automatic sleep staging. However, the key information is not fully utilized in existing deep learning methods since most of them only use CNN or RNN which could not capture multi-scale features in salient waves effectively. To tackle this limitation, we propose a lightweight end-to-end network for sleep stage prediction based on feature pyramid and joint attention. The feature pyramid module is designed to effectively extract multi-scale features in salient waves, and these features are then fed to the joint attention module to closely attend to the channel and location information of the salient waves. The proposed network has much fewer parameters and significant performance improvement, which is better than the state-of-the-art results. The overall accuracy and macro F1 score on the public dataset Sleep-EDF39, Sleep-EDF153 and SHHS are 90.1%, 87.8%, 87.4%, 84.4% and 86.9%, 83.9%, respectively. Ablation experiments confirm the effectiveness of each module.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"769-780"},"PeriodicalIF":3.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141999873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Function-Genes and Disease-Genes Prediction Based on Network Embedding and One-Class Classification.","authors":"Weiyu Shi, Yan Zhang, Yeqing Sun, Zhengkui Lin","doi":"10.1007/s12539-024-00638-7","DOIUrl":"10.1007/s12539-024-00638-7","url":null,"abstract":"<p><p>Using genes which have been experimentally-validated for diseases (functions) can develop machine learning methods to predict new disease/function-genes. However, the prediction of both function-genes and disease-genes faces the same problem: there are only certain positive examples, but no negative examples. To solve this problem, we proposed a function/disease-genes prediction algorithm based on network embedding (Variational Graph Auto-Encoders, VGAE) and one-class classification (Fast Minimum Covariance Determinant, Fast-MCD): VGAEMCD. Firstly, we constructed a protein-protein interaction (PPI) network centered on experimentally-validated genes; then VGAE was used to get the embeddings of nodes (genes) in the network; finally, the embeddings were input into the improved deep learning one-class classifier based on Fast-MCD to predict function/disease-genes. VGAEMCD can predict function-gene and disease-gene in a unified way, and only the experimentally-verified genes are needed to provide (no need for expression profile). VGAEMCD outperforms classical one-class classification algorithms in Recall, Precision, F-measure, Specificity, and Accuracy. Further experiments show that seven metrics of VGAEMCD are higher than those of state-of-art function/disease-genes prediction algorithms. The above results indicate that VGAEMCD can well learn the distribution characteristics of positive examples and accurately identify function/disease-genes.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"781-801"},"PeriodicalIF":3.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142125655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adap-BDCM: Adaptive Bilinear Dynamic Cascade Model for Classification Tasks on CNV Datasets.","authors":"Liancheng Jiang, Liye Jia, Yizhen Wang, Yongfei Wu, Junhong Yue","doi":"10.1007/s12539-024-00635-w","DOIUrl":"10.1007/s12539-024-00635-w","url":null,"abstract":"<p><p>Copy number variation (CNV) is an essential genetic driving factor of cancer formation and progression, making intelligent classification based on CNV feasible. However, there are a few challenges in the current machine learning and deep learning methods, such as the design of base classifier combination schemes in ensemble methods and the selection of layers of neural networks, which often result in low accuracy. Therefore, an adaptive bilinear dynamic cascade model (Adap-BDCM) is developed to further enhance the accuracy and applicability of these methods for intelligent classification on CNV datasets. In this model, a feature selection module is introduced to mitigate the interference of redundant information, and a bilinear model based on the gated attention mechanism is proposed to extract more beneficial deep fusion features. Furthermore, an adaptive base classifier selection scheme is designed to overcome the difficulty of manually designing base classifier combinations and enhance the applicability of the model. Lastly, a novel feature fusion scheme with an attribute recall submodule is constructed, effectively avoiding getting stuck in local solutions and missing some valuable information. Numerous experiments have demonstrated that our Adap-BDCM model exhibits optimal performance in cancer classification, stage prediction, and recurrence on CNV datasets. This study can assist physicians in making diagnoses faster and better.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"1019-1037"},"PeriodicalIF":3.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140956486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}