{"title":"The use of 4D data-independent acquisition-based proteomic analysis and machine learning to reveal potential biomarkers for stress levels.","authors":"Dehua Chen, Yongsheng Yang, Dongdong Shi, Zhenhua Zhang, Mei Wang, Qiao Pan, Jianwen Su, Zhen Wang","doi":"10.1142/S0219720024500252","DOIUrl":"https://doi.org/10.1142/S0219720024500252","url":null,"abstract":"<p><p>Research suggests that individuals who experience prolonged exposure to stress may be at higher risk for developing psychological stress disorders. Currently, psychological stress is primarily evaluated by professional physicians using rating scales, which may be prone to subjective biases and limitations of the scales. Therefore, it is imperative to explore more objective, accurate, and efficient biomarkers for evaluating the level of psychological stress in an individual. In this study, we utilized 4D data-independent acquisition (4D-DIA) proteomics for quantitative protein analysis, and then employed a support vector machine (SVM) combined with the SHAP interpretation algorithm to identify potential biomarkers for psychological stress levels. Biomarker validation was subsequently achieved through machine learning classification and a substantial amount of a priori knowledge derived from a knowledge graph. We performed cross-validation of the biomarkers using two batches of data, and the results showed that the combination of Glyceraldehyde-3-phosphate dehydrogenase and Fibronectin yielded an average area under the curve (AUC) of 92%, an average accuracy of 86%, an average F1 score of 79%, and an average sensitivity of 83%. Therefore, this combination may represent a potential approach for detecting stress levels to prevent psychological stress disorders.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450025"},"PeriodicalIF":0.9,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142639951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of a multi-tissue compound-target interaction network of Qingfei Paidu decoction in COVID-19 treatment based on deep learning and transcriptomic analysis.","authors":"Xia Li, Xuetong Zhao, Xinjian Yu, Jianping Zhao, Xiangdong Fang","doi":"10.1142/S0219720024500161","DOIUrl":"10.1142/S0219720024500161","url":null,"abstract":"<p><p>The Qingfei Paidu decoction (QFPDD) is a widely acclaimed therapeutic formula employed nationwide for the clinical management of coronavirus disease 2019 (COVID-19). QFPDD exerts a synergistic therapeutic effect, characterized by its multi-component, multi-target, and multi-pathway action. However, the intricate interactions among the ingredients and targets within QFPDD and their systematic effects in multiple tissues remain undetermined. To address this, we qualitatively characterized the chemical components of QFPDD. We integrated multi-tissue transcriptomic analysis with GraphDTA, a deep learning model, to screen for potential compound-target interactions of QFPDD in multiple tissues. We predicted 13 key active compounds, 127 potential targets and 27 pathways associated with QFPDD across six different tissues. Notably, oleanolic acid-AXL exhibited leading affinity in the heart, blood, and liver. Molecular docking and molecular dynamics simulation confirmed their strong binding affinity. The robust interaction between oleanolic acid and the AXL receptor suggests that AXL is a promising target for developing clinical intervention strategies. Through the construction of a multi-tissue compound-target interaction network, our study further elucidated the mechanisms through which QFPDD effectively combats COVID-19 in multiple tissues. Our work also establishes a framework for future investigations into the systemic effects of other Traditional Chinese Medicine (TCM) formulas in disease treatment.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450016"},"PeriodicalIF":0.9,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141735373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PCA-constrained multi-core matrix fusion network: A novel approach for cancer subtype identification.","authors":"Min Li, Zhifang Qi, Liang Liu, Mingzhu Lou, Shaobo Deng","doi":"10.1142/S0219720024500148","DOIUrl":"10.1142/S0219720024500148","url":null,"abstract":"<p><p>Cancer subtyping refers to categorizing a particular cancer type into distinct subtypes or subgroups based on a range of molecular characteristics, clinical manifestations, histological features, and other relevant factors. The identification of cancer subtypes can significantly enhance precision in clinical practice and facilitate personalized diagnosis and treatment strategies. Recent advancements in the field have witnessed the emergence of numerous network fusion methods aimed at identifying cancer subtypes. The majority of these fusion algorithms, however, solely rely on the fusion network of a single core matrix for the identification of cancer subtypes and fail to comprehensively capture similarity. To tackle this issue, in this study, we propose a novel cancer subtype recognition method, referred to as PCA-constrained multi-core matrix fusion network (PCA-MM-FN). The PCA-MM-FN algorithm initially employs three distinct methods to obtain three core matrices. Subsequently, the obtained core matrices are projected into a shared subspace using principal component analysis, followed by a weighted network fusion. Lastly, spectral clustering is conducted on the fused network. The results obtained from conducting experiments on the mRNA expression, DNA methylation, and miRNA expression of five TCGA datasets and three multi-omics benchmark datasets demonstrate that the proposed PCA-MM-FN approach exhibits superior accuracy in identifying cancer subtypes compared to the existing methods.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450014"},"PeriodicalIF":0.9,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142057039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integration of autoencoder and graph convolutional network for predicting breast cancer drug response.","authors":"V Abinas, U Abhinav, E M Haneem, A Vishnusankar, K A Abdul Nazeer","doi":"10.1142/S0219720024500136","DOIUrl":"https://doi.org/10.1142/S0219720024500136","url":null,"abstract":"<p><p><b>Background and objectives:</b> Breast cancer is the most prevalent type of cancer among women. The effectiveness of anticancer pharmacological therapy may be adversely affected by tumor heterogeneity that includes genetic and transcriptomic features. This leads to clinical variability in patient response to therapeutic drugs. Anticancer drug design and cancer understanding require precise identification of cancer drug responses. The performance of drug response prediction models can be improved by integrating multi-omics data and drug structure data. <b>Methods:</b> In this paper, we propose an Autoencoder (AE) and Graph Convolutional Network (AGCN) for drug response prediction, which integrates multi-omics data and drug structure data. Specifically, we first convert the high-dimensional representation of each omics data set to a lower-dimensional representation using a separate AE for each set. Subsequently, these individual features are combined with drug structure data obtained using a Graph Convolutional Network and given to a Convolutional Neural Network to calculate IC50 values for every combination of cell lines and drugs. Then a threshold IC50 value is obtained for each drug by performing K-means clustering of its known IC50 values. Finally, with the help of this threshold value, cell lines are classified as either sensitive or resistant to each drug. <b>Results:</b> Experimental results indicate that AGCN has an accuracy of 0.82 and performs better than many existing methods. In addition, we externally validated AGCN using data from The Cancer Genome Atlas (TCGA) clinical database and obtained an accuracy of 0.91. <b>Conclusion:</b> According to the results obtained, concatenating multi-omics data with drug structure data using AGCN for drug response prediction tasks greatly improves the accuracy of the prediction task.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 3","pages":"2450013"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141761970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gtie-Rt: A comprehensive graph learning model for predicting drugs targeting metabolic pathways in human.","authors":"Hayat Ali Shah, Juan Liu, Zhihui Yang","doi":"10.1142/S0219720024500100","DOIUrl":"10.1142/S0219720024500100","url":null,"abstract":"<p><p>Drugs often target specific metabolic pathways to produce a therapeutic effect. However, these pathways are complex and interconnected, making it challenging to predict a drug's potential effects on an organism's overall metabolism. Mapping drugs to the metabolic pathways they target in an organism can provide a more complete understanding of the metabolic effects of a drug and help to identify potential drug-drug interactions. In this study, we proposed a hybrid machine learning model, Graph Transformer Integrated Encoder (GTIE-RT), for mapping drugs to target metabolic pathways in humans. The proposed model is a composite of a Graph Convolution Network (GCN) and a transformer encoder for graph embedding and attention mechanism. The output of the transformer encoder is then fed into the Extremely Randomized Trees Classifier to predict target metabolic pathways. Evaluation of GTIE-RT on a drug dataset demonstrates excellent performance metrics, including accuracy (>95%), recall (>92%), precision (>93%) and F1-score (>92%). Compared to other variants and machine learning methods, GTIE-RT consistently shows more reliable results.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450010"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141727966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of transcript regulation mechanism prediction models based on binding motif environment of transcription factor AoXlnR in <i>Aspergillus oryzae</i>.","authors":"Hiroya Oka, Takaaki Kojima, Ryuji Kato, Kunio Ihara, Hideo Nakano","doi":"10.1142/S0219720024500173","DOIUrl":"10.1142/S0219720024500173","url":null,"abstract":"<p><p>DNA-binding transcription factors (TFs) play a central role in transcriptional regulation mechanisms, mainly through their specific binding to target sites on the genome and regulation of the expression of downstream genes. Therefore, a comprehensive analysis of the function of these TFs will lead to the understanding of various biological mechanisms. However, the functions of TFs <i>in vivo</i> are diverse and complicated, and the identified binding sites on the genome are not necessarily involved in the regulation of downstream gene expression. In this study, we investigated whether DNA structural information around the binding site of TFs can be used to predict the involvement of the binding site in the regulation of the expression of genes located downstream of the binding site. Specifically, we calculated the structural parameters based on the DNA shape around the DNA binding motif located upstream of the gene whose expression is directly regulated by one TF, AoXlnR from <i>Aspergillus oryzae</i>, and showed that the presence or absence of expression regulation can be predicted from the sequence information with high accuracy ([Formula: see text]-1.0) by machine learning incorporating these parameters.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 3","pages":"2450017"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141761969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NDMNN: A novel deep residual network based MNN method to remove batch effects from scRNA-seq data.","authors":"Yupeng Ma, Yongzhen Pei","doi":"10.1142/S021972002450015X","DOIUrl":"10.1142/S021972002450015X","url":null,"abstract":"<p><p>The rapid development of single-cell RNA sequencing (scRNA-seq) technology has generated vast amounts of data. However, these data often exhibit batch effects due to various factors such as different time points, experimental personnel, and instruments used, which can obscure the biological differences in the data itself. Based on the characteristics of scRNA-seq data, we designed a dense deep residual network model, referred to as NDnetwork. Subsequently, we combined the NDnetwork model with the MNN method to correct batch effects in scRNA-seq data, and named it the NDMNN method. Comprehensive experimental results demonstrate that the NDMNN method outperforms existing commonly used methods for correcting batch effects in scRNA-seq data. As the scale of single-cell sequencing continues to expand, we believe that NDMNN will be a valuable tool for researchers in the biological community for correcting batch effects in their studies. The source code and experimental results of the NDMNN method can be found at https://github.com/mustang-hub/NDMNN.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450015"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141735374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How much can ChatGPT really help computational biologists in programming?","authors":"Chowdhury Rafeed Rahman, Limsoon Wong","doi":"10.1142/S021972002471001X","DOIUrl":"10.1142/S021972002471001X","url":null,"abstract":"<p><p>ChatGPT, a recently developed product by OpenAI, is successfully leaving its mark as a multi-purpose natural language based chatbot. In this paper, we are interested in analyzing its potential in the field of computational biology. A major share of work done by computational biologists these days involves coding up bioinformatics algorithms, analyzing data, creating pipelining scripts and even machine learning modeling and feature extraction. This paper focuses on the potential influence (both positive and negative) of ChatGPT in the mentioned aspects with illustrative examples from different perspectives. Compared to other fields of computer science, computational biology has (1) fewer coding resources, (2) more sensitivity and bias issues (it deals with medical data), and (3) a greater need for coding assistance (people from diverse backgrounds come to this field). Keeping such issues in mind, we cover use cases such as code writing, reviewing, debugging, converting, refactoring, and pipelining using ChatGPT from the perspective of computational biologists in this paper.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2471001"},"PeriodicalIF":1.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141082392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive Recognition of DNA-binding proteins based on Pre-trained Language Model BERT","authors":"Yue Ma, Yongzhen Pei, Changguo Li","doi":"10.1142/s0219720023500282","DOIUrl":"https://doi.org/10.1142/s0219720023500282","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"185 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139011307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Biology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}