{"title":"Gtie-Rt: A comprehensive graph learning model for predicting drugs targeting metabolic pathways in human.","authors":"Hayat Ali Shah, Juan Liu, Zhihui Yang","doi":"10.1142/S0219720024500100","DOIUrl":"10.1142/S0219720024500100","url":null,"abstract":"<p><p>Drugs often target specific metabolic pathways to produce a therapeutic effect. However, these pathways are complex and interconnected, making it challenging to predict a drug's potential effects on an organism's overall metabolism. The mapping of drugs with targeting metabolic pathways in the organisms can provide a more complete understanding of the metabolic effects of a drug and help to identify potential drug-drug interactions. In this study, we proposed a machine learning hybrid model Graph Transformer Integrated Encoder (GTIE-RT) for mapping drugs to target metabolic pathways in human. The proposed model is a composite of a Graph Convolution Network (GCN) and transformer encoder for graph embedding and attention mechanism. The output of the transformer encoder is then fed into the Extremely Randomized Trees Classifier to predict target metabolic pathways. The evaluation of the GTIE-RT on drugs dataset demonstrates excellent performance metrics, including accuracy (>95%), recall (>92%), precision (>93%) and F1-score (>92%). Compared to other variants and machine learning methods, GTIE-RT consistently shows more reliable results.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450010"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141727966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of transcript regulation mechanism prediction models based on binding motif environment of transcription factor AoXlnR in <i>Aspergillus oryzae</i>.","authors":"Hiroya Oka, Takaaki Kojima, Ryuji Kato, Kunio Ihara, Hideo Nakano","doi":"10.1142/S0219720024500173","DOIUrl":"10.1142/S0219720024500173","url":null,"abstract":"<p><p>DNA-binding transcription factors (TFs) play a central role in transcriptional regulation mechanisms, mainly through their specific binding to target sites on the genome and regulation of the expression of downstream genes. Therefore, a comprehensive analysis of the function of these TFs will lead to the understanding of various biological mechanisms. However, the functions of TFs <i>in vivo</i> are diverse and complicated, and the identified binding sites on the genome are not necessarily involved in the regulation of downstream gene expression. In this study, we investigated whether DNA structural information around the binding site of TFs can be used to predict the involvement of the binding site in the regulation of the expression of genes located downstream of the binding site. Specifically, we calculated the structural parameters based on the DNA shape around the DNA binding motif located upstream of the gene whose expression is directly regulated by one TF AoXlnR from <i>Aspergillus oryzae</i>, and showed that the presence or absence of expression regulation can be predicted from the sequence information with high accuracy ([Formula: see text]-1.0) by machine learning incorporating these parameters.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"22 3","pages":"2450017"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141761969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NDMNN: A novel deep residual network based MNN method to remove batch effects from scRNA-seq data.","authors":"Yupeng Ma, Yongzhen Pei","doi":"10.1142/S021972002450015X","DOIUrl":"10.1142/S021972002450015X","url":null,"abstract":"<p><p>The rapid development of single-cell RNA sequencing (scRNA-seq) technology has generated vast amounts of data. However, these data often exhibit batch effects due to various factors such as different time points, experimental personnel, and instruments used, which can obscure the biological differences in the data itself. Based on the characteristics of scRNA-seq data, we designed a dense deep residual network model, referred to as NDnetwork. Subsequently, we combined the NDnetwork model with the MNN method to correct batch effects in scRNA-seq data, and named it the NDMNN method. Comprehensive experimental results demonstrate that the NDMNN method outperforms existing commonly used methods for correcting batch effects in scRNA-seq data. As the scale of single-cell sequencing continues to expand, we believe that NDMNN will be a valuable tool for researchers in the biological community for correcting batch effects in their studies. The source code and experimental results of the NDMNN method can be found at https://github.com/mustang-hub/NDMNN.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450015"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141735374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How much can ChatGPT really help computational biologists in programming?","authors":"Chowdhury Rafeed Rahman, Limsoon Wong","doi":"10.1142/S021972002471001X","DOIUrl":"10.1142/S021972002471001X","url":null,"abstract":"<p><p>ChatGPT, a recently developed product by openAI, is successfully leaving its mark as a multi-purpose natural language based chatbot. In this paper, we are more interested in analyzing its potential in the field of computational biology. A major share of work done by computational biologists these days involve coding up bioinformatics algorithms, analyzing data, creating pipelining scripts and even machine learning modeling and feature extraction. This paper focuses on the potential influence (both positive and negative) of ChatGPT in the mentioned aspects with illustrative examples from different perspectives. Compared to other fields of computer science, computational biology has (1) less coding resources, (2) more sensitivity and bias issues (deals with medical data), and (3) more necessity of coding assistance (people from diverse background come to this field). Keeping such issues in mind, we cover use cases such as code writing, reviewing, debugging, converting, refactoring, and pipelining using ChatGPT from the perspective of computational biologists in this paper.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2471001"},"PeriodicalIF":1.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141082392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive Recognition of DNA-binding proteins based on Pre-trained Language Model BERT","authors":"Yue Ma, Yongzhen Pei, Changguo Li","doi":"10.1142/s0219720023500282","DOIUrl":"https://doi.org/10.1142/s0219720023500282","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"185 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139011307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Imputation for single-cell RNA-seq data with non-negative matrix factorization and transfer learning","authors":"Jiadi Zhu, Youlong Yang","doi":"10.1142/s0219720023500294","DOIUrl":"https://doi.org/10.1142/s0219720023500294","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"65 2","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139011371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithms for the Uniqueness of the Longest Common Subsequence.","authors":"Yue Wang","doi":"10.1142/S0219720023500270","DOIUrl":"10.1142/S0219720023500270","url":null,"abstract":"<p><p>Given several number sequences, determining the longest common subsequence is a classical problem in computer science. This problem has applications in bioinformatics, especially determining transposable genes. Nevertheless, related works only consider how to find one longest common subsequence. In this paper, we consider how to determine the uniqueness of the longest common subsequence. If there are multiple longest common subsequences, we also determine which number appears in all/some/none of the longest common subsequences. We focus on four scenarios: (1) linear sequences without duplicated numbers; (2) circular sequences without duplicated numbers; (3) linear sequences with duplicated numbers; (4) circular sequences with duplicated numbers. We develop corresponding algorithms and apply them to gene sequencing data.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2350027"},"PeriodicalIF":0.9,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139425753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Small groups in multidimensional feature space: two examples of supervised two-group classification from biomedicine","authors":"Dmitriy Karpenko, Aleksei Bigildeev","doi":"10.1142/s0219720023500257","DOIUrl":"https://doi.org/10.1142/s0219720023500257","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"115 21","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135541469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chengyou Li, Shiqiang Fan, Haiyong Zhao, Xiaotong Liu
{"title":"CNV-FB: A Feature bagging strategy-based approach to detect copy number variants from NGS data","authors":"Chengyou Li, Shiqiang Fan, Haiyong Zhao, Xiaotong Liu","doi":"10.1142/s0219720023500269","DOIUrl":"https://doi.org/10.1142/s0219720023500269","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"115 22","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135541468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}