{"title":"NDMNN: A novel deep residual network based MNN method to remove batch effects from scRNA-seq data.","authors":"Yupeng Ma, Yongzhen Pei","doi":"10.1142/S021972002450015X","DOIUrl":"10.1142/S021972002450015X","url":null,"abstract":"<p><p>The rapid development of single-cell RNA sequencing (scRNA-seq) technology has generated vast amounts of data. However, these data often exhibit batch effects due to various factors such as different time points, experimental personnel, and instruments used, which can obscure the biological differences in the data itself. Based on the characteristics of scRNA-seq data, we designed a dense deep residual network model, referred to as NDnetwork. Subsequently, we combined the NDnetwork model with the MNN method to correct batch effects in scRNA-seq data, and named it the NDMNN method. Comprehensive experimental results demonstrate that the NDMNN method outperforms existing commonly used methods for correcting batch effects in scRNA-seq data. As the scale of single-cell sequencing continues to expand, we believe that NDMNN will be a valuable tool for researchers in the biological community for correcting batch effects in their studies. The source code and experimental results of the NDMNN method can be found at https://github.com/mustang-hub/NDMNN.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2450015"},"PeriodicalIF":0.9,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141735374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How much can ChatGPT really help computational biologists in programming?","authors":"Chowdhury Rafeed Rahman, Limsoon Wong","doi":"10.1142/S021972002471001X","DOIUrl":"10.1142/S021972002471001X","url":null,"abstract":"<p><p>ChatGPT, a recently developed product by openAI, is successfully leaving its mark as a multi-purpose natural language based chatbot. In this paper, we are more interested in analyzing its potential in the field of computational biology. A major share of work done by computational biologists these days involve coding up bioinformatics algorithms, analyzing data, creating pipelining scripts and even machine learning modeling and feature extraction. This paper focuses on the potential influence (both positive and negative) of ChatGPT in the mentioned aspects with illustrative examples from different perspectives. Compared to other fields of computer science, computational biology has (1) less coding resources, (2) more sensitivity and bias issues (deals with medical data), and (3) more necessity of coding assistance (people from diverse background come to this field). Keeping such issues in mind, we cover use cases such as code writing, reviewing, debugging, converting, refactoring, and pipelining using ChatGPT from the perspective of computational biologists in this paper.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2471001"},"PeriodicalIF":1.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141082392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive Recognition of DNA-binding proteins based on Pre-trained Language Model BERT","authors":"Yue Ma, Yongzhen Pei, Changguo Li","doi":"10.1142/s0219720023500282","DOIUrl":"https://doi.org/10.1142/s0219720023500282","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"185 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139011307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Imputation for single-cell RNA-seq data with non-negative matrix factorization and transfer learning","authors":"Jiadi Zhu, Youlong Yang","doi":"10.1142/s0219720023500294","DOIUrl":"https://doi.org/10.1142/s0219720023500294","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"65 2","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139011371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithms for the Uniqueness of the Longest Common Subsequence.","authors":"Yue Wang","doi":"10.1142/S0219720023500270","DOIUrl":"10.1142/S0219720023500270","url":null,"abstract":"<p><p>Given several number sequences, determining the longest common subsequence is a classical problem in computer science. This problem has applications in bioinformatics, especially determining transposable genes. Nevertheless, related works only consider how to find one longest common subsequence. In this paper, we consider how to determine the uniqueness of the longest common subsequence. If there are multiple longest common subsequences, we also determine which number appears in all/some/none of the longest common subsequences. We focus on four scenarios: (1) linear sequences without duplicated numbers; (2) circular sequences without duplicated numbers; (3) linear sequences with duplicated numbers; (4) circular sequences with duplicated numbers. We develop corresponding algorithms and apply them to gene sequencing data.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2350027"},"PeriodicalIF":1.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139425753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Small groups in multidimensional feature space: two examples of supervised two-group classification from biomedicine","authors":"Dmitriy Karpenko, Aleksei Bigildeev","doi":"10.1142/s0219720023500257","DOIUrl":"https://doi.org/10.1142/s0219720023500257","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"115 21","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135541469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CNV-FB: A Feature bagging strategy-based approach to detect copy number variants from NGS data","authors":"Chengyou Li, Shiqiang Fan, Haiyong Zhao, Xiaotong Liu","doi":"10.1142/s0219720023500269","DOIUrl":"https://doi.org/10.1142/s0219720023500269","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"115 22","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135541468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing omics data by feature combinations based on kernel functions.","authors":"Chao Li, Tianxiang Wang, Xiaohui Lin","doi":"10.1142/S021972002350021X","DOIUrl":"10.1142/S021972002350021X","url":null,"abstract":"<p><p>Defining meaningful feature (molecule) combinations can enhance the study of disease diagnosis and prognosis. However, feature combinations are complex and various in biosystems, and the existing methods examine the feature cooperation in a single, fixed pattern for all feature pairs, such as linear combination. To identify the appropriate combination between two features and evaluate feature combination more comprehensively, this paper adopts kernel functions to study feature relationships and proposes a new omics data analysis method KF-[Formula: see text]-TSP. Besides linear combination, KF-[Formula: see text]-TSP also explores the nonlinear combination of features, and allows hybridizing multiple kernel functions to evaluate feature interaction from multiple views. KF-[Formula: see text]-TSP selects [Formula: see text] > 0 top-scoring pairs to build an ensemble classifier. Experimental results show that KF-[Formula: see text]-TSP with multiple kernel functions which evaluates feature combinations from multiple views is better than that with only one kernel function. Meanwhile, KF-[Formula: see text]-TSP performs better than TSP family algorithms and the previous methods based on conversion strategy in most cases. It performs similarly to the popular machine learning methods in omics data analysis, but involves fewer feature pairs. In the procedure of physiological and pathological changes, molecular interactions can be both linear and nonlinear. Hence, KF-[Formula: see text]-TSP, which can measure molecular combination from multiple perspectives, can help to mine information closely related to physiological and pathological changes and study disease mechanism.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"1 1","pages":"2350021"},"PeriodicalIF":1.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41358214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Methods for cell-type annotation on scRNA-seq data: A recent overview.","authors":"Konstantinos Lazaros, Panagiotis Vlamos, Aristidis G Vrahatis","doi":"10.1142/S0219720023400024","DOIUrl":"10.1142/S0219720023400024","url":null,"abstract":"<p><p>The evolution of single-cell technology is ongoing, continually generating massive amounts of data that reveal many mysteries surrounding intricate diseases. However, their drawbacks continue to constrain us. Among these, annotating cell types in single-cell gene expressions pose a substantial challenge, despite the myriad of tools at our disposal. The rapid growth in data, resources, and tools has consequently brought about significant alterations in this area over the years. In our study, we spotlight all note-worthy cell type annotation techniques developed over the past four years. We provide an overview of the latest trends in this field, showcasing the most advanced methods in taxonomy. Our research underscores the demand for additional tools that incorporate a biological context and also predicts that the rising trend of graph neural network approaches will likely lead this research field in the coming years.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":" ","pages":"2340002"},"PeriodicalIF":1.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41155989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}