Journal of Bioinformatics and Computational Biology最新文献_第10页

DeepBtoD: Improved RNA-binding proteins prediction via integrated deep learning DeepBtoD:通过集成深度学习改进rna结合蛋白预测

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-04-21 DOI: 10.1142/S0219720022500068

Xiuquan Du, Xiu-juan Zhao, Yanping Zhang

{"title":"DeepBtoD: Improved RNA-binding proteins prediction via integrated deep learning","authors":"Xiuquan Du, Xiu-juan Zhao, Yanping Zhang","doi":"10.1142/S0219720022500068","DOIUrl":"https://doi.org/10.1142/S0219720022500068","url":null,"abstract":"RNA-binding proteins (RBPs) have crucial roles in various cellular processes such as alternative splicing and gene regulation. Therefore, the analysis and identification of RBPs is an essential issue. However, although many computational methods have been developed for predicting RBPs, a few studies simultaneously consider local and global information from the perspective of the RNA sequence. Facing this challenge, we present a novel method called DeepBtoD, which predicts RBPs directly from RNA sequences. First, a [Formula: see text]-BtoD encoding is designed, which takes into account the composition of [Formula: see text]-nucleotides and their relative positions and forms a local module. Second, we designed a multi-scale convolutional module embedded with a self-attentive mechanism, the ms-focusCNN, which is used to further learn more effective, diverse, and discriminative high-level features. Finally, global information is considered to supplement local modules with ensemble learning to predict whether the target RNA binds to RBPs. Our preliminary 24 independent test datasets show that our proposed method can classify RBPs with the area under the curve of 0.933. Remarkably, DeepBtoD shows competitive results across seven state-of-the-art methods, suggesting that RBPs can be highly recognized by integrating local [Formula: see text]-BtoD and global information only from RNA sequences. Hence, our integrative method may be useful to improve the power of RBPs prediction, which might be particularly useful for modeling protein-nucleic acid interactions in systems biology studies. Our DeepBtoD server can be accessed at http://175.27.228.227/DeepBtoD/.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"1 1","pages":"2250006"},"PeriodicalIF":1.0,"publicationDate":"2022-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42540334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

RPfam: A refiner towards curated-like multiple sequence alignments of the Pfam protein families RPfam：Pfam蛋白家族的精细化多序列比对

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-04-14 DOI: 10.1142/S0219720022400029

Qingting Wei, Hong Zou, Cuncong Zhong, Jianfeng Xu

引用次数: 0

Analysis to determine the effect of mutations on binding to small chemical molecules 分析以确定突变对与小化学分子结合的影响

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-04-14 DOI: 10.1142/S0219720022400030

T. Koshlan, K. Kulikov

引用次数: 0

Clinical drug response prediction from preclinical cancer cell lines by logistic matrix factorization approach. logistic矩阵分解法预测临床前癌细胞的临床药物反应。

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-04-01 Epub Date: 2021-12-17 DOI: 10.1142/S0219720021500359

Akram Emdadi, Changiz Eslahchi

{"title":"Clinical drug response prediction from preclinical cancer cell lines by logistic matrix factorization approach.","authors":"Akram Emdadi, Changiz Eslahchi","doi":"10.1142/S0219720021500359","DOIUrl":"https://doi.org/10.1142/S0219720021500359","url":null,"abstract":"Predicting tumor drug response using cancer cell line drug response values for a large number of anti-cancer drugs is a significant challenge in personalized medicine. Predicting patient response to drugs from data obtained from preclinical models is made easier by the availability of different knowledge on cell lines and drugs. This paper proposes the TCLMF method, a predictive model for predicting drug response in tumor samples that was trained on preclinical samples and is based on the logistic matrix factorization approach. The TCLMF model is designed based on gene expression profiles, tissue type information, the chemical structure of drugs and drug sensitivity (IC 50) data from cancer cell lines. We use preclinical data from the Genomics of Drug Sensitivity in Cancer dataset (GDSC) to train the proposed drug response model, which we then use to predict drug sensitivity of samples from the Cancer Genome Atlas (TCGA) dataset. The TCLMF approach focuses on identifying successful features of cell lines and drugs in order to calculate the probability of the tumor samples being sensitive to drugs. The closest cell line neighbours for each tumor sample are calculated using a description of similarity between tumor samples and cell lines in this study. The drug response for a new tumor is then calculated by averaging the low-rank features obtained from its neighboring cell lines. We compare the results of the TCLMF model with the results of the previously proposed methods using two databases and two approaches to test the model's performance. In the first approach, 12 drugs with enough known clinical drug response, considered in previous methods, are studied. For 7 drugs out of 12, the TCLMF can significantly distinguish between patients that are resistance to these drugs and the patients that are sensitive to them. These approaches are converted to classification models using a threshold in the second approach, and the results are compared. The results demonstrate that the TCLMF method provides accurate predictions across the results of the other algorithms. Finally, we accurately classify tumor tissue type using the latent vectors obtained from TCLMF's logistic matrix factorization process. These findings demonstrate that the TCLMF approach produces effective latent vectors for tumor samples. The source code of the TCLMF method is available in https://github.com/emdadi/TCLMF.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 2","pages":"2150035"},"PeriodicalIF":1.0,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39614910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Optimized splitting of mixed-species RNA sequencing data. 混合物种 RNA 测序数据的优化分割。

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-04-01 Epub Date: 2022-01-06 DOI: 10.1142/S0219720022500019

Xuan Song, Hai Yun Gao, Karl Herrup, Ronald P Hart

{"title":"Optimized splitting of mixed-species RNA sequencing data.","authors":"Xuan Song, Hai Yun Gao, Karl Herrup, Ronald P Hart","doi":"10.1142/S0219720022500019","DOIUrl":"10.1142/S0219720022500019","url":null,"abstract":"Gene expression studies using xenograft transplants or co-culture systems, usually with mixed human and mouse cells, have proven to be valuable to uncover cellular dynamics during development or in disease models. However, the mRNA sequence similarities among species presents a challenge for accurate transcript quantification. To identify optimal strategies for analyzing mixed-species RNA sequencing data, we evaluate both alignment-dependent and alignment-independent methods. Alignment of reads to a pooled reference index is effective, particularly if optimal alignments are used to classify sequencing reads by species, which are re-aligned with individual genomes, generating [Formula: see text] accuracy across a range of species ratios. Alignment-independent methods, such as convolutional neural networks, which extract the conserved patterns of sequences from two species, classify RNA sequencing reads with over 85% accuracy. Importantly, both methods perform well with different ratios of human and mouse reads. While non-alignment strategies successfully partitioned reads by species, a more traditional approach of mixed-genome alignment followed by optimized separation of reads proved to be the more successful with lower error rates.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 2","pages":"2250001"},"PeriodicalIF":1.0,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9081140/pdf/nihms-1770823.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39792860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A protein succinylation sites prediction method based on the hybrid architecture of LSTM network and CNN. 基于LSTM网络和CNN混合结构的蛋白质琥珀酰化位点预测方法。

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-04-01 Epub Date: 2022-02-21 DOI: 10.1142/S0219720022500032

Die Zhang, Shunfang Wang

{"title":"A protein succinylation sites prediction method based on the hybrid architecture of LSTM network and CNN.","authors":"Die Zhang, Shunfang Wang","doi":"10.1142/S0219720022500032","DOIUrl":"https://doi.org/10.1142/S0219720022500032","url":null,"abstract":"The succinylation modification of protein participates in the regulation of a variety of cellular processes. Identification of modified substrates with precise sites is the basis for understanding the molecular mechanism and regulation of succinylation. In this work, we picked and chose five superior feature codes: CKSAAP, ACF, BLOSUM62, AAindex, and one-hot, according to their performance in the problem of succinylation sites prediction. Then, LSTM network and CNN were used to construct four models: LSTM-CNN, CNN-LSTM, LSTM, and CNN. The five selected features were, respectively, input into each of these four models for training to compare the four models. Based on the performance of each model, the optimal model among them was chosen to construct a hybrid model DeepSucc that was composed of five sub-modules for integrating heterogeneous information. Under the 10-fold cross-validation, the hybrid model DeepSucc achieves 86.26% accuracy, 84.94% specificity, 87.57% sensitivity, 0.9406 AUC, and 0.7254 MCC. When compared with other prediction tools using an independent test set, DeepSucc outperformed them in sensitivity and MCC. The datasets and source codes can be accessed at https://github.com/1835174863zd/DeepSucc.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 2","pages":"2250003"},"PeriodicalIF":1.0,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39942705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Tensor decomposition based on the potential low-rank and p-shrinkage generalized threshold algorithm for analyzing cancer multiomics data. 基于潜在低秩p缩广义阈值算法的张量分解癌症多组学数据分析。

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-04-01 Epub Date: 2022-02-21 DOI: 10.1142/S0219720022500020

Hang-Jin Yang, Yu-Xia Lei, Juan Wang, Xiang-Zhen Kong, Jin-Xing Liu, Ying-Lian Gao

{"title":"Tensor decomposition based on the potential low-rank and p-shrinkage generalized threshold algorithm for analyzing cancer multiomics data.","authors":"Hang-Jin Yang, Yu-Xia Lei, Juan Wang, Xiang-Zhen Kong, Jin-Xing Liu, Ying-Lian Gao","doi":"10.1142/S0219720022500020","DOIUrl":"https://doi.org/10.1142/S0219720022500020","url":null,"abstract":"Tensor Robust Principal Component Analysis (TRPCA) has achieved promising results in the analysis of genomics data. However, the TRPCA model under the existing tensor singular value decomposition ([Formula: see text]-SVD) framework insufficiently extracts the potential low-rank structure of the data, resulting in suboptimal restored components. Simultaneously, the tensor nuclear norm (TNN) defined based on [Formula: see text]-SVD uses the same standard to handle various singular values. TNN ignores the difference of singular values, leading to the failure of the main information that needs to be well preserved. To preserve the heterogeneous structure in the low-rank information, we propose a novel TNN and extend it to the TRPCA model. Potential low-rank space may contain important information. We learn the low-rank structural information from the core tensor. The singular value space contains the association information between genes and cancers. The [Formula: see text]-shrinkage generalized threshold function is utilized to preserve the low-rank properties of larger singular values. The optimization problem is solved by the alternating direction method of the multiplier (ADMM) algorithm. Clustering and feature selection experiments are performed on the TCGA data set. The experimental results show that the proposed model is more promising than other state-of-the-art tensor decomposition methods.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 2","pages":"2250002"},"PeriodicalIF":1.0,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39942706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

RNA modification writers influence tumor microenvironment in gastric cancer and prospects of targeted drug therapy RNA修饰对胃癌肿瘤微环境的影响及靶向药物治疗前景

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-03-14 DOI: 10.1142/S0219720022500044

P. Song, Sheng Zhou, Xiaoyang Qi, Y. Jiao, Y. Gong, Jie Zhao, Haojun Yang, Z. Qian, J. Qian, Liming Tang

{"title":"RNA modification writers influence tumor microenvironment in gastric cancer and prospects of targeted drug therapy","authors":"P. Song, Sheng Zhou, Xiaoyang Qi, Y. Jiao, Y. Gong, Jie Zhao, Haojun Yang, Z. Qian, J. Qian, Liming Tang","doi":"10.1142/S0219720022500044","DOIUrl":"https://doi.org/10.1142/S0219720022500044","url":null,"abstract":"Background: RNA adenosine modifications are crucial for regulating RNA levels. N6-methyladenosine (m6A), N1-methyladenosine (m1A), adenosine-to-inosine RNA editing, and alternative polyadenylation (APA) are four major RNA modification types. Methods: We evaluated the altered mRNA expression profiles of 27 RNA modification enzymes and compared the differences in tumor microenvironment (TME) and clinical prognosis between two RNA modification patterns using unsupervised clustering. Then, we constructed a scoring system, WM_score, and quantified the RNA modifications in patients of gastric cancer (GC), associating WM_score with TME, clinical outcomes, and effectiveness of targeted therapies. Results: RNA adenosine modifications strongly correlated with TME and could predict the degree of TME cell infiltration, genetic variation, and clinical prognosis. Two modification patterns were identified according to high and low WM_scores. Tumors in the WM_score-high subgroup were closely linked with survival advantage, CD4[Formula: see text] T-cell infiltration, high tumor mutation burden, and cell cycle signaling pathways, whereas those in the WM_score-low subgroup showed strong infiltration of inflammatory cells and poor survival. Regarding the immunotherapy response, a high WM_score showed a significant correlation with PD-L1 expression, predicting the effect of PD-L1 blockade therapy. Conclusion: The WM_scoring system could facilitate scoring and prediction of GC prognosis.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"1 1","pages":"2250004"},"PeriodicalIF":1.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48168063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A sequence-based two-layer predictor for identifying enhancers and their strength through enhanced feature extraction 一种基于序列的两层预测器，用于通过增强特征提取识别增强子及其强度

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-03-09 DOI: 10.1142/S0219720022500056

Santhosh Amilpur, Raju Bhukya

{"title":"A sequence-based two-layer predictor for identifying enhancers and their strength through enhanced feature extraction","authors":"Santhosh Amilpur, Raju Bhukya","doi":"10.1142/S0219720022500056","DOIUrl":"https://doi.org/10.1142/S0219720022500056","url":null,"abstract":"Enhancers are short regulatory DNA fragments that are bound with proteins called activators. They are free-bound and distant elements, which play a vital role in controlling gene expression. It is challenging to identify enhancers and their strength due to their dynamic nature. Although some machine learning methods exist to accelerate identification process, their prediction accuracy and efficiency will need more improvement. In this regard, we propose a two-layer prediction model with enhanced feature extraction strategy which does feature combination from improved position-specific amino acid propensity (PSTKNC) method along with Enhanced Nucleic Acid Composition (ENAC) and Composition of k-spaced Nucleic Acid Pairs (CKSNAP). The feature sets from all three feature extraction approaches were concatenated and then sent through a simple artificial neural network (ANN) to accurately identify enhancers in the first layer and their strength in the second layer. Experiments are conducted on benchmark chromatin nine cell lines dataset. A 10-fold cross validation method is employed to evaluate model's performance. The results show that the proposed model gives an outstanding performance with 94.50%, 0.8903 of accuracy and Matthew's correlation coefficient (MCC) in predicting enhancers and fairly does well with independent test also when compared with all other existing methods.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"1 1","pages":"2250005"},"PeriodicalIF":1.0,"publicationDate":"2022-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41464283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

Identification of cancer-related module in protein-protein interaction network based on gene prioritization. 基于基因优先级的蛋白质相互作用网络中癌症相关模块的鉴定。

IF 1 4区生物学

Journal of Bioinformatics and Computational Biology Pub Date : 2022-02-01 Epub Date: 2021-12-03 DOI: 10.1142/S0219720021500311

Jingli Wu, Qi Zhang, Gaoshi Li

{"title":"Identification of cancer-related module in protein-protein interaction network based on gene prioritization.","authors":"Jingli Wu, Qi Zhang, Gaoshi Li","doi":"10.1142/S0219720021500311","DOIUrl":"https://doi.org/10.1142/S0219720021500311","url":null,"abstract":"With the rapid development of deep sequencing technologies, a large amount of high-throughput data has been available for studying the carcinogenic mechanism at the molecular level. It has been widely accepted that the development and progression of cancer are regulated by modules/pathways rather than individual genes. The investigation of identifying cancer-related active modules has received an extensive attention. In this paper, we put forward an identification method ModFinder by integrating both biological networks and gene expression profiles. More concretely, a gene scoring function is devised by using the regression model with [Formula: see text]-step random walk kernel, and the genes are ranked according to both of their active scores and degrees in the PPI network. Then a greedy algorithm NSEA is introduced to find an active module with high score and strong connectivity. Experiments were performed on both simulated data and real biological one, i.e. breast cancer and cervical cancer. Compared with the previous methods SigMod, LEAN and RegMod, ModFinder shows competitive performance. It can successfully identify a well-connected module that contains a large proportion of cancer-related genes, including some well-known oncogenes or tumor suppressors enriched in cancer-related pathways.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 1","pages":"2150031"},"PeriodicalIF":1.0,"publicationDate":"2022-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39956506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0