Current Bioinformatics: Latest Articles (Page 8)

DMR_Kmeans: Identifying Differentially Methylated Regions Based on kmeans Clustering and Read Methylation Haplotype Filtering

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-10-06 DOI: 10.2174/0115748936245495230925112419

Xiaoqing Peng, Wanxin Cui, Xiangyan Kong, Yuannan Huang, Ji Li

{"title":"DMR_Kmeans: Identifying Differentially Methylated Regions Based on kmeans Clustering and Read Methylation Haplotype Filtering","authors":"Xiaoqing Peng, Wanxin Cui, Xiangyan Kong, Yuannan Huang, Ji Li","doi":"10.2174/0115748936245495230925112419","DOIUrl":"https://doi.org/10.2174/0115748936245495230925112419","url":null,"abstract":"Introduction: Differentially methylated regions (DMRs), including tissue-specific and disease-specific DMRs, can be used to reveal the mechanisms of gene regulation and to screen for diseases. Many methods have been proposed to detect DMRs from bisulfite sequencing data. These methods usually identify differentially methylated CpG sites and DMRs with statistical tests or distribution models, which neglect the joint methylation statuses provided in each read and result in inaccurate DMR boundaries. Objective: Make use of the joint methylation statuses provided in each read and predict accurate boundaries of DMRs. Methods: In this paper, a method named DMR_Kmeans is proposed to detect DMRs based on k-means clustering and read methylation haplotype filtering. In DMR_Kmeans, for each CpG site, the k-means algorithm is used to cluster the methylation levels from two groups, and the methylation difference of the CpG is measured based on the different distributions in clusters. Methylation haplotypes of reads are employed to extract the methylation patterns in a candidate region. Finally, DMRs are identified based on the methylation differences and the methylation patterns in candidate regions. Results: DMR_Kmeans was compared with eight DMR detection methods on whole-genome bisulfite sequencing data from six pairs of tissues. At a methylation-difference threshold greater than 0.4, DMR_Kmeans achieves higher Qn and Ql and overlaps more promoters than the other methods, which indicates that the DMRs it predicts have accurate boundaries and contain fewer CpGs with small methylation differences. Conclusion: Furthermore, it suggests that DMR_Kmeans can provide a DMR set with high quality for downstream analysis since th","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"300 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134945067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0
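
The per-CpG step described in the DMR_Kmeans abstract can be illustrated with a small sketch. It pools the methylation levels of one CpG site from two groups, clusters them with a one-dimensional 2-means, and scores the site by how differently the two groups fall into the clusters. The function names, the choice of k = 2, the min/max initialization, and the scoring rule are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import List, Tuple


def two_means_1d(values: List[float], iters: int = 50) -> Tuple[float, float]:
    """Lloyd's algorithm with k = 2 on scalar data; returns (low, high) centroids."""
    lo, hi = min(values), max(values)  # assumed initialization: the two extremes
    for _ in range(iters):
        # Assign each value to the nearer centroid (ties go to the low cluster).
        low_pts = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_pts = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(low_pts) / len(low_pts)
        new_hi = sum(high_pts) / len(high_pts) if high_pts else new_lo
        if (new_lo, new_hi) == (lo, hi):
            break  # assignments are stable
        lo, hi = new_lo, new_hi
    return lo, hi


def cluster_difference(group_a: List[float], group_b: List[float]) -> float:
    """Hypothetical score in [0, 1]: gap between the fractions of each group's
    samples that land in the high-methylation cluster."""
    lo, hi = two_means_1d(group_a + group_b)

    def frac_high(group: List[float]) -> float:
        return sum(abs(v - hi) < abs(v - lo) for v in group) / len(group)

    return abs(frac_high(group_a) - frac_high(group_b))
```

Under these assumptions, a CpG whose two groups separate cleanly into different clusters scores close to 1, while a CpG whose groups are spread identically across the clusters scores 0.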

A Deep Neural Network Model with Attribute Network Representation for lncRNA-Protein Interaction Prediction

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-10-06 DOI: 10.2174/0115748936267109230919104630

Meng-Meng Wei, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Lei Wang

{"title":"A Deep Neural Network Model with Attribute Network Representation for lncRNA-Protein Interaction Prediction","authors":"Meng-Meng Wei, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Lei Wang","doi":"10.2174/0115748936267109230919104630","DOIUrl":"https://doi.org/10.2174/0115748936267109230919104630","url":null,"abstract":"background: LncRNA is not only involved in regulating the biological functions of protein-coding genes; its dysfunction is also associated with the occurrence and progression of various diseases. A growing number of studies have shown that an in-depth understanding of the mechanism of action of lncRNA is of great significance for disease treatment. However, traditional wet-lab experiments are time-consuming, laborious, and expensive, and involve many subjective factors that may affect their accuracy. objective: Most methods for predicting lncRNA-protein interactions (LPIs) rely on a single feature or use noisy features. To solve this problem, we propose CSALPI, a computational model based on a deep neural network. method: First, the model uses cosine similarity to extract similarity features for lncRNA-lncRNA and protein-protein pairs, and denoises these similarity features with a sparse autoencoder. Second, a neighbor enhancement autoencoder is employed to enforce neighboring nodes to be represented in a similar way by reconstructing the denoised features. Finally, a Light Gradient Boosting Machine classifier is used to predict potential LPIs. result: To demonstrate the reliability of CSALPI, multiple evaluation metrics were computed under 5-fold cross-validation, and excellent results were achieved. In the case study, the model successfully predicted 7 out of 10 disease-associated lncRNA and protein pairs. 
conclusion: The CSALPI can be used as an effective complementary method for predicting potential LPIs from biological experiments.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"2010 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134944940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0
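
The first step named in the CSALPI abstract, extracting similarity features with cosine similarity, can be sketched as follows. The abstract does not say which vectors are compared, so the example assumes each lncRNA is described by its row in a binary lncRNA-protein interaction matrix; the toy matrix and the function names are made up for illustration.

```python
import math
from typing import List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention assumed here for all-zero profiles
    return dot / (norm_u * norm_v)


def similarity_matrix(profiles: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarity between all rows of `profiles`."""
    return [[cosine_similarity(p, q) for q in profiles] for p in profiles]


# Toy, made-up interaction profiles: rows are lncRNAs, columns are proteins.
toy_profiles = [
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]
lnc_sim = similarity_matrix(toy_profiles)
```

Transposing the same matrix and calling `similarity_matrix` again would give the protein-protein counterpart under the same assumption.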

MSSD: An Efficient Method for Constructing Accurate and Stable Phylogenetic Networks by Merging Subtrees of Equal Depth

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-10-04 DOI: 10.2174/0115748936256923230927081102

Jiajie Xing, Xu Song, Meiju Yu, Juan Wang, Jing Yu

{"title":"MSSD: An Efficient Method for Constructing Accurate and Stable Phylogenetic Networks by Merging Subtrees of Equal Depth","authors":"Jiajie Xing, Xu Song, Meiju Yu, Juan Wang, Jing Yu","doi":"10.2174/0115748936256923230927081102","DOIUrl":"https://doi.org/10.2174/0115748936256923230927081102","url":null,"abstract":"Background: Systematic phylogenetic networks are essential for studying the evolutionary relationships and diversity among species. These networks are particularly important for capturing non-tree-like processes resulting from reticulate evolutionary events. Phylogenetic trees can represent the evolutionary history of genes vertically, but the trees of different genes differ because of reticulate evolution events among species; phylogenetic networks can represent these reticulate processes and show the differences between rooted gene trees. However, existing methods for constructing phylogenetic networks are influenced by the order of the inputs, and different orders can lead to inconsistent experimental results. Moreover, constructing a network for large datasets is time-consuming, and the network often does not include all of the input tree nodes. Aims: This paper proposes a novel method, called MSSD, which constructs a phylogenetic network from gene trees by Merging Subtrees with the Same Depth in a bottom-up way. Methods: MSSD first decomposes trees into subtrees based on depth. It then merges subtrees of the same depth, from depth 0 to the maximum depth. For all subtrees of one depth, it inserts each subtree into the current networks by means of identical subtrees. Results: We test MSSD on simulated and real data. The experimental results show that the networks constructed by MSSD can represent all input trees and that MSSD is more stable than other methods. MSSD also constructs networks faster, and the constructed networks share more information with the input trees than those of other methods. Conclusion: MSSD is a powerful tool for studying the evolutionary relationships among species in biology and is freely available at https://github.com/xingjiajie2023/MSSD.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135647357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

iPSI(2L)-EDL: a Two-layer Predictor for Identifying Promoters and their Types based on Ensemble Deep Learning

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-10-02 DOI: 10.2174/0115748936264316230926073231

Xuan Xiao, Zaihao Hu, ZhenTao Luo, Zhaochun Xu

{"title":"iPSI(2L)-EDL: a Two-layer Predictor for Identifying Promoters and their Types based on Ensemble Deep Learning","authors":"Xuan Xiao, Zaihao Hu, ZhenTao Luo, Zhaochun Xu","doi":"10.2174/0115748936264316230926073231","DOIUrl":"https://doi.org/10.2174/0115748936264316230926073231","url":null,"abstract":"Abstract: Promoters are DNA fragments located near the transcription initiation site. They can be divided into strong and weak promoter types according to transcriptional activation and expression level. Identifying promoters and their strengths in DNA sequences is essential for understanding gene expression regulation, so it is crucial to further improve the predictive quality of predictors to meet real-world application requirements. Here, we constructed the latest training dataset based on the RegulonDB website; all promoters in this dataset have been experimentally validated, and their sequence similarity is less than 85%. We used one-hot encoding and nucleotide chemical property and density (NCPD) to represent DNA sequence samples. Additionally, we proposed an ensemble deep learning framework containing a multi-head attention module, a long short-term memory module, and a convolutional neural network module. The results showed that iPSI(2L)-EDL outperformed other existing methods for both promoter prediction and identification of strong and weak promoter types. On independent testing data, the AUC and MCC of iPSI(2L)-EDL in identifying promoters were improved by 2.23% and 2.96%, respectively, compared with those of PseDNC-DL, while the AUC and MCC of iPSI(2L)-EDL in predicting promoter strength type were increased by 3.74% and 5.86%, respectively. The results of ablation experiments indicate that the CNN plays a crucial role in recognizing promoters, and that the importance of different input positions and the long-range dependency relationships among features also help. Furthermore, to make it easier for most experimental scientists to get the results they need, a user-friendly web server has been established and can be accessed at http://47.94.248.117/IPSW(2L)-EDL.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135901216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0
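
The two sequence representations named in the iPSI(2L)-EDL abstract, one-hot and NCPD, can be sketched as below. The abstract does not spell out either encoding, so this follows the convention commonly used in the sequence-analysis literature and should be read as an assumption: three chemical-property bits per nucleotide (ring structure, functional group, hydrogen-bond strength) plus a density value equal to the fraction of positions up to and including the current one that hold the same nucleotide. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

# One-hot channels in the assumed order A, C, G, T.
ONE_HOT: Dict[str, Tuple[int, int, int, int]] = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}

# Assumed chemical-property bits: (purine ring, amino group, weak hydrogen bond).
NCP: Dict[str, Tuple[int, int, int]] = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "T": (0, 0, 1),
}


def one_hot_encode(seq: str) -> List[List[int]]:
    """One 4-dimensional indicator vector per nucleotide."""
    return [list(ONE_HOT[base]) for base in seq.upper()]


def ncpd_encode(seq: str) -> List[List[float]]:
    """Three property bits plus the running density of the current nucleotide."""
    counts: Dict[str, int] = {}
    rows: List[List[float]] = []
    for position, base in enumerate(seq.upper(), start=1):
        counts[base] = counts.get(base, 0) + 1
        density = counts[base] / position  # share of this base in the prefix
        rows.append([*NCP[base], density])
    return rows
```

Both functions raise `KeyError` on characters outside A, C, G, T; how ambiguous bases are handled in the published predictor is not stated in the abstract.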

Predicting the risk of breast cancer recurrence and metastasis based on miRNA expression

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-09-14 DOI: 10.2174/1574893618666230914105741

Yaping Lv, Yanfeng Wang, Yumeng Zhang, Shuzhen Chen, Yuhua Yao

{"title":"Predicting the risk of breast cancer recurrence and metastasis based on miRNA expression","authors":"Yaping Lv, Yanfeng Wang, Yumeng Zhang, Shuzhen Chen, Yuhua Yao","doi":"10.2174/1574893618666230914105741","DOIUrl":"https://doi.org/10.2174/1574893618666230914105741","url":null,"abstract":"Background: Even after surgery, breast cancer patients still suffer from recurrence and metastasis. Thus, it is critical to accurately predict the risk of recurrence and metastasis for individual patients, which can help determine the appropriate adjuvant therapy. Methods: The purpose of this study is to investigate and compare the performance of several categories of molecular biomarkers, i.e., microRNA (miRNA), long non-coding RNA (lncRNA), messenger RNA (mRNA), and copy number variation (CNV), in predicting the risk of breast cancer recurrence and metastasis. First, the molecular data (miRNA, lncRNA, mRNA, and CNV) of 483 breast cancer patients were downloaded from The Cancer Genome Atlas and randomly divided into training and test sets at a ratio of 7:3. Second, feature selection was performed on the training set by univariate Cox and multivariate Cox variance analysis (e.g., 15 miRNAs). Using the selected features, a random forest classifier and several other classification methods were trained on the recurrence and metastasis labels. Finally, the performances of the classification models were compared and evaluated on the test set. Results: The area under the ROC curve was 0.70 for miRNA, better than those using other biomarkers. 
Conclusion: These results indicated that miRNA has important guiding significance in predicting recurrence and metastasis of breast cancer.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134969906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Investigation of LncRNAs Expression as a Potential Biomarker in the Diagnosis and Treatment of Human Brucellosis

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-09-14 DOI: 10.2174/1574893618666230914160213

Mansoor Kodori, Mohammad Abavisani, Hadis Fathizadeh, Mansoor Khaledi, Mohammad Hossein Haddadi, Shahrbanoo Keshavarz Aziziraftar, Foroogh Neamati, Amirhossein Sahebkar

{"title":"Investigation of LncRNAs Expression as a Potential Biomarker in the Diagnosis and Treatment of Human Brucellosis","authors":"Mansoor Kodori, Mohammad Abavisani, Hadis Fathizadeh, Mansoor Khaledi, Mohammad Hossein Haddadi, Shahrbanoo Keshavarz Aziziraftar, Foroogh Neamati, Amirhossein Sahebkar","doi":"10.2174/1574893618666230914160213","DOIUrl":"https://doi.org/10.2174/1574893618666230914160213","url":null,"abstract":"Abstract: Long non-coding RNAs (LncRNAs) are significant contributors to bacterial infections and host defense responses, presenting a novel class of gene regulators beyond conventional protein-coding genes. This narrative review aimed to explore the involvement of LncRNAs as a potential biomarker in the diagnosis and treatment of bacterial infections, with a specific focus on Brucella infections. A comprehensive literature review was conducted to identify relevant studies examining the roles of LncRNAs in immune responses during bacterial infections, with a specific emphasis on Brucella infections. PubMed, Scopus and other major scientific databases were searched using relevant keywords. LncRNAs crucially regulate immune responses to bacterial infections, influencing transcription factors, pro-inflammatory cytokines, and immune cell behavior, with both positive and negative effects. The NF-κB pathway is a key regulator for many LncRNAs in bacterial infections. During Brucella infections, essential LncRNAs activate the innate immune response, increasing proinflammatory cytokine production and immune cell differentiation. LncRNAs are associated with human brucellosis, holding promise for screening, diagnostics, or therapeutics. Further research is needed to fully understand LncRNAs' precise functions in Brucella infection and pathogenesis. Specific LncRNAs, like IFNG-AS1 and NLRP3, are upregulated during brucellosis, while others, such as Gm28309, are downregulated, influencing immunosuppression and bacterial survival. 
Investigating the prognostic and therapeutic potential of Brucella-related LncRNAs warrants ongoing investigation, including their roles in other immune cells like macrophages, dendritic cells, and neutrophils responsible for bacterial clearance. Unraveling the intricate relationship between LncRNAs and brucellosis may reveal novel regulatory mechanisms and LncRNAs' roles in infection regulation, expediting diagnostics and enhancing therapeutic strategies against Brucella infections.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134969910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Thorough Assessment of Machine Learning Techniques for Predicting Protein-Nucleic Acid Binding Hot Spots

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-09-13 DOI: 10.2174/1574893618666230913090436

Xianzhe Zou, Chen Zhang, Mingyan Tang, Lei Deng

{"title":"Thorough Assessment of Machine Learning Techniques for Predicting Protein-Nucleic Acid Binding Hot Spots","authors":"Xianzhe Zou, Chen Zhang, Mingyan Tang, Lei Deng","doi":"10.2174/1574893618666230913090436","DOIUrl":"https://doi.org/10.2174/1574893618666230913090436","url":null,"abstract":"Background: Proteins and nucleic acids are vital biomolecules that contribute significantly to biological life. The precise and efficient identification of hot spots at protein-nucleic acid interfaces is crucial for guiding drug development, advancing protein engineering, and exploring the underlying molecular recognition mechanisms. As experimental methods like alanine scanning mutagenesis prove to be time-consuming and expensive, a growing number of machine learning techniques are being employed to predict hot spots. However, existing approaches are marked by a lack of uniform standards, a scarcity of data, and a wide range of attributes, and there is currently no comprehensive overview or evaluation of this field. A full overview and review is therefore valuable. Methods: In this study, we present an overview of cutting-edge machine learning approaches utilized for hot spot prediction in protein-nucleic acid complexes. Additionally, we outline the feature categories currently in use, derived from relevant biological data sources, and assess conventional feature selection methods based on 600 extracted features. We also create two new benchmark datasets, PDHS87 and PRHS48, and develop distinct binary classification models based on these datasets to evaluate the advantages and disadvantages of various machine learning techniques. Results: Prediction of protein-nucleic acid interaction hot spots is a challenging task. The study demonstrates that structural neighborhood features play a crucial role in identifying hot spots. 
The prediction performance can be improved by choosing effective feature selection methods and machine learning methods. Among the existing prediction methods, XGBPRH has the best performance. Conclusion: It is crucial to continue studying hot spot theories, discover new and effective features, add accurate experimental data, and utilize DNA/RNA information. Semi-supervised learning, transfer learning, and ensemble learning can optimize predictive ability. Combining computational docking with machine learning methods can potentially further improve predictive performance.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135784172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Drug-Target Interaction Prediction By Combining Transformer and Graph Neural Networks

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-09-12 DOI: 10.2174/1574893618666230912141426

Junkai Liu, Yaoyao Lu, Shixuan Guan, Tengsheng Jiang, Yijie Ding, Qiming Fu, Zhiming Cui, Hongjie Wu

{"title":"Drug-Target Interaction Prediction By Combining Transformer and Graph Neural Networks","authors":"Junkai Liu, Yaoyao Lu, Shixuan Guan, Tengsheng Jiang, Yijie Ding, Qiming Fu, Zhiming Cui, Hongjie Wu","doi":"10.2174/1574893618666230912141426","DOIUrl":"https://doi.org/10.2174/1574893618666230912141426","url":null,"abstract":"Background: The prediction of drug-target interactions (DTIs) plays an essential role in drug discovery. Recently, deep learning methods have been widely applied in DTI prediction. However, most of the existing research does not fully utilize the molecular structures of drug compounds and the sequence structures of proteins, which makes these models unable to obtain precise and effective feature representations. Methods: In this study, we propose a novel deep learning framework combining transformer and graph neural networks for predicting DTIs. Our model utilizes graph convolutional neural networks to capture the global and local structure information of drugs, and convolutional neural networks are employed to capture the sequence feature of targets. In addition, the obtained drug and protein representations are input to multi-layer transformer encoders, respectively, to integrate their features and generate final representations. Results: The experiments on benchmark datasets demonstrated that our model outperforms previous graph-based and transformer-based methods, with 1.5% and 1.8% improvement in precision and 0.2% and 1.0% improvement in recall, respectively. The results indicate that the transformer encoders effectively extract feature information of both drug compounds and proteins. 
Conclusion: Overall, our proposed method validates the applicability of combining graph neural networks and transformer architecture in drug discovery, and due to the attention mechanisms, it can extract deep structure feature data of drugs and proteins.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135885613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Prediction of Plant Ubiquitylation Proteins and Sites by Fusing Multiple Features

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-09-08 DOI: 10.2174/1574893618666230908092847

Meng-Yue Guan, Wang-Ren Qiu, Qian-Kun Wang, Xuan Xiao

{"title":"Prediction of Plant Ubiquitylation Proteins and Sites by Fusing Multiple Features","authors":"Meng-Yue Guan, Wang-Ren Qiu, Qian-Kun Wang, Xuan Xiao","doi":"10.2174/1574893618666230908092847","DOIUrl":"https://doi.org/10.2174/1574893618666230908092847","url":null,"abstract":"Introduction: Protein ubiquitylation is an important post-translational modification (PTM), which is considered to be one of the most important processes regulating cell function and various diseases. Therefore, accurate prediction of ubiquitylation proteins and their PTM sites is of great significance for the study of basic biological processes and the development of related drugs. Researchers have developed some large-scale computational methods to predict ubiquitylation sites, but there is still much room for improvement. Much ubiquitylation research is cross-species, yet life patterns are diverse, and prediction methods tend to be species-specific in practical application. This study focuses on plants and constructs computational methods for identifying ubiquitylation proteins and ubiquitylation sites. Method: In this work, we constructed two predictive models to identify plant ubiquitylation proteins and sites. First, in the ubiquitylation protein prediction model, to better reflect protein sequence information and obtain better prediction results, features are extracted with a KNN scoring matrix model based on functional domain Gene Ontology (GO) annotation and with word embedding models, i.e., Skip-Gram and Continuous Bag of Words (CBOW), and the light gradient boosting machine (LGBM) is selected as the ubiquitylation protein prediction engine. Results: Accuracy (ACC), Precision, Recall, F1_score and AUC are 85.12%, 80.96%, 72.80%, 76.37% and 0.9193, respectively, in 10-fold cross-validation on the independent dataset. In the ubiquitylation site prediction model, Skip-Gram, CBOW and enhanced amino acid composition (EAAC) feature extraction codes were used to extract protein sequence fragment features, and the predicted results on training and independent test data have also achieved good performance. Conclusion: In summary, the comparison results demonstrate that our models have a clear advantage in predicting ubiquitylation proteins and sites, and they may provide useful insights for studying the mechanisms and modulation of ubiquitination pathways.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136361817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 1
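
The threshold-based metrics reported in the ubiquitylation abstract (ACC, Precision, Recall, F1_score) all derive from the four counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name is hypothetical and the counts in the usage line are made up, so it does not reproduce the published figures.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts, for illustration only.
example = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

AUC is not included because it is computed from ranked prediction scores rather than from a single set of counts.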

Identification and Functional Prediction of lncRNAs using Bioinformatic Techniques

Zone 3, Biology

Current Bioinformatics Pub Date : 2023-09-07 DOI: 10.2174/1574893618666230907165829

Shizuka Uchida

Citations: 0