Evolutionary Bioinformatics: Latest Articles (Page 5)

Stability of scRNA-Seq Analysis Workflows is Susceptible to Preprocessing and is Mitigated by Regularized or Supervised Approaches.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2022-01-01 DOI: 10.1177/11769343221123050

Arda Durmaz, Jacob G Scott

Abstract: Background: Statistical methods developed to address various questions in single-cell datasets show increased variability across different parameter regimes. To further delineate the robustness of commonly used methods for single-cell RNA-Seq, we aimed to comprehensively review scRNA-Seq analysis workflows in the setting of dimension reduction, clustering, and trajectory inference. Methods: We used datasets with temporal single-cell transcriptomics profiles from public repositories. Combining multiple methods at each level of the workflow, we performed over 6,000 analyses and evaluated the results of clustering and pseudotime estimation using the adjusted Rand index and rank correlation metrics. We further integrated neural network methods to assess whether models with increased complexity can show an increased bias/variance trade-off. Results: Combinatorial workflows showed that non-linear dimension reduction techniques such as t-SNE and UMAP are sensitive to initial preprocessing steps; hence clustering results on the dimension-reduced space of single-cell datasets should be used carefully. Similarly, pseudotime estimation methods that depend on previous non-linear dimension reduction steps can result in highly variable trajectories. In contrast, methods that avoid non-linearity, such as WOT, can result in repeatable inferences of temporal gene expression dynamics. Furthermore, imputation methods do not substantially improve clustering or trajectory inference results in terms of repeatability. In contrast, the selection of the normalization method shows an increased effect on downstream analysis, where ScTransform reduces variability overall.

Evolutionary Bioinformatics, vol. 18, article 11769343221123050. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/07/96/10.1177_11769343221123050.PMC9527995.pdf

Citations: 1
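
The abstract above scores clustering repeatability with the adjusted Rand index. As a minimal sketch of that metric (not the authors' code; the function name and the handling of the degenerate case are assumptions made here), it can be computed from the contingency counts of two labelings of the same cells:

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    # Contingency counts: number of cells sharing each (cluster_a, cluster_b) pair.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    total = comb(n, 2)
    # Expected agreement under random labelings with the same cluster sizes.
    expected = sum_a * sum_b / total if total else 0.0
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings place all cells in one cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)
```

A value of 1 means two workflow variants produced the same partition up to relabeling, while values near 0 indicate agreement no better than chance.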

Evolutionary Dynamics of Indels in SARS-CoV-2 Spike Glycoprotein.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-12-06 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211064616

R Shyama Prasad Rao, Nagib Ahsan, Chunhui Xu, Lingtao Su, Jacob Verburgt, Luca Fornelli, Daisuke Kihara, Dong Xu

Abstract: SARS-CoV-2, responsible for the current COVID-19 pandemic that has claimed over 5.0 million lives, belongs to a class of enveloped viruses that undergo quick evolutionary adjustments under selection pressure. Numerous variants have emerged in SARS-CoV-2, posing a serious challenge to the global vaccination effort and COVID-19 management. The evolutionary dynamics of this virus are only beginning to be explored. In this work, we analysed 1.79 million spike glycoprotein sequences of SARS-CoV-2 and found that the virus is fine-tuning the spike with numerous amino acid insertions and deletions (indels). Indels seem to have a selective advantage, as the proportion of sequences with indels steadily increased over time, currently at over 89%, with similar trends across countries and variants. There were as many as 420 unique indel positions and 447 unique combinations of indels. Despite their high frequency, indels resulted in only minimal alteration of N-glycosylation sites, including both gain and loss. As indels and point mutations are positively correlated and sequences with indels have significantly more point mutations, they have implications for the evolutionary dynamics of the SARS-CoV-2 spike glycoprotein.

Evolutionary Bioinformatics, vol. 17, article 11769343211064616. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/18/95/10.1177_11769343211064616.PMC8655444.pdf

Citations: 0
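
To make the indel counts in the abstract above concrete, here is a hedged sketch of how insertions and deletions might be read off a pairwise alignment of a spike sequence against a reference. The gap character `-`, the function names, and the position conventions are assumptions for illustration; the abstract does not describe the authors' pipeline.

```python
def indel_events(ref_aligned, query_aligned):
    """Return (kind, ref_position, length) for each gap run in a pairwise alignment.

    Deletions report the 1-based position of the first deleted reference residue.
    Insertions report the number of reference residues preceding the insertion.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    events = []
    ref_pos = 0      # reference residues consumed so far
    current = None   # open gap run as [kind, ref_position, length]
    for r, q in zip(ref_aligned, query_aligned):
        if r == "-" and q != "-":
            kind = "insertion"
        elif q == "-" and r != "-":
            kind = "deletion"
        else:
            kind = None
        if kind is None:
            current = None
        elif current is not None and current[0] == kind:
            current[2] += 1  # extend the open gap run
        else:
            start = ref_pos + 1 if kind == "deletion" else ref_pos
            current = [kind, start, 1]
            events.append(current)
        if r != "-":
            ref_pos += 1
    return [tuple(e) for e in events]


def fraction_with_indels(aligned_pairs):
    """Share of (reference, query) alignments that contain at least one indel."""
    flagged = sum(1 for ref, qry in aligned_pairs if indel_events(ref, qry))
    return flagged / len(aligned_pairs)
```

Tracking `fraction_with_indels` per collection month would give a time series comparable in spirit to the rising proportion the abstract reports.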

The Power of Universal Contextualized Protein Embeddings in Cross-species Protein Function Prediction.

IF 1.7 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-12-03 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211062608

Irene van den Bent, Stavros Makrodimitris, Marcel Reinders

Abstract: Computationally annotating proteins with a molecular function is a difficult problem that is made even harder by the limited amount of labeled protein training data available. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporate contextual information about amino acids, thereby modeling the underlying principles of protein sequences independently of species. We took an existing pre-trained protein embedding method and characterized its molecular function prediction performance in detail, first to advance the understanding of protein language models, and second to determine areas of improvement. Then, we applied the model in a transfer learning task by training a function predictor on the embeddings of annotated protein sequences from one training species and making predictions on the proteins of several test species at varying evolutionary distances. We show that this approach successfully generalizes knowledge about protein function from one eukaryotic species to various other species, outperforming both an alignment-based and a supervised-learning-based baseline. This implies that such a method could be effective for molecular function prediction in inadequately annotated species from understudied taxonomic kingdoms.

Evolutionary Bioinformatics, vol. 17, article 11769343211062608. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8647222/pdf/

Citations: 0

Screening of Important Factors in the Early Sepsis Stage Based on the Evaluation of ssGSEA Algorithm and ceRNA Regulatory Network.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-11-26 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211058463

Liou Huang, Chunrong Wu, Dan Xu, Yuhui Cui, Jianguo Tang

Abstract: Background: Sepsis is a dysregulated host response to pathogens. Delay in sepsis diagnosis has become a primary cause of patient death. This study identifies factors that may help prevent septic shock in its early stage, contributing to the early treatment of sepsis. Methods: The sequencing data (RNA- and miRNA-sequencing) of patients with septic shock were obtained from the NCBI GEO database. After re-annotation, we obtained lncRNA, miRNA, and mRNA information. Then, we evaluated the immune characteristics of the samples using the ssGSEA algorithm. We used the WGCNA algorithm to obtain genes significantly related to immunity and screened for important related factors by constructing a ceRNA regulatory network. Results: After re-annotation, we obtained 1,708 lncRNAs, 129 miRNAs, and 17,326 mRNAs. Through the ssGSEA algorithm, we also identified 5 important immune cells. Finally, we constructed a ceRNA regulatory network associated with septic shock (SS) pathways. Conclusion: We identified 5 immune cells with significant changes in the early stage of septic shock. We also constructed a ceRNA network, which will help us explore the pathogenesis of septic shock.

Evolutionary Bioinformatics, vol. 17, article 11769343211058463. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/ac/ad/10.1177_11769343211058463.PMC8637398.pdf

Citations: 20

AURKB as a Promising Prognostic Biomarker in Hepatocellular Carcinoma.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-11-24 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211057589

Jingchuan Xiao, Yingai Zhang

Abstract: The Aurora kinases form a family of 3 genes encoding serine/threonine kinases and are involved in the regulation of cell division during mitosis. This study was designed to investigate the prognostic role of Aurora kinases in hepatocellular carcinoma (HCC). We analyzed the expression, overall survival (OS) data, promoter methylation level, and relationship with immunoinhibitors of Aurora kinases in patients with HCC from the GEPIA2, UALCAN, OncoLnc, and TISIDB databases. Protein-protein interaction (PPI) network, gene ontology, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome pathway analyses were performed using the STRING database and Cytoscape software. We found that the mRNA expression, stages of HCC, and OS of AURKA and AURKB in HCC tissues were significantly different from control tissues, but there were significant inconsistencies in promoter methylation level and relationship with immunoinhibitors between AURKA and AURKB. None of the above items were significantly different for AURKC. Furthermore, a hub module including AURKA, AURKB, and AURKC was identified within the PPI network constructed with the Molecular Complex Detection (MCODE) plug-in in Cytoscape software. Our results show that AURKB could be a potential biomarker for HCC prognosis.

Evolutionary Bioinformatics, vol. 17, article 11769343211057589. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/20/c3/10.1177_11769343211057589.PMC8637395.pdf

Citations: 5

Comparison of the Gut Microbiota in Patients with Benign and Malignant Breast Tumors: A Pilot Study.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-11-13 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211057573

Peidong Yang, Zhitang Wang, Qingqin Peng, Weibin Lian, Debo Chen

Abstract: The microbiome plays diverse roles in many diseases and can potentially contribute to cancer development. Breast cancer is the most commonly diagnosed cancer in women worldwide. Thus, we investigated whether the gut microbiota differs between patients with breast carcinoma and those with benign tumors. The DNA of the fecal microbiota community was profiled by Illumina sequencing and taxonomic assignment of 16S rRNA genes. α-diversity and β-diversity analyses were used to determine the richness and evenness of the gut microbiota. Gene function prediction of the microbiota in patients with benign and malignant tumors was performed using PICRUSt. There was no significant difference in α-diversity between patients with benign and malignant tumors (P = 0.315 for the Chao index and P = 0.31 for the ACE index). The microbiota composition was different between the 2 groups, although no statistical difference was observed in β-diversity. Of the 31 different genera compared between the 2 groups, only the level of Citrobacter was significantly higher in the malignant tumor group than in the benign tumor group. The metabolic pathways of the gut microbiome in the malignant tumor group were significantly different from those in the benign tumor group. Furthermore, the study establishes the distinct richness of the gut microbiome in patients with breast cancer with different clinicopathological factors, including ER, PR, Ki-67 level, Her2 status, and tumor grade. These findings suggest that the gut microbiome may be useful for the diagnosis and treatment of malignant breast carcinoma.

Evolutionary Bioinformatics, vol. 17, article 11769343211057573. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/b1/42/10.1177_11769343211057573.PMC8593289.pdf

Citations: 0
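
The Chao index cited in the abstract above is a richness estimator built from the numbers of rare taxa. A minimal sketch follows, assuming the bias-corrected Chao1 form (the abstract does not say which variant was used) and a hypothetical list of per-taxon read counts for one sample:

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon read counts."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)                             # taxa actually seen
    singletons = sum(1 for c in observed if c == 1)   # taxa seen exactly once
    doubletons = sum(1 for c in observed if c == 2)   # taxa seen exactly twice
    # Unseen taxa are extrapolated from the singleton/doubleton balance.
    return s_obs + singletons * (singletons - 1) / (2 * (doubletons + 1))
```

The bias-corrected form stays defined when a sample has no doubletons, which is why it is used in this sketch.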

Exploration of Prognostic Biomarkers of Muscle-Invasive Bladder Cancer (MIBC) by Bioinformatics.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-10-28 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211049270

Xianglai Xu, Yelin Wang, Sihong Zhang, Yanjun Zhu, Jiajun Wang

Abstract: We aimed to discover prognostic factors of muscle-invasive bladder cancer (MIBC) and investigate their relationship with immune therapies. Online data on MIBC were obtained from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) database. Weighted gene co-expression network analysis (WGCNA) and univariate Cox analysis were applied to classify genes into different groups. A Venn diagram was used to find the intersection of genes, and prognostic efficacy was assessed by Kaplan-Meier analysis. A heatmap was used for differential analysis. A risk score (RS) was calculated from multivariate Cox analysis and evaluated by the receiver operating characteristic (ROC) curve. MIBC samples from TCGA and GEO were analyzed by WGCNA and univariate Cox analysis and intersected at 4 genes: CLK4, DEDD2, ENO1, and SYTL1. Higher SYTL1 and DEDD2 expression was significantly correlated with high tumor grade. The gene-based risk score showed good prognostic efficiency in predicting overall survival (OS), disease-specific survival (DSS), and progression-free interval (PFI) in the TCGA dataset (P < .001). The area under the ROC curve (AUC) of the RS reached 0.671 for predicting 1-year survival and 0.653 for 3-year survival. KEGG pathway enrichment identified 5 enriched pathways. xCell analysis showed increased CD4+ Th2 T cell, macrophage, M1 macrophage, and M2 macrophage infiltration in high-RS samples (P < .001). In the immune checkpoint analysis, PD-L1 expression was significantly higher in patients with high RS. We have, therefore, constructed the RS as a prognostic index for MIBC patients and found potential targeted pathways.

Evolutionary Bioinformatics, vol. 17, article 11769343211049270. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/d8/07/10.1177_11769343211049270.PMC8558584.pdf

Citations: 3
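
A risk score derived from a multivariate Cox model, as in the abstract above, is typically a weighted sum of gene expression values, with the fitted Cox coefficients as weights. The sketch below illustrates only that arithmetic. The coefficient values are invented for illustration (the abstract does not report them), and the median split is one common way to define high-RS and low-RS groups, not necessarily the authors' choice.

```python
from statistics import median

# Hypothetical coefficients for illustration only; not taken from the paper.
EXAMPLE_COEFFICIENTS = {"CLK4": -0.2, "DEDD2": 0.3, "ENO1": 0.1, "SYTL1": 0.4}


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(beta * expression[gene] for gene, beta in coefficients.items())


def split_by_median(scores):
    """Label each sample 'high' if its score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {sample: ("high" if s > cutoff else "low") for sample, s in scores.items()}
```

Each patient's expression profile maps to one score, and the resulting groups are what survival curves and checkpoint expression would then be compared across.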

Identification of Conserved Pappalysin 1-Derived Circular RNA-Mediated Competing Endogenous RNA in Osteosarcoma.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-10-21 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211041379

Guang-Fu Ming, Bo-Hua Gao, Peng Chen

Abstract: The etiology of osteosarcoma (OS) is complex and not yet fully understood. This study aimed to identify the miRNAs, circRNAs, and genes (mRNAs) that are differentially expressed in OS cell lines in order to investigate the mechanism of circRNA-associated competing endogenous RNAs (ceRNAs) in OS. Microarray datasets reporting mRNA (GSE70414), miRNA (GSE70367), and circRNA (GSE96964) changes in human OS cell lines were downloaded, differentially expressed (DE) RNAs were identified, and DEmRNAs were used for the annotation of Gene Ontology (GO) biological processes (BP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The mechanisms of DEcircRNA-mediated ceRNAs were identified in a step-by-step process. A total of 326 DEmRNAs, 45 DEmiRNAs, and 110 DEcircRNAs were identified from the 3 datasets. The DEmRNAs were associated with GO BP terms, including cholesterol biosynthetic process, angiogenesis, and extracellular matrix organization, and with KEGG pathways, including the p53 signaling pathway and biosynthesis of antibiotics. The final ceRNA network consisted of 8 DEcircRNAs, including 5 pappalysin 1 (PAPPA)-derived DEcircRNAs (hsa_circ_0005456, hsa_circ_0088209, hsa_circ_0002052, hsa_circ_0088214, and hsa_circ_0008792, all downregulated), 3 DEmiRNAs (hsa-miR-760, hsa-miR-4665-5p, and hsa-miR-4539, all upregulated), and downregulated genes (including MMP13 and HMOX1). The ceRNA regulatory network of OS that was built may play important roles in the pathogenesis of OS and might be of great importance in therapy.

Evolutionary Bioinformatics, vol. 17, article 11769343211041379. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/4a/09/10.1177_11769343211041379.PMC8544760.pdf

Citations: 0

Sequence-Based Prediction of Plant Protein-Protein Interactions by Combining Discrete Sine Transformation With Rotation Forest.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-10-12 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211050067

Jie Pan, Li-Ping Li, Chang-Qing Yu, Zhu-Hong You, Yong-Jian Guan, Zhong-Hao Ren

Abstract: Protein-protein interactions (PPIs) in plants are essential for understanding the regulation of biological processes. Although high-throughput technologies have been widely used to identify PPIs, they are usually laborious and expensive, and they suffer from high false-positive rates. Therefore, it is imperative to develop novel computational approaches as a supplementary tool to detect PPIs in plants. In this work, we present a method, DST-RoF, to identify PPIs in plants by combining the ensemble learning classifier Rotation Forest (RoF) with the discrete sine transformation (DST). Specifically, each plant protein sequence is first converted into a Position-Specific Scoring Matrix (PSSM). Then, the discrete sine transformation is employed to extract effective features that capture the evolutionary information of proteins. Finally, these features are fed into the RoF classifier for training and prediction. On the plant datasets Arabidopsis, Rice, and Maize, DST-RoF yielded high prediction accuracies of 82.95%, 88.82%, and 93.70%, respectively. To further evaluate the prediction ability of our approach, we compared it with 4 state-of-the-art classifiers and 3 different feature extraction methods. Comprehensive experimental results indicated that our method is feasible and robust for predicting potentially interacting plant protein pairs.

Evolutionary Bioinformatics, vol. 17, article 11769343211050067. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/b4/46/10.1177_11769343211050067.PMC8521741.pdf

Citations: 4
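
The feature-extraction step in the abstract above applies a discrete sine transformation to a PSSM. As a rough sketch, the code below implements the unnormalized type-I DST and applies it to each PSSM column. The DST variant, the column-wise application, and the `keep` truncation are all assumptions made here; the abstract does not specify them.

```python
import math


def dst1(values):
    """Unnormalized type-I discrete sine transform of a sequence of numbers."""
    n = len(values)
    return [
        sum(x * math.sin(math.pi * (i + 1) * (k + 1) / (n + 1))
            for i, x in enumerate(values))
        for k in range(n)
    ]


def dst_features(pssm, keep):
    """Concatenate the first `keep` DST coefficients of every PSSM column.

    `pssm` is a list of rows (one per residue), each with one score per amino acid.
    """
    features = []
    for column in zip(*pssm):
        features.extend(dst1(list(column))[:keep])
    return features
```

Truncating to the leading coefficients yields a fixed-length vector regardless of protein length, which is what a downstream classifier such as Rotation Forest would need.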

Compelling Evidence Suggesting the Codon Usage of SARS-CoV-2 Adapts to Human After the Split From RaTG13.

IF 2.6 | Zone 4, Biology

Evolutionary Bioinformatics Pub Date : 2021-10-08 eCollection Date: 2021-01-01 DOI: 10.1177/11769343211052013

Yanping Zhang, Xiaojie Jin, Haiyan Wang, Yaoyao Miao, Xiaoping Yang, Wenqing Jiang, Bin Yin

Abstract: SARS-CoV-2 needs to make efficient use of host resources in order to survive and propagate. Among the multiple layers of the regulatory network, mRNA translation is the rate-limiting step in gene expression. Synonymous codon usage usually conforms with tRNA concentration to allow fast decoding during translation. It is acknowledged that SARS-CoV-2 has adapted to the codon usage of human lungs so that the virus can rapidly proliferate in the lung environment. While this notion seems to nicely explain the adaptation of SARS-CoV-2 to lungs, it cannot explain why other viruses do not have this advantage. In this study, we retrieved the GTEx RNA-seq data for 30 tissues (from over 17,000 individuals). We calculated the RSCU (relative synonymous codon usage) weighted by gene expression in each human sample, and investigated the correlation of RSCU between the human tissues and SARS-CoV-2 or RaTG13 (the closest coronavirus to SARS-CoV-2). Lung has the highest correlation of RSCU with SARS-CoV-2 among all tissues, suggesting that the lung environment is generally suitable for SARS-CoV-2. Interestingly, for most tissues, SARS-CoV-2 has higher correlations with the human samples than RaTG13 does. This difference is most significant for lungs. In conclusion, the codon usage of SARS-CoV-2 has adapted to human lungs to allow fast decoding and translation. This adaptation probably took place after SARS-CoV-2 split from RaTG13, because RaTG13 is less well correlated with humans. This finding depicts the trajectory of adaptive evolution from the ancestral sequence to SARS-CoV-2, and also explains why SARS-CoV-2, rather than other viruses, could adapt so well to the human lung environment.

Evolutionary Bioinformatics, vol. 17, article 11769343211052013. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/5c/4a/10.1177_11769343211052013.PMC8504689.pdf

Citations: 0
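
The RSCU statistic in the abstract above compares each codon's observed count with the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes unweighted RSCU for a single coding sequence using the standard genetic code. The expression weighting described in the abstract is not reproduced, and the function and variable names are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(coding_sequence):
    """Relative synonymous codon usage for every codon of each amino acid present."""
    seq = coding_sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":  # stop codons are excluded
            families[amino_acid].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(family)  # count under equal synonymous usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values
```

Correlating the RSCU vector of a viral genome with that of a host tissue, for example by rank correlation over the shared codons, would then give the kind of comparison the abstract describes.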