{"title":"The Power of Universal Contextualized Protein Embeddings in Cross-species Protein Function Prediction.","authors":"Irene van den Bent, Stavros Makrodimitris, Marcel Reinders","doi":"10.1177/11769343211062608","DOIUrl":"10.1177/11769343211062608","url":null,"abstract":"<p><p>Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labeled protein training data. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporate contextual information of amino acids, thereby modeling the underlying principles of protein sequences insensitive to the context of species. We used an existing pre-trained protein embedding method and subjected its molecular function prediction performance to detailed characterization, first to advance the understanding of protein language models, and second to determine areas of improvement. Then, we applied the model in a transfer learning task by training a function predictor based on the embeddings of annotated protein sequences of one training species and making predictions on the proteins of several test species with varying evolutionary distance. We show that this approach successfully generalizes knowledge about protein function from one eukaryotic species to various other species, outperforming both an alignment-based and a supervised-learning-based baseline. 
This implies that such a method could be effective for molecular function prediction in inadequately annotated species from understudied taxonomic kingdoms.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211062608"},"PeriodicalIF":1.7,"publicationDate":"2021-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8647222/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39957598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Screening of Important Factors in the Early Sepsis Stage Based on the Evaluation of ssGSEA Algorithm and ceRNA Regulatory Network.","authors":"Liou Huang, Chunrong Wu, Dan Xu, Yuhui Cui, Jianguo Tang","doi":"10.1177/11769343211058463","DOIUrl":"https://doi.org/10.1177/11769343211058463","url":null,"abstract":"<p><strong>Background: </strong>Sepsis is a dysregulated host response to pathogens. Delay in sepsis diagnosis has become a primary cause of patient death. This study determines some factors to prevent septic shock in its early stage, contributing to the early treatment of sepsis.</p><p><strong>Methods: </strong>The sequencing data (RNA- and miRNA-sequencing) of patients with septic shock were obtained from the NCBI GEO database. After re-annotation, we obtained lncRNAs, miRNA, and mRNA information. Then, we evaluated the immune characteristics of the sample based on the ssGSEA algorithm. We used the WGCNA algorithm to obtain genes significantly related to immunity and screen for important related factors by constructing a ceRNA regulatory network.</p><p><strong>Result: </strong>After re-annotation, we obtained 1708 lncRNAs, 129 miRNAs, and 17 326 mRNAs. Also, through the ssGSEA algorithm, we obtained 5 important immune cells. Finally, we constructed a ceRNA regulation network associated with SS pathways.</p><p><strong>Conclusion: </strong>We identified 5 immune cells with significant changes in the early stage of septic shock. 
We also constructed a ceRNA network, which will help us explore the pathogenesis of septic shock.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211058463"},"PeriodicalIF":2.6,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/ac/ad/10.1177_11769343211058463.PMC8637398.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39693076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AURKB as a Promising Prognostic Biomarker in Hepatocellular Carcinoma.","authors":"Jingchuan Xiao, Yingai Zhang","doi":"10.1177/11769343211057589","DOIUrl":"https://doi.org/10.1177/11769343211057589","url":null,"abstract":"<p><p>The Aurora kinases form a family of 3 genes encoding serine/threonine kinases and are involved in the regulation of cell division during the mitosis. This study was designed to investigate the prognostic role of Aurora kinases in hepatocellular carcinoma (HCC). In this study, we analyzed the expression, overall survival (OS) data, promoter methylation level, and relationship with immunoinhibitors of Aurora kinases in patients with HCC from GEPIA2, UALCAN, OncoLnc, and TISIDB databases. Protein-protein interaction (PPI) network, gene ontology, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome pathway analysis were performed using the STRING database and Cytoscape software. We found that the mRNA expression, stages of HCC, and OS of AURKA and AURKB in HCC tissues were significantly different from control tissues, but there were significant inconsistencies in promoter methylation level and relationship with immunoinhibitors for AURKA and AURKB. None of the above items were significantly different for AURKC. Furthermore, a hub module including AURKA, AURKB, and AURKC was identified within the PPI network constructed with the Molecular Complex Detection (MCODE) plug-in in Cytoscape software. 
Our results show that AURKB could be a potential biomarker for HCC prognosis.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211057589"},"PeriodicalIF":2.6,"publicationDate":"2021-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/20/c3/10.1177_11769343211057589.PMC8637395.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39693075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of the Gut Microbiota in Patients with Benign and Malignant Breast Tumors: A Pilot Study.","authors":"Peidong Yang, Zhitang Wang, Qingqin Peng, Weibin Lian, Debo Chen","doi":"10.1177/11769343211057573","DOIUrl":"10.1177/11769343211057573","url":null,"abstract":"<p><p>The microbiome plays diverse roles in many diseases and can potentially contribute to cancer development. Breast cancer is the most commonly diagnosed cancer in women worldwide. Thus, we investigated whether the gut microbiota differs between patients with breast carcinoma and those with benign tumors. The DNA of the fecal microbiota community was detected by Illumina sequencing and the taxonomy of 16S rRNA genes. The α-diversity and β-diversity analyses were used to determine richness and evenness of the gut microbiota. Gene function prediction of the microbiota in patients with benign and malignant carcinoma was performed using PICRUSt. There was no significant difference in the α-diversity between patients with benign and malignant tumors (<i>P</i> = 3.15e<sup>-1</sup> for the Chao index and <i>P</i> = 3.1e<sup>-1</sup> for the ACE index). The microbiota composition was different between the 2 groups, although no statistical difference was observed in β-diversity. Of the 31 different genera compared between the 2 groups, level of only <i>Citrobacter</i> was significantly higher in the malignant tumor group than that in benign tumor group. The metabolic pathways of the gut microbiome in the malignant tumor group were significantly different from those in benign tumor group. Furthermore, the study establishes the distinct richness of the gut microbiome in patients with breast cancer with different clinicopathological factors, including ER, PR, Ki-67 level, Her2 status, and tumor grade. 
These findings suggest that the gut microbiome may be useful for the diagnosis and treatment of malignant breast carcinoma.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211057573"},"PeriodicalIF":2.6,"publicationDate":"2021-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/b1/42/10.1177_11769343211057573.PMC8593289.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39637270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploration of Prognostic Biomarkers of Muscle-Invasive Bladder Cancer (MIBC) by Bioinformatics.","authors":"Xianglai Xu, Yelin Wang, Sihong Zhang, Yanjun Zhu, Jiajun Wang","doi":"10.1177/11769343211049270","DOIUrl":"https://doi.org/10.1177/11769343211049270","url":null,"abstract":"<p><p>We aimed to discover prognostic factors of muscle-invasive bladder cancer (MIBC) and investigate their relationship with immune therapies. Online data of MIBC were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus database (GEO) database. Weighted gene co-expression network analysis (WGCNA) and univariate Cox analysis were applied to classify genes into different groups. Venn diagram was used to find the intersection of genes, and prognostic efficacy was proved by Kaplan-Meier analysis. Heatmap was utilized for differential analysis. Riskscore (RS) was calculated according to multivariate Cox analysis and evaluated by receiver operating characteristic curve (ROC). MIBC samples from TCGA and GEO were analyzed by WGCNA and univariate Cox analysis and intersected at 4 genes, CLK4, DEDD2, ENO1, and SYTL1. Higher SYTL1 and DEDD2 expressions were significantly correlated with high tumor grades. Riskscore based on genes showed great prognostic efficiency in predicting overall survival (OS), disease-specific survival (DSS), and progression-free interval (PFI) in TCGA dataset (<i>P</i> < .001). The area under the ROC curve (AUC) of RS reached 0.671 in predicting 1-year survival and 0.653 in 3-year survival. KEGG pathways enrichment filtered 5 enriched pathways. xCell analysis showed increased T cell CD4+ Th2 cell, macrophage, macrophage M1, and macrophage M2 infiltration in high RS samples (<i>P</i> < .001). In immune checkpoints analysis, PD-L1 expression was significantly higher in patients with high RS. 
We have, therefore, constructed RS as a convincing prognostic index for MIBC patients and found potential targeted pathways.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211049270"},"PeriodicalIF":2.6,"publicationDate":"2021-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/d8/07/10.1177_11769343211049270.PMC8558584.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39676752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of Conserved Pappalysin 1-Derived Circular RNA-Mediated Competing Endogenous RNA in Osteosarcoma.","authors":"Guang-Fu Ming, Bo-Hua Gao, Peng Chen","doi":"10.1177/11769343211041379","DOIUrl":"https://doi.org/10.1177/11769343211041379","url":null,"abstract":"<p><p>The etiology of osteosarcoma (OS) is complex and not yet fully understood. This study aimed to identify the miRNAs, circRNAs, and genes (mRNAs) that are differentially expressed in OS cell lines to investigate the mechanism of circRNA-associated competing endogenous RNAs (ceRNAs) in OS. Microarray datasets reporting mRNA (GSE70414), miRNA (GSE70367), and circRNA changes (GSE96964) in human OS cell lines were downloaded, differentially expressed (DE) RNAs were identified, and DEmRNAs were used for the annotation of Gene Ontology (GO) biological processes (BP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The mechanisms of DEcircRNA-mediated ceRNAs were identified in a step-by-step process. A total of 326 DEmRNAs, 45 DEmiRNAs, and 110 DEcircRNAs were identified from 3 datasets. The DEmRNAs were associated with GO BP terms, including cholesterol biosynthetic process, angiogenesis, and extracellular matrix organization, and with KEGG pathways, including p53 signaling pathway and biosynthesis of antibiotics. The final ceRNA network consisted of 8 DEcircRNAs, including 5 pappalysin (PAPPA) 1-derived DEcircRNAs (hsa_circ_0005456, hsa_circ_0088209, hsa_circ_0002052, hsa_circ_0088214 and hsa_circ_0008792, all downregulated), 3 DEmiRNAs (hsa-miR-760, hsa-miR-4665-5p and hsa-miR-4539, all upregulated), and downregulated genes (including MMP13 and HMOX1). 
The ceRNA regulatory network constructed for OS may play important roles in the pathogenesis of OS and might be of great importance in therapy.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211041379"},"PeriodicalIF":2.6,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/4a/09/10.1177_11769343211041379.PMC8544760.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39564464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sequence-Based Prediction of Plant Protein-Protein Interactions by Combining Discrete Sine Transformation With Rotation Forest.","authors":"Jie Pan, Li-Ping Li, Chang-Qing Yu, Zhu-Hong You, Yong-Jian Guan, Zhong-Hao Ren","doi":"10.1177/11769343211050067","DOIUrl":"https://doi.org/10.1177/11769343211050067","url":null,"abstract":"<p><p>Protein-protein interactions (PPIs) in plants are essential for understanding the regulation of biological processes. Although high-throughput technologies have been widely used to identify PPIs, they are usually laborious, expensive, and suffer from high false-positive rates. Therefore, it is imperative to develop novel computational approaches as a supplement tool to detect PPIs in plants. In this work, we presented a method, namely DST-RoF, to identify PPIs in plants by combining an ensemble learning classifier-Rotation Forest (RoF) with discrete sine transformation (DST). Specifically, plant protein sequence is firstly converted into Position-Specific Scoring Matrix (PSSM). Then, the discrete sine transformation was employed to extract effective features for obtaining the evolutionary information of proteins. Finally, these optimal features were fed into the RoF classifier for training and prediction. When performed on the plant datasets Arabidopsis, Rice, and Maize, DST-RoF yielded high prediction accuracy of 82.95%, 88.82%, and 93.70%, respectively. To further evaluate the prediction ability of our approach, we compared it with 4 state-of-the-art classifiers and 3 different feature extraction methods. 
Comprehensive experimental results indicated that our method is feasible and robust for predicting potential interacting plant protein pairs.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211050067"},"PeriodicalIF":2.6,"publicationDate":"2021-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/b4/46/10.1177_11769343211050067.PMC8521741.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39560690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compelling Evidence Suggesting the Codon Usage of SARS-CoV-2 Adapts to Human After the Split From RaTG13.","authors":"Yanping Zhang, Xiaojie Jin, Haiyan Wang, Yaoyao Miao, Xiaoping Yang, Wenqing Jiang, Bin Yin","doi":"10.1177/11769343211052013","DOIUrl":"10.1177/11769343211052013","url":null,"abstract":"<p><p>SARS-CoV-2 needs to efficiently make use of the resources from hosts in order to survive and propagate. Among the multiple layers of regulatory network, mRNA translation is the rate-limiting step in gene expression. Synonymous codon usage usually conforms with tRNA concentration to allow fast decoding during translation. It is acknowledged that SARS-CoV-2 has adapted to the codon usage of human lungs so that the virus could rapidly proliferate in the lung environment. While this notion seems to nicely explain the adaptation of SARS-CoV-2 to lungs, it is unable to tell why other viruses do not have this advantage. In this study, we retrieve the GTEx RNA-seq data for 30 tissues (belonging to over 17 000 individuals). We calculate the RSCU (relative synonymous codon usage) weighted by gene expression in each human sample, and investigate the correlation of RSCU between the human tissues and SARS-CoV-2 or RaTG13 (the closest coronavirus to SARS-CoV-2). Lung has the highest correlation of RSCU to SARS-CoV-2 among all tissues, suggesting that the lung environment is generally suitable for SARS-CoV-2. Interestingly, for most tissues, SARS-CoV-2 has higher correlations with the human samples compared with the RaTG13-human correlation. This difference is most significant for lungs. In conclusion, the codon usage of SARS-CoV-2 has adapted to human lungs to allow fast decoding and translation. This adaptation probably took place after SARS-CoV-2 split from RaTG13 because RaTG13 is less perfectly correlated with human. 
This finding depicts the trajectory of adaptive evolution from the ancestral sequence to SARS-CoV-2, and also explains why SARS-CoV-2, rather than other viruses, could adapt so well to the human lung environment.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211052013"},"PeriodicalIF":2.6,"publicationDate":"2021-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/5c/4a/10.1177_11769343211052013.PMC8504689.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39518083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Biomarkers of Blood from Patients with Atherosclerosis Based on Bioinformatics Analysis.","authors":"Yongjiang Qian, Lili Zhang, Zhen Sun, Guangyao Zang, Yalan Li, Zhongqun Wang, Lihua Li","doi":"10.1177/11769343211046020","DOIUrl":"https://doi.org/10.1177/11769343211046020","url":null,"abstract":"<p><p>Atherosclerosis is a multifaceted disease characterized by the formation and accumulation of plaques that attach to arteries and cause cardiovascular disease and vascular embolism. A range of diagnostic techniques, including selective coronary angiography, stress tests, computerized tomography, and nuclear scans, assess cardiovascular disease risk and treatment targets. However, there is currently no simple blood biochemical index or biological target for the diagnosis of atherosclerosis. Therefore, it is of interest to find a biochemical blood marker for atherosclerosis. Three datasets from the Gene Expression Omnibus (GEO) database were analyzed to obtain differentially expressed genes (DEGs) and the results were integrated using the RobustRankAggreg algorithm. The genes considered more critical by the RobustRankAggreg algorithm were put into their own data set and the data set system with cell classification information for verification. Twenty-one possible genes were screened out. Interestingly, we found a good correlation between <i>RPS4Y1</i>, <i>EIF1AY</i>, and <i>XIST</i>. In addition, we know the general expression of these genes in different cell types and whole blood cells. In this study, we identified <i>BTNL8</i> and <i>BLNK</i> as having good clinical significance. 
These results will contribute to the analysis of the underlying genes involved in the progression of atherosclerosis and provide insights for the discovery of new diagnostic and evaluation methods.</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211046020"},"PeriodicalIF":2.6,"publicationDate":"2021-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/a0/be/10.1177_11769343211046020.PMC8477683.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39477141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Erratum to \"On the matrix condition of phylogenetic tree\".","authors":"","doi":"10.1177/11769343211046767","DOIUrl":"https://doi.org/10.1177/11769343211046767","url":null,"abstract":"<p><p>[This corrects the article DOI: 10.1177/1176934320901721.].</p>","PeriodicalId":50472,"journal":{"name":"Evolutionary Bioinformatics","volume":"17 ","pages":"11769343211046767"},"PeriodicalIF":2.6,"publicationDate":"2021-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/36/d0/10.1177_11769343211046767.PMC8436297.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39421044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}