{"title":"simplifyEnrichment: A Bioconductor Package for Clustering and Visualizing Functional Enrichment Results","authors":"Zuguang Gu , Daniel Hübschmann","doi":"10.1016/j.gpb.2022.04.008","DOIUrl":"10.1016/j.gpb.2022.04.008","url":null,"abstract":"<div><p><strong>Functional enrichment</strong> analysis or gene set enrichment analysis is a basic bioinformatics method that evaluates the biological importance of a list of genes of interest. However, it may produce a long list of significant terms with highly redundant information that is difficult to summarize. Current tools to <strong>simplify enrichment</strong> results by <strong>clustering</strong> them into groups either still produce redundancy between clusters or do not retain consistent term similarities within clusters. We propose a new method named <em>binary cut</em> for clustering similarity matrices of functional terms. Through comprehensive benchmarks on both simulated and real-world datasets, we demonstrated that <em>binary cut</em> could efficiently cluster functional terms into groups where terms showed consistent similarities within groups and were mutually exclusive between groups. We compared <em>binary cut</em> clustering on the similarity matrices obtained from different similarity measures and found that semantic similarity worked well with <em>binary cut</em>, while similarity matrices based on gene overlap showed less consistent patterns. We implemented the <em>binary cut</em> algorithm in the R package <em>simplifyEnrichment</em>, which additionally provides functionalities for visualizing, summarizing, and comparing the clustering. 
The <em>simplifyEnrichment</em> package and the documentation are available at <span>https://bioconductor.org/packages/simplifyEnrichment/</span>.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373083/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9938752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chengbo Zhang , Zhenghan Lian , Bo Xu , Qingzhong Shen , Mingwei Bao , Zunxi Huang , Hongchen Jiang , Wenjun Li
{"title":"Gut Microbiome Variation Along A Lifestyle Gradient Reveals Threats Faced by Asian Elephants","authors":"Chengbo Zhang , Zhenghan Lian , Bo Xu , Qingzhong Shen , Mingwei Bao , Zunxi Huang , Hongchen Jiang , Wenjun Li","doi":"10.1016/j.gpb.2023.04.003","DOIUrl":"10.1016/j.gpb.2023.04.003","url":null,"abstract":"<div><p>The <strong>gut microbiome</strong> is closely related to host nutrition and health. However, the relationships between gut microorganisms and host <strong>lifestyle</strong> are not well characterized. In the absence of confounding geographic variation, we defined clear patterns of variation in the gut microbiomes of <strong>Asian elephants</strong> (AEs) in the Wild Elephant Valley, Xishuangbanna, China, along a lifestyle gradient (completely captive, semicaptive, semiwild, and completely wild). A <strong>phylogenetic analysis</strong> using the 16S rRNA gene sequences highlighted that the microbial diversity decreased as the degree of captivity increased. Furthermore, the results showed that the bacterial taxon WCHB1-41_c was substantially affected by lifestyle variations. qRT-PCR analysis revealed a paucity of genes related to butyrate production in the gut microbiome of AEs with a completely wild lifestyle, which may be due to the increased unfavorable environmental factors. 
Overall, these results demonstrate the distinct gut microbiome characteristics among AEs with a gradient of lifestyles and provide a basis for designing strategies to improve the well-being or conservation of this important animal species.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372918/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9884923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yali Liu, Harry Cheuk-Hay Lau, Wing Yin Cheng, Jun Yu
{"title":"Gut Microbiome in Colorectal Cancer: Clinical Diagnosis and Treatment","authors":"Yali Liu, Harry Cheuk-Hay Lau, Wing Yin Cheng, Jun Yu","doi":"10.1016/j.gpb.2022.07.002","DOIUrl":"10.1016/j.gpb.2022.07.002","url":null,"abstract":"<div><p><strong>Colorectal cancer</strong> (CRC) is one of the most frequently diagnosed cancers and the leading cause of cancer-associated deaths. Epidemiological studies have shown that both genetic and environmental risk factors contribute to the development of CRC. Several metagenomic studies of CRC have identified gut dysbiosis as a fundamental risk factor in the evolution of colorectal malignancy. Although enormous efforts and substantial progress have been made in understanding the relationship between the human <strong>gut microbiome</strong> and CRC, the precise mechanisms involved remain elusive. Recent data have shown a direct causative role of the gut microbiome in DNA damage, inflammation, and drug resistance in CRC, suggesting that modulation of the gut microbiome could act as a powerful tool in CRC prevention and therapy. Here, we provide an overview of the relationship between the gut microbiome and CRC, and explore relevant mechanisms of colorectal tumorigenesis. We next highlight the potential of bacterial species as clinical biomarkers, as well as their roles in therapeutic response. 
Factors limiting the clinical translation of gut microbiome and strategies for resolving current challenges are further discussed.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372906/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9939253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wei-Zhen Zhou , Wenke Li , Huayan Shen , Ruby W. Wang , Wen Chen , Yujing Zhang , Qingyi Zeng , Hao Wang , Meng Yuan , Ziyi Zeng , Jinhui Cui , Chuan-Yun Li , Fred Y. Ye , Zhou Zhou
{"title":"CHDbase: A Comprehensive Knowledgebase for Congenital Heart Disease-related Genes and Clinical Manifestations","authors":"Wei-Zhen Zhou , Wenke Li , Huayan Shen , Ruby W. Wang , Wen Chen , Yujing Zhang , Qingyi Zeng , Hao Wang , Meng Yuan , Ziyi Zeng , Jinhui Cui , Chuan-Yun Li , Fred Y. Ye , Zhou Zhou","doi":"10.1016/j.gpb.2022.08.001","DOIUrl":"10.1016/j.gpb.2022.08.001","url":null,"abstract":"<div><p><strong>Congenital heart disease</strong> (CHD) is one of the most common causes of major birth defects, with a prevalence of 1%. Although an increasing number of studies have reported the etiology of CHD, the findings scattered throughout the literature are difficult to retrieve and utilize in research and clinical practice. We therefore developed CHDbase, an evidence-based knowledgebase of CHD-related genes and clinical manifestations manually curated from 1114 publications, linking 1124 susceptibility genes and 3591 variations to more than 300 CHD types and related syndromes. Metadata such as the information of each publication and the selected population and samples, the strategy of studies, and the major findings of studies were integrated with each item of the research record. We also integrated functional annotations through parsing ∼ 50 <strong>databases</strong>/tools to facilitate the interpretation of these genes and variations in disease pathogenicity. We further prioritized the significance of these CHD-related genes with a gene interaction network approach and extracted a core CHD sub-network with 163 genes. The clear genetic landscape of CHD enables the phenotype <strong>classification</strong> based on the shared genetic origin. Overall, CHDbase provides a comprehensive and freely available resource to study CHD susceptibilities, supporting a wide range of users in the scientific and medical communities. 
CHDbase is accessible at <span>http://chddb.fwgenetics.org</span>.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372913/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9899876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jun Zhang , Wenli Liu , Halimureti Simayijiang , Ping Hu , Jiangwei Yan
{"title":"Application of Microbiome in Forensics","authors":"Jun Zhang , Wenli Liu , Halimureti Simayijiang , Ping Hu , Jiangwei Yan","doi":"10.1016/j.gpb.2022.07.007","DOIUrl":"10.1016/j.gpb.2022.07.007","url":null,"abstract":"<div><p>Recent advances in <strong>next-generation sequencing</strong> technologies and improvements in <strong>bioinformatics</strong> have expanded the scope of <strong>microbiome</strong> analysis as a <strong>forensic</strong> tool. Microbiome research is concerned with the study of the compositional profile and diversity of microbial flora as well as the interactions between microbes, hosts, and the environment. It has opened up many new possibilities for forensic analysis. In this review, we discuss various <strong>applications</strong> of microbiome in forensics, including identification of individuals, geolocation inference, and post-mortem interval (PMI) estimation.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372919/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10275638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ran Wang , Guangdun Peng , Patrick P.L. Tam , Naihe Jing
{"title":"Integration of Computational Analysis and Spatial Transcriptomics in Single-cell Studies","authors":"Ran Wang , Guangdun Peng , Patrick P.L. Tam , Naihe Jing","doi":"10.1016/j.gpb.2022.06.006","DOIUrl":"10.1016/j.gpb.2022.06.006","url":null,"abstract":"<div><p>Recent advances of single-cell transcriptomics technologies and allied computational methodologies have revolutionized molecular cell biology. Meanwhile, pioneering explorations in spatial transcriptomics have opened up avenues to address fundamental biological questions in health and diseases. Here, we review the technical attributes of single-cell RNA sequencing and spatial transcriptomics, and the core concepts of computational data analysis. We further highlight the challenges in the application of <strong>data integration</strong> methodologies and the interpretation of the biological context of the findings.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372908/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10261030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comprehensive Analysis of Ubiquitously Expressed Genes in Humans from A Data-driven Perspective","authors":"Jianlei Gu , Jiawei Dai , Hui Lu , Hongyu Zhao","doi":"10.1016/j.gpb.2021.08.017","DOIUrl":"10.1016/j.gpb.2021.08.017","url":null,"abstract":"<div><p>Comprehensive characterization of spatial and temporal gene expression patterns in humans is critical for uncovering the regulatory codes of the human genome and understanding the molecular mechanisms of human diseases. Ubiquitously expressed genes (UEGs) refer to the genes expressed across a majority of, if not all, phenotypic and physiological conditions of an organism. It is known that many human genes are broadly expressed across tissues. However, most previous UEG studies have only focused on providing a list of UEGs without capturing their global expression patterns, thus limiting the potential use of UEG information. In this study, we proposed a novel data-driven framework to leverage the extensive collection of ∼ 40,000 human transcriptomes to derive a list of UEGs and their corresponding global expression patterns, which offers a valuable resource to further characterize the human transcriptome. Our results suggest that about half (12,234; 49.01%) of the human genes are expressed in at least 80% of human transcriptomes, and the median size of the human transcriptome is 16,342 genes (65.44%). Through gene clustering, we identified a set of UEGs, named LoVarUEGs, which have stable expression across human transcriptomes and can be used as internal reference genes for expression measurement. 
To further demonstrate the usefulness of this resource, we evaluated the global expression patterns for 16 previously predicted <strong>disallowed genes</strong> in islet beta cells and found that seven of these genes showed relatively more varied expression patterns, suggesting that the repression of these genes may not be unique to islet beta cells.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373092/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9899356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq","authors":"Wenbin Ye , Qiwei Lian , Congting Ye , Xiaohui Wu","doi":"10.1016/j.gpb.2022.09.005","DOIUrl":"10.1016/j.gpb.2022.09.005","url":null,"abstract":"<div><p>Alternative <strong>polyadenylation</strong> (APA) plays important roles in modulating mRNA stability, translation, and subcellular localization, and contributes extensively to shaping eukaryotic transcriptome complexity and proteome diversity. Identification of poly(A) sites (pAs) on a genome-wide scale is a critical step toward understanding the underlying mechanism of APA-mediated gene regulation. A number of established computational tools have been proposed to predict pAs from diverse genomic data. Here we provided an exhaustive overview of computational approaches for predicting pAs from DNA sequences, bulk RNA sequencing (<strong>RNA-seq</strong>) data, and single-cell RNA sequencing (<strong>scRNA-seq</strong>) data. Particularly, we examined several representative tools using bulk RNA-seq and scRNA-seq data from peripheral blood mononuclear cells and put forward operable suggestions on how to assess the reliability of pAs predicted by different tools. We also proposed practical guidelines on choosing appropriate methods applicable to diverse scenarios. Moreover, we discussed in depth the challenges in improving the performance of pA prediction and benchmarking different methods. 
Additionally, we highlighted outstanding challenges and opportunities using new <strong>machine learning</strong> and integrative multi-omics techniques, and provided our perspective on how computational methodologies might evolve in the future for non-3′ untranslated region, tissue-specific, cross-species, and single-cell pA prediction.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/ff/97/main.PMC10372920.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9953906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiumei Xing , Cheng Ai , Tianjiao Wang , Yang Li , Huitao Liu , Pengfei Hu , Guiwu Wang , Huamiao Liu , Hongliang Wang , Ranran Zhang , Junjun Zheng , Xiaobo Wang , Lei Wang , Yuxiao Chang , Qian Qian , Jinghua Yu , Lixin Tang , Shigang Wu , Xiujuan Shao , Alun Li , Fuhe Yang
{"title":"The First High-quality Reference Genome of Sika Deer Provides Insights into High-tannin Adaptation","authors":"Xiumei Xing , Cheng Ai , Tianjiao Wang , Yang Li , Huitao Liu , Pengfei Hu , Guiwu Wang , Huamiao Liu , Hongliang Wang , Ranran Zhang , Junjun Zheng , Xiaobo Wang , Lei Wang , Yuxiao Chang , Qian Qian , Jinghua Yu , Lixin Tang , Shigang Wu , Xiujuan Shao , Alun Li , Fuhe Yang","doi":"10.1016/j.gpb.2022.05.008","DOIUrl":"10.1016/j.gpb.2022.05.008","url":null,"abstract":"<div><p><strong>Sika deer</strong> are known to prefer <strong>oak leaves</strong>, which are rich in tannins and toxic to most mammals; however, the genetic mechanisms underlying their unique ability to adapt to living in the jungle are still unclear. In identifying the mechanism responsible for the tolerance of a highly toxic diet, we have made a major advancement by explaining the genome of sika deer. We generated the first high-quality, chromosome-level genome assembly of sika deer and measured the correlation between tannin intake and RNA expression in 15 tissues through 180 experiments. Comparative genome analyses showed that the <em>UGT</em> and <em>CYP</em> gene families are functionally involved in the adaptation of sika deer to high-tannin food, especially the expansion of the <em>UGT</em> family 2 subfamily B of <em>UGT</em> genes. The first chromosome-level assembly and genetic characterization of the tolerance to a highly toxic diet suggest that the sika deer genome may serve as an essential resource for understanding evolutionary events and tannin adaptation. 
Our study provides a paradigm of comparative expressive genomics that can be applied to the study of unique biological features in non-model animals.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372904/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9885133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qian Zhao , Longqing Shi , Weiyi He , Jinyu Li , Shijun You , Shuai Chen , Jing Lin , Yibin Wang , Liwen Zhang , Guang Yang , Liette Vasseur , Minsheng You
{"title":"Genomic Variations in the Tea Leafhopper Reveal the Basis of Its Adaptive Evolution","authors":"Qian Zhao , Longqing Shi , Weiyi He , Jinyu Li , Shijun You , Shuai Chen , Jing Lin , Yibin Wang , Liwen Zhang , Guang Yang , Liette Vasseur , Minsheng You","doi":"10.1016/j.gpb.2022.05.011","DOIUrl":"10.1016/j.gpb.2022.05.011","url":null,"abstract":"<div><p><strong>Tea green leafhopper</strong> (TGL), <em>Empoasca onukii</em>, is of biological and economic interest. Despite numerous studies, the mechanisms underlying its adaptation and evolution remain enigmatic. Here, we use previously untapped genome and <strong>population genetics</strong> approaches to examine how the pest adapted to different environmental variables and thus has expanded geographically. We complete a chromosome-level assembly and annotation of the <em>E</em>. <em>onukii</em> genome, showing notable expansions of gene families associated with adaptation to chemoreception and detoxification. Genomic signals indicating balancing selection highlight metabolic pathways involved in adaptation to a wide range of tea varieties grown across ecologically diverse regions. Patterns of genetic variations among 54 <em>E</em>. <em>onukii</em> samples unveil the population structure and <strong>evolutionary history</strong> across different tea-growing regions in China. Our results demonstrate that the genomic changes in key pathways, including those linked to metabolism, circadian rhythms, and immune system functions, may underlie the successful spread and adaptation of <em>E</em>. <em>onukii</em>. 
This work highlights the genetic and molecular basis underlying the evolutionary success of a species with broad economic impacts, and provides insights into insect adaptation to host plants, which will ultimately facilitate more sustainable pest management.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":null,"pages":null},"PeriodicalIF":9.5,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10225489/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9534744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}