{"title":"Enhancing Variant Calling in Whole-Exome Sequencing Data Using Population-Matched Reference Genomes.","authors":"Shuming Guo, Zhuo Huang, Yanming Zhang, Yukun He, Xiangju Chen, Wenjuan Wang, Lansheng Li, Yu Kang, Zhancheng Gao, Jun Yu, Zhenglin Du, Yanan Chu","doi":"10.1093/gpbjnl/qzae070","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae070","url":null,"abstract":"<p><p>Whole-exome sequencing (WES) data are frequently used for cancer diagnosis and genome-wide association studies (GWAS), based on high-coverage read mapping, informative variant calling, and high-quality reference genomes. The center position of the currently used genome assembly, GRCh38, is now challenged by two newly published telomere-to-telomere (T2T) genomes, T2T-CHM13 and T2T-YAO, and it becomes urgent to have a comparative study to test population specificity using the three reference genomes based on real case WES data. Here we report our analysis along this line for 19 tumor samples collected from Chinese patients. The primary comparison of the exon regions among the three references reveals that the sequences in up to ∼ 1% target regions in T2T-YAO are widely diversified from GRCh38 and may lead to off-target in sequence capture. However, T2T-YAO still outperforms GRCh38 genomes by obtaining 7.41% more mapped reads. Due to more reliable read-mapping and closer phylogenetic relationship with the samples than GRCh38, T2T-YAO reduces half of variant calls of clinical significance which are mostly benign, while maintaining sensitivity in identifying pathogenic variants. T2T-YAO also outperforms T2T-CHM13 in reducing calls of Chinese-specific variants. 
Our findings highlight the critical need for employing population-specific reference genomes in genomic analysis to ensure accurate variant analysis and the significant benefits of tailoring these approaches to the unique genetic backgrounds of each ethnic group.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142396282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CIEC: Cross-tissue Immune Cell Type Enrichment and Expression Map Visualization for Cancer.","authors":"Jinhua He, Haitao Luo, Wei Wang, Dechao Bu, Zhengkai Zou, Haolin Wang, Hongzhen Tang, Zeping Han, Wenfeng Luo, Jian Shen, Fangmei Xie, Yi Zhao, Zhiming Xiang","doi":"10.1093/gpbjnl/qzae067","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae067","url":null,"abstract":"<p><p>Single-cell transcriptome sequencing technology has been applied to decode the cell types and functional states of immune cells, revealing their tissue-specific gene expression patterns and functions in cancer immunity. Comprehensive assessments of immune cells within and across tissues will provide us with a deeper understanding of the tumor immune system in general. Here, we present Cross-tissue Immune cell type or state Enrichment analysis of gene lists for Cancer (CIEC), the first web-based application that integrates database and enrichment analysis to estimate the cross-tissue immune cell type or state. CIEC version 1.0 consists of 480 samples covering primary tumor, adjacent normal tissue, lymph node, metastasis tissue, and peripheral blood from 323 cancer patients. By applying integrative analysis, we constructed an immune cell-type/state map for each context and adopted our previously developed Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology Based Annotation System (KOBAS) algorithm to estimate the enrichment for context-specific immune cell type/state. In addition, CIEC also provides an easy-to-use online interface for users to comprehensively analyze the immune cell characteristics mapped across multiple tissues, including expression map, correlation, similar genes detection, signature score, and expression comparison. We believe that CIEC will be a valuable resource for exploring the intrinsic characteristics of immune cells in cancer patients and for potentially guiding novel cancer-immune biomarker development and immunotherapy strategies. 
CIEC is freely accessible at http://ciec.gene.ac/.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142373917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DRED: A Comprehensive Database of Genes Related to Repeat Expansion Diseases.","authors":"Qingqing Shi, Min Dai, Yingke Ma, Jun Liu, Xiuying Liu, Xiu-Jie Wang","doi":"10.1093/gpbjnl/qzae068","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae068","url":null,"abstract":"<p><p>Expansion of tandem repeats in genes often causes severe neuromuscular diseases, such as fragile X syndrome, Huntington's disease, and spinocerebellar ataxia. However, information on genes associated with repeat expansion diseases is scattered throughout the literature, systematic prediction of potential genes that may cause diseases via repeat expansion is also lacking. Here, we develop DRED, a Database of genes related to Repeat Expansion Diseases, as a manually-curated database that covers all known 61 genes related to repeat expansion diseases reported in PubMed and OMIM, along with detailed repeat information for each gene. DRED also includes 516 genes with the potential to cause diseases via repeat expansion, which were predicted based on their repeat composition, genetic variations, genomic features, and disease associations. Various types of information on repeat expansion diseases and their corresponding genes/repeats are presented in DRED, together with links to external resources, such as NCBI and ClinVar. DRED provides user-friendly interfaces with comprehensive functions, and can serve as a central data resource for basic research and repeat expansion disease-related medical diagnosis. 
DRED is freely accessible at http://omicslab.genetics.ac.cn/dred, and is frequently updated to include newly reported genes related to repeat expansion diseases.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RNA 5-Methylcytosine Modification: Regulatory Molecules, Biological Functions, and Human Diseases.","authors":"Yanfang Lu, Liu Yang, Qi Feng, Yong Liu, Xiaohui Sun, Dongwei Liu, Long Qiao, Zhangsuo Liu","doi":"10.1093/gpbjnl/qzae063","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae063","url":null,"abstract":"<p><p>RNA methylation modifications influence gene expression, and disruptions of these processes are often associated with various human diseases. The common RNA methylation modification 5-methylcytosine (m5C), which is dynamically regulated by writers, erasers, and readers, widely occurs in transfer RNAs (tRNAs), messenger RNAs (mRNAs), ribosomal RNAs (rRNAs), enhancer RNAs (eRNAs), and other non-coding RNAs (ncRNAs). RNA m5C modification regulates metabolism, stability, nuclear export, and translation of RNA molecules. An increasing number of studies have revealed the critical roles of the m5C RNA modification and its regulators in the development, diagnosis, prognosis, and treatment of various human diseases. In this review, we summarized the recent studies on RNA m5C modification and discussed the advances in its detection methodologies, distribution, and regulators. Furthermore, we addressed the significance of RNAs modified with m5C marks in essential biological processes as well as in the development of various human disorders, from neurological diseases to cancers. 
This review provides a new perspective on the diagnosis, treatment, and monitoring of human diseases by elucidating the complex regulatory network of the epigenetic m5C modification.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Genome Architecture of the Copepod Eurytemora carolleeae, the Highly Invasive Atlantic Clade of the E. affinis Species Complex.","authors":"Zhenyong Du, Gregory Gelembiuk, Wynne Moss, Andrew Tritt, Carol Eunmi Lee","doi":"10.1093/gpbjnl/qzae066","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae066","url":null,"abstract":"<p><p>Copepods are among the most abundant organisms on the planet and play critical functions in aquatic ecosystems. Among copepods, populations of the Eurytemora affinis species complex are numerically dominant in many coastal habitats and are food sources for major fisheries. Intriguingly, certain populations possess the unusual capacity to invade novel salinities on rapid time scales. Despite their ecological importance, high-quality genomic resources have been absent for calanoid copepods, limiting our ability to comprehensively dissect the genome architecture underlying the highly invasive and adaptive capacity of certain populations. Here, we presented the first chromosome-level genome of a calanoid copepod, from the Atlantic clade (Eurytemora carolleeae) of the E. affinis species complex. This genome was assembled using high-coverage long-read and high-throughput chromosome conformation capture sequences of an inbred line, generated through 30 generations of full-sib mating. This genome, consisting of 529.3 megabase (Mb) (contig N50 = 4.2 Mb, scaffold N50 = 140.6 Mb), was anchored onto four chromosomes. Genome annotation predicted 20,262 protein-coding genes, of which ion transporter gene families were substantially expanded based on comparative analyses of 12 additional arthropod genomes. Also, we found genome-wide signatures of historical gene body methylation of the ion transporter genes and the significant clustering of these genes on each chromosome. This genome represents one of the most contiguous copepod genomes to date and among the highest quality marine invertebrate genomes. 
As such, this genome provides an invaluable resource to help yield fundamental insights into the ability of this copepod to adapt to rapidly changing environments.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Whole-genome Sequencing Association Analysis of Quantitative Platelet Traits in A Large Cohort of β-thalassemia.","authors":"Xingmin Wang, Qianqian Zhang, Xianming Chen, Yushan Huang, Wei Zhang, Liuhua Liao, Xinhua Zhang, Binbin Huang, Yueyan Huang, Yuhua Ye, Mengyang Song, Jinquan Lao, Juanjuan Chen, Xiaoqin Feng, Xingjiang Long, Zhixiang Liu, Weijian Zhu, Lian Yu, Chengwu Fan, Deguo Tang, Tianyu Zhong, Mingyan Fang, Caiyun Li, Chao Niu, Li Huang, Bin Lin, Xiaoyun Hua, Xin Jin, Zilin Li, Xiangmin Xu","doi":"10.1093/gpbjnl/qzae065","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae065","url":null,"abstract":"<p><p>Platelet acts as a crucial monitoring indicator for hypercoagulability and thrombosis and a key target for drug regulation. Genotype-phenotype association studies have confirmed that platelet traits are quantitatively regulated by multiple genes. However, there is currently a lack of genetic studies on the heterogeneity of platelet traits in β-thalassemia under hypercoagulable state. Here, we studied the phenotypic heterogeneity of platelet count (PLT) and mean platelet volume (MPV) in 1020 β-thalassemia patients. We further performed a functionally informed whole genome sequencing association analysis of common variants and rare variants (RVs) for PLT and MPV in 916 patients through integrative analysis of whole-genome sequencing data and functional annotation data. Extreme phenotypic heterogeneity of platelet traits was observed in β-thalassemia patients. Additionally, the common variant based gene-level analysis identified the novel gene of RNF144B associated with MPV. The RV analysis identified several novel associations in both coding and noncoding genome, including missense RVs of PPP2R5C associated with PLT and missense RVs of TSSK1B associated with MPV. 
In conclusion, we performed a comprehensive and systematic whole genome scan of platelet traits in the β-thalassemia cohort, demonstrating the specificity of genetic regulation of platelet traits in the context of β-thalassemia, providing potential targets for intervention.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identify Non-mutational p53 Functional Deficiency in Human Cancers.","authors":"Qianpeng Li, Yang Zhang, Sicheng Luo, Zhang Zhang, Ann L Oberg, David E Kozono, Hua Lu, Jann N Sarkaria, Lina Ma, Liguo Wang","doi":"10.1093/gpbjnl/qzae064","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae064","url":null,"abstract":"<p><p>An accurate assessment of p53's functional status is critical for cancer genomic medicine. However, there is a significant challenge in identifying tumors with non-mutational p53 inactivations that are not detectable through DNA sequencing. These undetected cases are often misclassified as p53-normal, leading to inaccurate prognosis and downstream association analyses. To address this issue, we built the support vector machine (SVM) models to systematically reassess p53's functional status in TP53 wild-type (TP53 WT) tumors from multiple The Cancer Genome Atlas (TCGA) cohorts. Cross-validation demonstrated the good performance of the SVM models with a mean area under curve (AUC) of 0.9822, precision of 0.9747, and recall of 0.9784. Our study revealed that a significant proportion (87%-99%) of TP53 WT tumors actually have compromised p53 function. Additional analyses uncovered that these genetically intact but functionally impaired (termed as predictively reduced function of p53 or TP53 WT-pRF) tumors exhibited genomic and pathophysiologic features akin to TP53 mutant tumors: heightened genomic instability and elevated levels of hypoxia. 
Clinically, patients with TP53 WT-pRF tumors experienced significantly shortened overall survival or progression-free survival compared to those with predictively normal function of p53 (TP53 WT-pN) tumors, and these patients also displayed increased sensitivity to platinum-based chemotherapy and radiation therapy.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Virus Infection Induces Immune Gene Activation with CTCF Anchored Enhancers and Chromatin Interactions in Pig Genome.","authors":"Jianhua Cao, Ruimin Ren, Xiaolong Li, Xiaoqian Zhang, Yan Sun, Xiaohuan Tian, Ru Liu, Xiangdong Liu, Yijun Ruan, Guoliang Li, Shuhong Zhao","doi":"10.1093/gpbjnl/qzae062","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae062","url":null,"abstract":"<p><p>Chromatin organization is important for gene transcription in pig genome. However, its three-dimensional (3D) structure and dynamics are much less investigated than those in human. Here we applied the long-reads chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) method to map the whole-genome chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase Ⅱ (RNAPⅡ or POLⅡ) in porcine macrophage cells before and after polyinosinic-polycytidylic acid [Poly(I:C)] induction. Our results revealed that Poly(I:C) induction impacts the 3D genome organization in the 3D4/21 cells at the fine-scale chromatin loop level rather than at the large-scale domain level. Furthermore, our findings underscored the pivotal role of CTCF anchored chromatin interactions in reshaping chromatin architecture during immune responses. Knock-out of the CTCF locus further confirmed that the CTCF anchored enhancers are associated with the activation of immune genes via long-range interactions. Notably, ChIA-PET data also supported the spatial relationship between single nucleotide polymorphisms (SNPs) and the related gene transcription in 3D genome aspect. 
Our findings in this study provide new clues and potential targets to explore key elements related to diseases in swine and are also likely to shed light on elucidating chromatin organization and dynamics underlying the process of mammalian infectious diseases.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computational Strategies and Algorithms for Inferring Cellular Composition of Spatial Transcriptomics Data.","authors":"Xiuying Liu, Xianwen Ren","doi":"10.1093/gpbjnl/qzae057","DOIUrl":"10.1093/gpbjnl/qzae057","url":null,"abstract":"<p><p>Spatial transcriptomics technology has been an essential and powerful method for delineating tissue architecture at the molecular level. However, due to the limitations of the current spatial techniques, the cellular information cannot be directly measured but instead spatial spots typically varying from a diameter of 0.2 to 100 µm are characterized. Therefore, it is vital to apply computational strategies for inferring the cellular composition within each spatial spot. The main objective of this review is to summarize the most recent progresses in estimating the exact cellular proportions for each spatial spot, and to prospect the future directions of this field.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11398939/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141903946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SCancerRNA: Expression at the Single-cell Level and Interaction Resource of Non-coding RNA Biomarkers for Cancers.","authors":"Hongzhe Guo, Liyuan Zhang, Xinran Cui, Liang Cheng, Tianyi Zhao, Yadong Wang","doi":"10.1093/gpbjnl/qzae023","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae023","url":null,"abstract":"<p><p>Non-coding RNAs (ncRNAs) participate in multiple biological processes associated with cancers as tumor suppressors or oncogenic drivers. Due to their high stability in plasma, urine, and many other fluids, ncRNAs have the potential to serve as key biomarkers for early diagnosis and screening of cancers. During cancer progression, tumor heterogeneity plays a crucial role, and it is particularly important to understand the gene expression patterns of individual cells. With the development of single-cell RNA sequencing (scRNA-seq) technologies, uncovering gene expression in different cell types for human cancers has become feasible by profiling transcriptomes at the cellular level. However, a well-organized and comprehensive online resource that provides access to the expression of genes corresponding to ncRNA biomarkers in different cell types at the single-cell level is not available yet. Therefore, we developed the SCancerRNA database to summarize experimentally supported data on long ncRNA, microRNA, PIWI-interacting RNA, small nucleolar RNA, and circular RNA biomarkers, as well as data on their differential expression at the cellular level. Furthermore, we collected biological functions and clinical applications of biomarkers to facilitate the application of ncRNA biomarkers to cancer diagnosis, as well as the monitoring of progression and targeted therapies. SCancerRNA also allows users to explore interaction networks of different types of ncRNAs, and build computational models in the future. 
SCancerRNA is freely accessible at http://www.scancerrna.com/BioMarker.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}