Zhenyong Du, Gregory Gelembiuk, Wynne Moss, Andrew Tritt, Carol Eunmi Lee
{"title":"The Genome Architecture of the Copepod Eurytemora carolleeae - the Highly Invasive Atlantic Clade of the Eurytemora affinis Species Complex.","authors":"Zhenyong Du, Gregory Gelembiuk, Wynne Moss, Andrew Tritt, Carol Eunmi Lee","doi":"10.1093/gpbjnl/qzae066","DOIUrl":"10.1093/gpbjnl/qzae066","url":null,"abstract":"<p><p>Copepods are among the most abundant organisms on the planet and play critical functions in aquatic ecosystems. Among copepods, populations of the Eurytemora affinis species complex are numerically dominant in many coastal habitats and serve as food sources for major fisheries. Intriguingly, certain populations possess the unusual capacity to invade novel salinities on rapid time scales. Despite their ecological importance, high-quality genomic resources have been absent for calanoid copepods, limiting our ability to comprehensively dissect the genome architecture underlying the highly invasive and adaptive capacity of certain populations. Here, we present the first chromosome-level genome of a calanoid copepod, from the Atlantic clade (Eurytemora carolleeae) of the E. affinis species complex. This genome was assembled using high-coverage PacBio long-read and Hi-C sequences of an inbred line, generated through 30 generations of full-sib mating. This genome, consisting of 529.3 Mb (contig N50 = 4.2 Mb, scaffold N50 = 140.6 Mb), was anchored onto four chromosomes. Genome annotation predicted 20,262 protein-coding genes, of which ion transport-related gene families were substantially expanded based on comparative analyses of 12 additional arthropod genomes. Also, we found genome-wide signatures of historical gene body methylation of the ion transport-related genes and the significant clustering of these genes on each chromosome. This genome represents one of the most contiguous copepod genomes to date and is among the highest quality marine invertebrate genomes. 
As such, this genome provides an invaluable resource to help yield fundamental insights into the ability of this copepod to adapt to rapidly changing environments.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11706791/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zheng Fang, Mingming Dong, Hongqiang Qin, Mingliang Ye
{"title":"GP-Plotter: Flexible Spectral Visualization for Proteomics Data with Emphasis on Glycoproteomics Analysis.","authors":"Zheng Fang, Mingming Dong, Hongqiang Qin, Mingliang Ye","doi":"10.1093/gpbjnl/qzae069","DOIUrl":"10.1093/gpbjnl/qzae069","url":null,"abstract":"<p><p>Identification evaluation and result dissemination are essential components in mass spectrometry-based proteomics analysis. The visualization of fragment ions in a mass spectrum provides strong evidence for peptide identification and modification localization. Here, we present an easy-to-use tool, named GP-Plotter, for ion annotation of tandem mass spectra and corresponding image output. Identification result files from common search tools in the community and user-customized files are supported as input to GP-Plotter. Multiple display modes and parameter customization are available in GP-Plotter to present annotated spectra of interest. Different image formats, especially vector graphic formats, are available for image generation, which is favorable for data publication. Notably, GP-Plotter is also well-suited for the visualization and evaluation of glycopeptide spectrum assignments with comprehensive annotation of glycan fragment ions. With a user-friendly graphical interface, GP-Plotter is expected to be a universal visualization tool for the community. 
GP-Plotter has been implemented in the latest version of Glyco-Decipher (v1.0.4) and the standalone GP-Plotter software is also freely available at https://github.com/DICP-1809.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11661977/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142396283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shenghan Gao, Yimeng Zhang, Stephen J Bush, Bo Wang, Xiaofei Yang, Kai Ye
{"title":"Centromere Landscapes Resolved from Hundreds of Human Genomes.","authors":"Shenghan Gao, Yimeng Zhang, Stephen J Bush, Bo Wang, Xiaofei Yang, Kai Ye","doi":"10.1093/gpbjnl/qzae071","DOIUrl":"10.1093/gpbjnl/qzae071","url":null,"abstract":"<p><p>High-fidelity (HiFi) sequencing has facilitated the assembly and analysis of the most repetitive region of the genome, the centromere. Nevertheless, our current understanding of human centromeres is based on a relatively small number of telomere-to-telomere assemblies, which have not yet captured its full diversity. In this study, we investigated the genomic diversity of human centromere higher order repeats (HORs) via both HiFi reads and haplotype-resolved assemblies from hundreds of samples drawn from ongoing pangenome-sequencing projects and reprocessed them via a novel HOR annotation pipeline, HiCAT-human. We used this wealth of data to provide a global survey of the centromeric HOR landscape; in particular, we found that 23 HORs presented significant copy number variability between populations. We detected three centromere genotypes with unbalanced population frequencies on chromosomes 5, 8, and 17. 
An inter-assembly comparison of HOR loci further revealed that while HOR array structures are diverse, they nevertheless tend to form a number of specific landscapes, each exhibiting different levels of HOR subunit expansion and possibly reflecting a cyclical evolutionary transition from homogeneous to nested structures and back.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11652271/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mengting Shao, Kaiyang Chen, Shuting Zhang, Min Tian, Yan Shen, Chen Cao, Ning Gu
{"title":"Multiome-wide Association Studies: Novel Approaches for Understanding Diseases.","authors":"Mengting Shao, Kaiyang Chen, Shuting Zhang, Min Tian, Yan Shen, Chen Cao, Ning Gu","doi":"10.1093/gpbjnl/qzae077","DOIUrl":"10.1093/gpbjnl/qzae077","url":null,"abstract":"<p><p>The rapid development of multiome (transcriptome, proteome, cistrome, imaging, and regulome)-wide association study methods has opened new avenues for biologists to understand the susceptibility genes underlying complex diseases. Thorough comparisons of these methods are essential for selecting the most appropriate tool for a given research objective. This review provides a detailed categorization and summary of the statistical models, use cases, and advantages of recent multiome-wide association studies. In addition, to illustrate gene-disease association studies based on transcriptome-wide association study (TWAS), we collected 478 disease entries across 22 categories from 235 manually reviewed publications. Our analysis reveals that mental disorders are the diseases most frequently studied by TWAS, indicating its potential to deepen our understanding of the genetic architecture of complex diseases. 
In summary, this review underscores the importance of multiome-wide association studies in elucidating complex diseases and highlights the significance of selecting the appropriate method for each study.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630051/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142549772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeOri 10.0: An Updated Database of Experimentally Identified Eukaryotic Replication Origins.","authors":"Yu-Hao Zeng, Zhen-Ning Yin, Hao Luo, Feng Gao","doi":"10.1093/gpbjnl/qzae076","DOIUrl":"10.1093/gpbjnl/qzae076","url":null,"abstract":"<p><p>DNA replication is a complex and crucial biological process in eukaryotes. To facilitate the study of eukaryotic replication events, we present a database of eukaryotic DNA replication origins (DeOri), which collects the genome-wide data on eukaryotic DNA replication origins that are currently available. With the rapid development of high-throughput experimental technology in recent years, the number of datasets in the new release of DeOri 10.0 increased from 10 to 151 and the number of sequences increased from 16,145 to 9,742,396. Besides nucleotide sequences and browser extensible data (BED) files, corresponding annotation files, such as coding sequences (CDSs), mRNAs, and other biological elements within replication origins, are also provided. The experimental techniques used for each dataset, as well as related statistical data, are also presented on the web page. Differences in experimental methods, cell lines, and sequencing technologies have resulted in distinct replication origins, making it challenging to differentiate between cell-specific and non-specific replication origins. Based on multiple replication origin datasets at the species level, we scored and screened replication origins in Homo sapiens, Gallus gallus, Mus musculus, Drosophila melanogaster, and Caenorhabditis elegans. The screened regions with high scores were considered species-conservative origins, which are integrated and presented as reference replication origins (rORIs). Additionally, we analyzed the distribution of relevant genomic elements associated with replication origins at the genome level, such as CpG island (CGI), transcription start site (TSS), and G-quadruplex (G4). 
These analysis results can be browsed and downloaded as needed at http://tubic.tju.edu.cn/deori/.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11652270/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qianpeng Li, Yang Zhang, Sicheng Luo, Zhang Zhang, Ann L Oberg, David E Kozono, Hua Lu, Jann N Sarkaria, Lina Ma, Liguo Wang
{"title":"Identify Non-mutational p53 Functional Deficiency in Human Cancers.","authors":"Qianpeng Li, Yang Zhang, Sicheng Luo, Zhang Zhang, Ann L Oberg, David E Kozono, Hua Lu, Jann N Sarkaria, Lina Ma, Liguo Wang","doi":"10.1093/gpbjnl/qzae064","DOIUrl":"10.1093/gpbjnl/qzae064","url":null,"abstract":"<p><p>An accurate assessment of p53's functional statuses is critical for cancer genomic medicine. However, there is a significant challenge in identifying tumors with non-mutational p53 inactivation, which is not detectable through DNA sequencing. These undetected cases are often misclassified as p53-normal, leading to inaccurate prognosis and downstream association analyses. To address this issue, we built support vector machine (SVM) models to systematically reassess p53's functional statuses in TP53 wild-type (TP53WT) tumors from multiple The Cancer Genome Atlas (TCGA) cohorts. Cross-validation demonstrated the good performance of the SVM models, with a mean area under the receiver operating characteristic curve (AUROC) of 0.9822, precision of 0.9747, and recall of 0.9784. Our study revealed that a significant proportion (87%-99%) of TP53WT tumors actually had compromised p53 function. Additional analyses uncovered that these genetically intact but functionally impaired (termed predictively reduced function of p53 or TP53WT-pRF) tumors exhibited genomic and pathophysiologic features akin to TP53-mutant tumors: heightened genomic instability and elevated levels of hypoxia. 
Clinically, patients with TP53WT-pRF tumors experienced significantly shortened overall survival or progression-free survival compared to those with predictively normal function of p53 (TP53WT-pN) tumors, and these patients also displayed increased sensitivity to platinum-based chemotherapy and radiation therapy.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11702981/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qingqing Shi, Min Dai, Yingke Ma, Jun Liu, Xiuying Liu, Xiu-Jie Wang
{"title":"DRED: A Comprehensive Database of Genes Related to Repeat Expansion Diseases.","authors":"Qingqing Shi, Min Dai, Yingke Ma, Jun Liu, Xiuying Liu, Xiu-Jie Wang","doi":"10.1093/gpbjnl/qzae068","DOIUrl":"10.1093/gpbjnl/qzae068","url":null,"abstract":"<p><p>Expansion of tandem repeats in genes often causes severe diseases, such as fragile X syndrome, Huntington's disease, and spinocerebellar ataxia. However, information on genes associated with repeat expansion diseases is scattered throughout the literature, and systematic prediction of potential genes that may cause diseases via repeat expansion is also lacking. Here, we develop DRED, a Database of genes related to Repeat Expansion Diseases, as a manually curated database that covers all 61 known genes related to repeat expansion diseases reported in PubMed and OMIM, along with the detailed repeat information for each gene. DRED also includes 516 genes with the potential to cause diseases via repeat expansion, which were predicted based on their repeat composition, genetic variations, genomic features, and disease associations. Various types of information on repeat expansion diseases and their corresponding genes/repeats are presented in DRED, together with links to external resources, such as NCBI and ClinVar. DRED provides user-friendly interfaces with comprehensive functions, and can serve as a central data resource for basic research and repeat expansion disease-related medical diagnosis. 
DRED is freely accessible at http://omicslab.genetics.ac.cn/dred, and will be frequently updated to include newly reported genes related to repeat expansion diseases.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11696699/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Variant Calling in Whole-exome Sequencing Data Using Population-matched Reference Genomes.","authors":"Shuming Guo, Zhuo Huang, Yanming Zhang, Yukun He, Xiangju Chen, Wenjuan Wang, Lansheng Li, Yu Kang, Zhancheng Gao, Jun Yu, Zhenglin Du, Yanan Chu","doi":"10.1093/gpbjnl/qzae070","DOIUrl":"10.1093/gpbjnl/qzae070","url":null,"abstract":"<p><p>Whole-exome sequencing (WES) data are frequently used for cancer diagnosis and genome-wide association studies (GWAS), based on high-coverage read mapping, informative variant calling, and high-quality reference genomes. The central position of the currently used genome assembly, GRCh38, is now challenged by two newly published telomere-to-telomere (T2T) genomes, T2T-CHM13 and T2T-YAO, and it has become urgent to conduct a comparative study that tests population specificity across the three reference genomes using real-case WES data. Here, we report our analysis along this line for 19 tumor samples collected from Chinese patients. The primary comparison of the exon regions among the three references reveals that the sequences in up to ∼ 1% of target regions in T2T-YAO are widely diversified from GRCh38 and may lead to off-target sequence capture. However, T2T-YAO still outperforms GRCh38 by obtaining 7.41% more mapped reads. Due to more reliable read mapping and a closer phylogenetic relationship with the samples than GRCh38, T2T-YAO reduces by half the variant calls of clinical significance, which are mostly benign, while maintaining sensitivity in identifying pathogenic variants. T2T-YAO also outperforms T2T-CHM13 in reducing calls of Chinese-specific variants. 
Our findings highlight the critical need for employing population-specific reference genomes in genomic analysis to ensure accurate variant analysis and the significant benefits of tailoring these approaches to the unique genetic background of each ethnic group.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11687947/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142396282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ping Xu, Zhiheng Yuan, Xiaohua Lu, Peng Zhou, Ding Qiu, Zhenghao Qiao, Zhongcheng Zhou, Li Guan, Yongkang Jia, Xuan He, Ling Sun, Youzhong Wan, Ming Wang, Yang Yu
{"title":"RAG-seq: NSR-primed and Transposase Tagmentation-mediated Strand-specific Total RNA Sequencing in Single Cells.","authors":"Ping Xu, Zhiheng Yuan, Xiaohua Lu, Peng Zhou, Ding Qiu, Zhenghao Qiao, Zhongcheng Zhou, Li Guan, Yongkang Jia, Xuan He, Ling Sun, Youzhong Wan, Ming Wang, Yang Yu","doi":"10.1093/gpbjnl/qzae072","DOIUrl":"10.1093/gpbjnl/qzae072","url":null,"abstract":"<p><p>Single-cell RNA sequencing (scRNA-seq) has transformed our understanding of cellular diversity with unprecedented resolution. However, many current methods are limited in capturing full-length transcripts and discerning strand orientation. Here, we present RAG-seq, an innovative strand-specific total RNA sequencing technique that combines not-so-random (NSR) primers with Tn5 transposase-mediated tagmentation. RAG-seq overcomes previous limitations by delivering comprehensive transcript coverage and maintaining strand orientation, which are essential for accurate quantification of overlapping genes and detection of antisense transcripts. Through optimized reverse transcription with oligo-dT primers, rRNA depletion via Depletion of Abundant Sequences by Hybridization (DASH), and linear amplification, RAG-seq enhances sensitivity and reproducibility, especially for low-input samples and single cells. Application to mouse oocytes and early embryos highlights RAG-seq's superior performance in identifying stage-specific antisense transcripts, shedding light on their regulatory roles during early development. 
This advancement represents a significant leap in transcriptome analysis within complex biological contexts.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11658833/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evolution of Plant Genome Size and Composition.","authors":"Bing He, Wanfei Liu, Jianyang Li, Siwei Xiong, Jing Jia, Qiang Lin, Hailin Liu, Peng Cui","doi":"10.1093/gpbjnl/qzae078","DOIUrl":"10.1093/gpbjnl/qzae078","url":null,"abstract":"<p><p>The rapid development of sequencing technology has led to an explosion of plant genome data, opening up more opportunities for research in the field of comparative evolutionary analysis of plant genomes. In this review, we focus on changes in plant genome size and composition, examining the effects of polyploidy, whole-genome duplication, and alterations in transposable elements on plant genome architecture and evolution. In addition, to address gaps in the available information, we also collected and analyzed data from 234 representative plant genomes as a supplement. We aim to provide a comprehensive, up-to-date summary of information on plant genome architecture and evolution in this review.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630846/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142585409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}