{"title":"COCOA: A Framework for Fine-scale Mapping of Cell-type-specific Chromatin Compartments Using Epigenomic Information.","authors":"Kai Li, Ping Zhang, Jinsheng Xu, Zi Wen, Junying Zhang, Zhike Zi, Li Li","doi":"10.1093/gpbjnl/qzae091","DOIUrl":"10.1093/gpbjnl/qzae091","url":null,"abstract":"<p><p>Chromatin compartmentalization and epigenomic modifications play crucial roles in cell differentiation and disease development. However, precise mapping of chromatin compartment patterns requires Hi-C or Micro-C data at high sequencing depth. Exploring the systematic relationship between epigenomic modifications and compartment patterns remains challenging. To address these issues, we present COCOA, a deep neural network framework using convolution and attention mechanisms to infer fine-scale chromatin compartment patterns from six histone modification signals. COCOA extracts 1D track features through bidirectional feature reconstruction after resolution-specific binning of epigenomic signals. These track features are then cross-fused with contact features using an attention mechanism and transformed into chromatin compartment patterns through residual feature reduction. COCOA demonstrates accurate inference of chromatin compartmentalization at a fine-scale resolution and exhibits stable performance on test sets. Additionally, we explored the impact of histone modifications on chromatin compartmentalization prediction through in silico epigenomic perturbation experiments. Unlike obscure compartments observed in high-depth experimental data at 1-kb resolution, COCOA generates clear and detailed compartment patterns, highlighting its superior performance. Finally, we demonstrate that COCOA enables cell-type-specific prediction of unrevealed chromatin compartment patterns in various biological processes, making it an effective tool for gaining insights into chromatin compartmentalization from epigenomics in diverse biological scenarios. 
The COCOA Python code is publicly available at https://github.com/onlybugs/COCOA and https://ngdc.cncb.ac.cn/biocode/tools/BT007498.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11993304/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Harnessing Type II Cytokines to Reinvigorate Exhausted T Cells for Durable Cancer Immunotherapy.","authors":"Wenle Zhang, Yanwen Wang, Bin Li","doi":"10.1093/gpbjnl/qzae093","DOIUrl":"10.1093/gpbjnl/qzae093","url":null,"abstract":"","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11760936/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pangenome Reveals Gene Content Variations and Structural Variants Contributing to Pig Characteristics.","authors":"Heng Du, Yue Zhuo, Shiyu Lu, Wanying Li, Lei Zhou, Feizhou Sun, Gang Liu, Jian-Feng Liu","doi":"10.1093/gpbjnl/qzae081","DOIUrl":"10.1093/gpbjnl/qzae081","url":null,"abstract":"<p><p>Pigs are one of the most essential sources of high-quality proteins in human diets. Structural variants (SVs) are a major source of genetic variants associated with diverse traits and evolutionary events. However, the current linear reference genome of pigs restricts the accurate presentation of position information for SVs. In this study, we generated a pangenome of pigs and a genome variation map of 599 deeply sequenced genomes across Eurasia. Additionally, we established a section-wide gene repertoire, revealing that core genes are more evolutionarily conserved than variable genes. Furthermore, we identified 546,137 SVs, their enrichment regions, and relationships with genomic features and found significant divergence across Eurasian pigs. More importantly, the pangenome-detected SVs could complement heritability estimates and genome-wide association studies based only on single nucleotide polymorphisms. Among the SVs shaped by selection, we identified an insertion in the promoter region of the TBX19 gene, which may be related to the development, growth, and timidity traits of Asian pigs and may affect the gene expression. 
The constructed pig pangenome and the identified SVs in this study provide rich resources for future functional genomic research on pigs.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12017589/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142635075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of T Cell Receptor Construction Methods from scRNA-Seq Data.","authors":"Ruonan Tian, Zhejian Yu, Ziwei Xue, Jiaxin Wu, Lize Wu, Shuo Cai, Bing Gao, Bing He, Yu Zhao, Jianhua Yao, Linrong Lu, Wanlu Liu","doi":"10.1093/gpbjnl/qzae086","DOIUrl":"10.1093/gpbjnl/qzae086","url":null,"abstract":"<p><p>T cell receptors (TCRs) serve key roles in the adaptive immune system by enabling recognition and response to pathogens and irregular cells. Various methods have been developed for TCR construction from single-cell RNA sequencing (scRNA-seq) datasets, each with its unique characteristics. Yet, a comprehensive evaluation of their relative performance under different conditions remains elusive. In this study, we conducted a benchmark analysis utilizing experimental single-cell immune profiling datasets. Additionally, we introduced a novel simulator, YASIM-scTCR (Yet Another SIMulator for single-cell TCR), capable of generating scTCR-seq reads containing diverse TCR-derived sequences with different sequencing depths and read lengths. Our results consistently showed that TRUST4 and MiXCR outperformed others across multiple datasets, while DeRR demonstrated considerable accuracy. We also discovered that the sequencing depth inherently imposes a critical constraint on successful TCR construction from scRNA-seq data. 
In summary, we present a benchmark study to aid researchers in choosing the appropriate method for reconstructing TCRs from scRNA-seq data.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11846667/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142820279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SCREEN: A Graph-based Contrastive Learning Tool to Infer Catalytic Residues and Assess Enzyme Mutations.","authors":"Tong Pan, Yue Bi, Xiaoyu Wang, Ying Zhang, Geoffrey I Webb, Robin B Gasser, Lukasz Kurgan, Jiangning Song","doi":"10.1093/gpbjnl/qzae094","DOIUrl":"10.1093/gpbjnl/qzae094","url":null,"abstract":"<p><p>The accurate identification of catalytic residues contributes to our understanding of enzyme functions in biological processes and pathways. The increasing number of protein sequences necessitates computational tools for the automated prediction of catalytic residues in enzymes. Here, we introduce SCREEN, a graph neural network for the high-throughput prediction of catalytic residues via the integration of enzyme functional and structural information. SCREEN constructs residue representations based on spatial arrangements and incorporates enzyme function priors into such representations through contrastive learning. We demonstrate that SCREEN (1) consistently outperforms currently-available predictors; (2) provides accurate results when applied to inferred enzyme structures; and (3) generalizes well to enzymes dissimilar from those in the training set. We also show that the putative catalytic residues predicted by SCREEN mimic key structural and biophysical characteristics of native catalytic residues. Moreover, using experimental datasets, we show that SCREEN's predictions can be used to distinguish residues with a high mutation tolerance from those likely to cause functional loss when mutated, indicating that this tool might be used to infer disease-associated mutations. 
SCREEN is publicly available at https://github.com/BioColLab/SCREEN and https://ngdc.cncb.ac.cn/biocode/tool/7580.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11961199/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ProtPipe: A Multifunctional Data Analysis Pipeline for Proteomics and Peptidomics.","authors":"Ziyi Li, Cory A Weller, Syed Shah, Nicholas L Johnson, Ying Hao, Paige B Jarreau, Jessica Roberts, Deyaan Guha, Colleen Bereda, Sydney Klaisner, Pedro Machado, Matteo Zanovello, Mercedes Prudencio, Björn Oskarsson, Nathan P Staff, Dennis W Dickson, Pietro Fratta, Leonard Petrucelli, Priyanka Narayan, Mark R Cookson, Michael E Ward, Andrew B Singleton, Mike A Nalls, Yue A Qi","doi":"10.1093/gpbjnl/qzae083","DOIUrl":"10.1093/gpbjnl/qzae083","url":null,"abstract":"<p><p>Mass spectrometry (MS) is a technique widely employed for the identification and characterization of proteins, with personalized medicine, systems biology, and biomedical applications. The application of MS-based proteomics advances our understanding of protein function, cellular signaling, and complex biological systems. MS data analysis is a critical process that includes identifying and quantifying proteins and peptides and then exploring their biological functions in downstream analyses. To address the complexities associated with MS data analysis, we developed ProtPipe to streamline and automate the processing and analysis of high-throughput proteomics and peptidomics datasets with DIA-NN preinstalled. The pipeline facilitates data quality control, sample filtering, and normalization, ensuring robust and reliable downstream analyses. ProtPipe provides downstream analyses, including protein and peptide differential abundance identification, pathway enrichment analysis, protein-protein interaction analysis, and major histocompatibility complex (MHC)-peptide binding affinity analysis. ProtPipe generates annotated tables and visualizations by performing statistical post-processing and calculating fold changes between predefined pairwise conditions in an experimental design. 
It is an open-source, well-documented tool available at https://github.com/NIH-CARD/ProtPipe, with a user-friendly web interface.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11842048/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142690171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HemaCisDB: An Interactive Database for Analyzing Cis-Regulatory Elements Across Hematopoietic Malignancies.","authors":"Xinping Cai, Qianru Zhang, Bolin Liu, Lu Sun, Yuxuan Liu","doi":"10.1093/gpbjnl/qzae088","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzae088","url":null,"abstract":"<p><p>Noncoding cis-regulatory elements (CREs), such as transcriptional enhancers, are key regulators of gene expression programs. Accessible chromatin and H3K27ac are well-recognized markers for CREs associated with their biological function. Deregulation of CREs is commonly found in hematopoietic malignancies, yet the extent to which CRE dysfunction contributes to pathophysiology remains incompletely understood. Here, we developed HemaCisDB, an interactive, comprehensive, and centralized online resource for CRE characterization across hematopoietic malignancies, serving as a useful resource for investigating the pathological roles of CREs in blood disorders. To date, we have collected 922 ATAC-seq, 190 DNase-seq, and 531 H3K27ac ChIP-seq datasets from patient samples and cell lines across different myeloid and lymphoid neoplasms. HemaCisDB provides comprehensive quality control metrics to assess ATAC-seq, DNase-seq, and H3K27ac ChIP-seq data quality. The analytic modules in HemaCisDB include transcription factor (TF) footprinting inference, super-enhancer identification, and core transcriptional regulatory circuitry analysis. Moreover, HemaCisDB enables the study of TF binding dynamics by comparing TF footprints across different disease types or conditions via web-based interactive analysis. Together, HemaCisDB provides an interactive platform for CRE characterization to facilitate mechanistic studies of transcriptional regulation in hematopoietic malignancies. 
HemaCisDB is available at https://hemacisdb.chinablood.com.cn/.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Virus Infection Induces Immune Gene Activation with CTCF-anchored Enhancers and Chromatin Interactions in Pig Genome.","authors":"Jianhua Cao, Ruimin Ren, Xiaolong Li, Xiaoqian Zhang, Yan Sun, Xiaohuan Tian, Ru Liu, Xiangdong Liu, Yijun Ruan, Guoliang Li, Shuhong Zhao","doi":"10.1093/gpbjnl/qzae062","DOIUrl":"10.1093/gpbjnl/qzae062","url":null,"abstract":"<p><p>Chromatin organization is important for gene transcription in the pig genome. However, its three-dimensional (3D) structure and dynamics are much less investigated than those in humans. Here, we applied the long-read chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) method to map the whole-genome chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) in porcine macrophage cells before and after polyinosinic-polycytidylic acid [Poly(I:C)] induction. Our results reveal that Poly(I:C) induction impacts the 3D genome organization in the 3D4/21 cells at the fine-scale chromatin loop level rather than at the large-scale domain level. Furthermore, our findings underscore the pivotal role of CTCF-anchored chromatin interactions in reshaping chromatin architecture during immune responses. Knockout of the CTCF-binding locus further confirms that the CTCF-anchored enhancers are associated with the activation of immune genes via long-range interactions. Notably, the ChIA-PET data also support the spatial relationship between single nucleotide polymorphisms (SNPs) and related gene transcription in the context of the 3D genome. 
Our findings in this study provide new clues and potential targets to explore key elements related to diseases in pigs and are also likely to shed light on elucidating chromatin organization and dynamics underlying the process of mammalian infectious diseases.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11725346/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bioinformatic Resources for Exploring Human-virus Protein-protein Interactions Based on Binding Modes.","authors":"Huimin Chen, Jiaxin Liu, Gege Tang, Gefei Hao, Guangfu Yang","doi":"10.1093/gpbjnl/qzae075","DOIUrl":"10.1093/gpbjnl/qzae075","url":null,"abstract":"<p><p>Historically, there have been many outbreaks of viral diseases that have continued to claim millions of lives. Research on human-virus protein-protein interactions (PPIs) is vital to understanding the principles of human-virus relationships, providing an essential foundation for developing virus control strategies to combat diseases. The rapidly accumulating data on human-virus PPIs offer unprecedented opportunities for bioinformatics research around human-virus PPIs. However, available detailed analyses and summaries to help use these resources systematically and efficiently are lacking. Here, we comprehensively review the bioinformatic resources used in human-virus PPI research, and discuss and compare their functions, performance, and limitations. This review aims to provide researchers with a bioinformatic toolbox that will hopefully better facilitate the exploration of human-virus PPIs based on binding modes.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11658832/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MitoSort: Robust Demultiplexing of Pooled Single-cell Genomic Data Using Endogenous Mitochondrial Variants.","authors":"Zhongjie Tang, Weixing Zhang, Peiyu Shi, Sijun Li, Xinhui Li, Yueming Li, Yicong Xu, Yaqing Shu, Zheng Hu, Jin Xu","doi":"10.1093/gpbjnl/qzae073","DOIUrl":"10.1093/gpbjnl/qzae073","url":null,"abstract":"<p><p>Multiplexing across donors has emerged as a popular strategy to increase throughput, reduce costs, overcome technical batch effects, and improve doublet detection in single-cell genomic studies. To eliminate additional experimental steps, endogenous nuclear genome variants are used for demultiplexing pooled single-cell RNA sequencing (scRNA-seq) data by several computational tools. However, these tools have limitations when applied to single-cell sequencing methods that do not cover nuclear genomic regions well, such as single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq). Here, we demonstrate that mitochondrial germline variants are an alternative, robust, and computationally efficient endogenous barcode for sample demultiplexing. We propose MitoSort, a tool that uses mitochondrial germline variants to assign cells to their donor origins and identify cross-genotype doublets in single-cell genomic datasets. We evaluate its performance by using in silico pooled mitochondrial scATAC-seq (mtscATAC-seq) libraries and experimentally multiplexed data with cell hashtags. MitoSort achieves high accuracy and efficiency in genotype clustering and doublet detection for mtscATAC-seq data, addressing the limitations of current computational techniques tailored for scRNA-seq data. Moreover, MitoSort exhibits versatility, and can be applied to various single-cell sequencing approaches beyond mtscATAC-seq provided that the mitochondrial variants are reliably detected. 
Furthermore, we demonstrate the application of MitoSort in a case study where B cells from eight donors were pooled and assayed by single-cell multi-omics sequencing. Altogether, our results demonstrate the accuracy and efficiency of MitoSort, which enables reliable sample demultiplexing in various single-cell genomic applications. MitoSort is available at https://github.com/tangzhj/MitoSort.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11671100/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}