{"title":"CircAge: A Comprehensive Resource for Aging-associated Circular RNAs Across Species and Tissues.","authors":"Xin Dong, Zhen Zhou, Yanan Wang, Ayesha Nisar, Shaoyan Pu, Longbao Lv, Yijiang Li, Xuemei Lu, Yonghan He","doi":"10.1093/gpbjnl/qzaf044","DOIUrl":"10.1093/gpbjnl/qzaf044","url":null,"abstract":"<p><p>Circular RNAs (circRNAs) represent a novel class of RNA molecules characterized by a circular structure and enhanced stability. Emerging evidence indicates that circRNAs play pivotal regulatory roles in the aging process. However, a systematic resource that integrates aging-associated circRNA data remains lacking. Therefore, we developed a comprehensive database, CircAge, which encompasses 756 aging-related samples from 7 species and 24 tissue types. Through high-throughput sequencing, we also generated 47 new tissue samples from mice and rhesus monkeys. By integrating predictions from multiple bioinformatics tools, we identified over 529,856 unique circRNAs. Our data analysis revealed a general increase in circRNA expression levels with age, with approximately 23% of circRNAs demonstrating sequence conservation across species. The CircAge database systematically predicts potential interactions between circRNAs, microRNAs (miRNAs), and RNA-binding proteins (RBPs), and assesses the coding potential of circRNAs. This resource lays a foundation for elucidating the regulatory mechanisms of circRNAs in aging. As a comprehensive repository of aging-associated circRNAs, CircAge will significantly accelerate research in this field, facilitating the discovery of novel biomarkers and therapeutic targets for aging biology and supporting the development of diagnostic and therapeutic strategies for aging and age-related diseases. 
CircAge is publicly available at https://circage.kiz.ac.cn.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12448220/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144033265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Haplotype-based Pangenomics: A Blueprint for Climate Adaptation in Plants.","authors":"Wanfei Liu 刘万飞, Peng Cui 崔鹏","doi":"10.1093/gpbjnl/qzaf023","DOIUrl":"10.1093/gpbjnl/qzaf023","url":null,"abstract":"","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12380446/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143588968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"clusIBD: Robust Detection of Identity-by-descent Segments Using Unphased Genetic Data from Poor-quality Samples.","authors":"Ran Li 李燃, Yu Zang 臧钰, Zhentang Liu 刘震棠, Jingyi Yang 杨静怡, Nana Wang 汪娜娜, Jiajun Liu 刘佳俊, Enlin Wu 吴恩霖, Riga Wu 乌日嘎, Hongyu Sun 孙宏钰","doi":"10.1093/gpbjnl/qzaf055","DOIUrl":"10.1093/gpbjnl/qzaf055","url":null,"abstract":"<p><p>The detection of identity-by-descent (IBD) segments is widely used to infer relatedness in many fields, including forensics and ancient DNA analysis. However, existing methods are often ineffective for poor-quality DNA samples. Here, we propose a method, clusIBD, which can robustly detect IBD segments using unphased genetic data with a high rate of genotyping error. We evaluated and compared the performance of clusIBD with that of IBIS, TRUFFLE, and IBDseq using simulated data, artificial poor-quality materials, and ancient DNA samples. The results show that clusIBD outperforms these existing tools and could be used for kinship inference in fields such as ancient DNA analysis and criminal investigation. clusIBD is publicly available at GitHub (https://github.com/Ryan620/clusIBD/) and BioCode (https://ngdc.cncb.ac.cn/biocode/tool/BT007882).</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12449261/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144512858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PSIA: A Comprehensive Knowledgebase of Plant Self-incompatibility.","authors":"Chen Wang 王晨, Hong Zhao 赵洪, Hongkui Zhang 张洪魁, Sijie Sun 孙思杰, Yongbiao Xue 薛勇彪","doi":"10.1093/gpbjnl/qzaf046","DOIUrl":"10.1093/gpbjnl/qzaf046","url":null,"abstract":"<p><p>Self-incompatibility (SI) is an important genetic mechanism in angiosperms that prevents inbreeding and promotes outcrossing, with significant implications for crop breeding, including genetic diversity, hybrid seed production, and yield optimization. In eudicots, SI is typically governed by a single S-locus containing tightly linked pistil and pollen S-determinant genes. Despite major advances in SI research, a centralized, comprehensive resource for SI-related genomic data remains lacking. To address this gap, we developed the Plant Self-Incompatibility Atlas (PSIA), a systematically curated knowledgebase providing an extensive compilation of plant SI, including genomic resources for SI species, S gene annotations, molecular mechanisms, phylogenetic relationships, and comparative genomic analyses. The current release of PSIA includes over 500 genome assemblies from 469 SI species. Using known S genes as queries, we manually identified and rigorously curated 3700 S genes. PSIA provides detailed S-locus information from assembled genomes of SI species and offers an interactive platform for browsing, BLAST searches, S gene analysis, and data retrieval. Additionally, PSIA serves as a unique platform for comparative genomic studies of S-loci, facilitating exploration of the dynamic processes underlying the origin, loss, and regain of SI. As a comprehensive and user-friendly resource, PSIA will greatly advance our understanding of angiosperm SI and serve as a valuable tool for crop breeding and hybrid seed production. 
PSIA is freely available at http://www.plantsi.cn.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12396629/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144113067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EPSD 2.0: An Updated Database of Protein Phosphorylation Sites Across Eukaryotic Species.","authors":"Miaomiao Chen, Yujie Gou, Ming Lei, Leming Xiao, Miaoying Zhao, Xinhe Huang, Dan Liu, Zihao Feng, Di Peng, Yu Xue","doi":"10.1093/gpbjnl/qzaf057","DOIUrl":"10.1093/gpbjnl/qzaf057","url":null,"abstract":"<p><p>As one of the most crucial post-translational modifications, protein phosphorylation regulates a broad range of biological processes in eukaryotes. Biocuration, integration, and annotation of reported phosphorylation events will deliver a valuable resource for the community. Here, we present an updated database, the eukaryotic phosphorylation site database 2.0 (EPSD 2.0), which includes 2,769,163 experimentally identified phosphorylation sites (p-sites) in 362,707 phosphoproteins from 223 eukaryotes. From the literature, 873,718 new p-sites identified through high-throughput phosphoproteomic research were first collected, and 1,078,888 original phosphopeptides together with primary references were reserved. Then, this dataset was merged into EPSD 1.0, comprising 1,616,804 p-sites within 209,326 proteins across 68 eukaryotic organisms. We also integrated 362,190 additional known p-sites from 10 public databases. After redundancy clearance, we manually re-checked each p-site and annotated 88,074 functional events for 32,762 p-sites, covering 58 types of downstream effects on phosphoproteins, and regulatory impacts on 107 biological processes. In addition, phosphoproteins and p-sites in 8 model organisms were meticulously annotated utilizing information supplied by 100 external platforms encompassing 15 areas. 
These areas included kinase/phosphatase, transcription regulators, three-dimensional structures, physicochemical characteristics, genomic variations, functional descriptions, protein domains, molecular interactions, drug-target associations, disease-related data, orthologs, transcript expression levels, proteomics, subcellular localization, and regulatory pathways. We expect that EPSD 2.0 will become a useful database supporting comprehensive studies on phosphorylation in eukaryotes. The EPSD 2.0 database is freely accessible online at https://epsd.biocuckoo.cn/.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12448286/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144531957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GCRP: Integrated Global Chicken Reference Panel from 11,951 Chicken Genomes.","authors":"Di Zhu, Yuzhan Wang, Hao Qu, Chungang Feng, Hui Zhang, Zheya Sheng, Yunliang Jiang, Qinghua Nie, Suqiao Chu, Dingming Shu, Ziqin Jiang, Dexiang Zhang, Lingzhao Fang, Hui Li, Zhenqiang Xu, Yiqiang Zhao, Yuzhe Wang, Xiaoxiang Hu","doi":"10.1093/gpbjnl/qzaf032","DOIUrl":"10.1093/gpbjnl/qzaf032","url":null,"abstract":"<p><p>Chickens are a crucial source of protein for humans and a popular model animal for bird research. Despite the emergence of imputation as a reliable genotyping strategy for large populations, the lack of a high-quality chicken reference panel has hindered progress in chicken genome research. To address this, here we introduce the first phase of the 100K Global Chicken Reference Panel (100K GCRP). Currently, two panels are available: a comprehensive mix panel (CMP) for domestication diversity research and a commercial breed panel (CBP) for breeding broilers specifically. Evaluation of genotype imputation quality showed that CMP had the highest imputation accuracy compared to imputation using existing chicken panels in Animal-SNPAtlas and Animal Genotype Imputation Database (AGIDB), whereas CBP performed stably in the imputation of commercial populations. Additionally, we found that genome-wide association studies using GCRP-imputed data, whether on simulated or real phenotypes, exhibited greater statistical power. In conclusion, our study indicates that the GCRP effectively fills the gap in high-quality reference panels for chickens, providing an effective imputation platform for future genetic and breeding research. 
The project includes 11,951 samples and provides services for various applications on its website at http://farmrefpanel.com/GCRP/#/.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12458076/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144033324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CRISPR Technology and Its Emerging Applications.","authors":"Xuejing Zhang, Dongyuan Ma, Feng Liu","doi":"10.1093/gpbjnl/qzaf034","DOIUrl":"10.1093/gpbjnl/qzaf034","url":null,"abstract":"<p><p>The discovery and iteration of clustered regularly interspaced short palindromic repeats (CRISPR) systems have revolutionized genome editing due to their remarkable efficiency and easy programmability, enabling precise manipulation of genomic elements. Owing to these unique advantages, CRISPR technology has the transformative potential to elucidate biological mechanisms and develop clinical treatments. This review provides a comprehensive overview of the development and applications of CRISPR technology. After describing the three primary CRISPR-Cas systems - CRISPR-associated protein 9 (Cas9) and Cas12a targeting DNA, and Cas13 targeting RNA - which serve as the cornerstone for technological advancements, we describe a series of novel CRISPR-Cas systems that offer new avenues for research, and then explore the applications of CRISPR technology in large-scale genetic screening, lineage tracing, genetic diagnosis, and gene therapy. As this technology evolves, it holds significant promise for studying gene functions and treating human diseases in the near future.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12449060/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144033323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"scPlantLLM: A Foundation Model for Exploring Single-cell Expression Atlases in Plants.","authors":"Guangshuo Cao 曹广硕, Haoyu Chao 晁好瑜, Wenqi Zheng, Yangming Lan, Kaiyan Lu, Yueyi Wang, Ming Chen 陈铭, He Zhang 张和, Dijun Chen 陈迪俊","doi":"10.1093/gpbjnl/qzaf024","DOIUrl":"10.1093/gpbjnl/qzaf024","url":null,"abstract":"<p><p>Single-cell RNA sequencing (scRNA-seq) provides unprecedented insights into plant cellular diversity by enabling high-resolution analyses of gene expression at the single-cell level. However, the complexity of scRNA-seq data, including challenges in batch integration, cell type annotation, and gene regulatory network (GRN) inference, demands advanced computational approaches. To address these challenges, we developed scPlantLLM, a Transformer model trained on millions of plant single-cell data points. Using a sequential pretraining strategy incorporating masked language modeling and cell type annotation tasks, scPlantLLM generates robust and interpretable single-cell data embeddings. When applied to Arabidopsis thaliana datasets, scPlantLLM excels in clustering, cell type annotation, and batch integration, achieving an accuracy of up to 0.91 in zero-shot learning scenarios. Furthermore, the model demonstrates an ability to identify biologically meaningful GRNs and subtle cellular subtypes, showcasing its potential to advance plant biology research. Compared to traditional methods, scPlantLLM outperforms in key metrics such as adjusted rand index (ARI), normalized mutual information (NMI), and silhouette score (SIL), highlighting its superior clustering accuracy and biological relevance. scPlantLLM represents a foundation model for exploring plant single-cell expression atlases, offering unprecedented capabilities to resolve cellular heterogeneity and regulatory dynamics across diverse plant systems. 
The code used in this study is available at https://github.com/compbioNJU/scPlantLLM.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12417071/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143652870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"rMATS-cloud: Large-scale Alternative Splicing Analysis in the Cloud.","authors":"Jenea I Adams, Eric Kutschera, Qiang Hu, Chun-Jie Liu, Qian Liu, Kathryn Kadash-Edmondson, Song Liu, Yi Xing","doi":"10.1093/gpbjnl/qzaf036","DOIUrl":"10.1093/gpbjnl/qzaf036","url":null,"abstract":"<p><p>Although gene expression analysis pipelines are often a standard part of bioinformatics analysis, with many publicly available cloud workflows, cloud-based alternative splicing analysis tools remain limited. Our lab released rMATS in 2014 and has continuously maintained it, providing a fast and versatile solution for quantifying alternative splicing from RNA sequencing (RNA-seq) data. Here, we present rMATS-cloud, a portable version of the rMATS workflow that can be run in virtually any cloud environment suited for biomedical research. We compared the time and cost of running rMATS-cloud with two RNA-seq datasets on three different platforms (Cavatica, Terra, and Seqera). Our findings demonstrate that rMATS-cloud handles RNA-seq datasets with thousands of samples, and therefore is ideally suited for the storage capacities of many cloud data repositories. 
rMATS-cloud is available at https://dockstore.org/workflows/github.com/Xinglab/rmats-turbo/rmats-turbo-cwl, https://dockstore.org/workflows/github.com/Xinglab/rmats-turbo/rmats-turbo-wdl, and https://dockstore.org/workflows/github.com/Xinglab/rmats-turbo/rmats-turbo-nextflow.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12248417/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144015485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Foundation Model: A New Era for Plant Single-cell Genomics.","authors":"Yuansong Zeng 曾远松, Yuedong Yang 杨跃东","doi":"10.1093/gpbjnl/qzaf059","DOIUrl":"10.1093/gpbjnl/qzaf059","url":null,"abstract":"","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12380448/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144487498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}