{"title":"High-quality Population-specific Haplotype-resolved Reference Panel in the Genomic and Pangenomic Eras.","authors":"Qingxin Yang, Yuntao Sun, Shuhan Duan, Shengjie Nie, Chao Liu, Hong Deng, Mengge Wang, Guanglin He","doi":"10.1093/gpbjnl/qzaf022","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf022","url":null,"abstract":"<p><p>Large-scale international and regional human genomic and pangenomic resources derived from population-scale biobanks and ancient DNA sequences have provided significant insights into human evolution and the genetic determinants of complex diseases and traits. Despite these advances, challenges persist in optimizing the integration of phasing tools, merging haplotype reference panels (HRPs), developing imputation algorithms, and fully exploiting the diverse applications of post-imputation data. This review comprehensively summarizes the advancements, applications, limitations, and future directions of HRPs in human genomics research. Recent progress in the reconstruction of HRPs, based on over 830,000 human whole-genome sequences, has been synthesized, highlighting the broad spectrum of human genetic diversity captured. Additionally, we recapitulate advancements in fifty-six HRPs for global and regional populations. The evaluation of imputation accuracy indicated that Beagle and GLIMPSE are the most effective tools for phasing and imputing data from genotyping arrays and low-coverage sequencing, respectively. A critical strategy for selecting an appropriate HRP involves matching the population background of target groups with HRP reference populations and considering multi-ancestry or homogeneous genetic structures. 
The necessity of a single, integrative, high-quality HRP that captures haplotype structures and genetic diversity across various genetic variation types from globally representative populations is emphasized to support both modern and ancient genomic research and advance human precision medicine.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143588971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HumanTestisDB: A Comprehensive Atlas of Testicular Transcriptomics and Cellular Interactions.","authors":"Mengjie Wang, Laihua Li, Qing Cheng, Hao Zhang, Zhaode Liu, Yiqiang Cui, Jiahao Sha, Yan Yuan","doi":"10.1093/gpbjnl/qzaf015","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf015","url":null,"abstract":"<p><p>Advances in single-cell technology have enabled the detailed mapping of testicular cell transcriptomes, which is essential for understanding spermatogenesis. However, the fragmented nature of age-specific data from various literature sources has hindered comprehensive analysis. To overcome this, the Human Testis Database (HumanTestisDB) was developed, consolidating multiple human testicular sequencing datasets to address this limitation. Through extensive investigation, 38 unique cell types were identified, providing a detailed perspective on cellular variety. Furthermore, the database systematically categorizes samples into eight developmental stages, offering a structured framework to comprehend the temporal dynamics of testicular development. Each stage features comprehensive maps of cell-cell interactions, elucidating the complex communication network inside the testicular microenvironment at particular developmental stages. Moreover, by facilitating comparisons of interactions among various cell types at different stages, the database permits examining alterations that transpire during critical transitions in spermatogenesis. 
HumanTestisDB, available at https://shalab.njmu.edu.cn/humantestisdb, offers vital insights into testicular transcriptomics and interactions, serving as an essential resource for advancing research in reproductive biology.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143569060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Biological Data Resources and Machine Learning Frameworks for Hematology Research.","authors":"Ying Yi, Yongfei Hu, Juanjuan Kang, Qifa Liu, Yan Huang, Dong Wang","doi":"10.1093/gpbjnl/qzaf021","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf021","url":null,"abstract":"<p><p>Hematology research has greatly benefited from the integration of diverse biological data resources and advanced machine learning frameworks. This integration has not only deepened our understanding of blood diseases such as leukemia and lymphoma, but also enhanced diagnostic accuracy and personalized treatment strategies. By applying machine learning algorithms to analyze large-scale biological data, researchers are able to more effectively identify disease patterns, predict treatment responses, and provide new perspectives for the diagnosis and treatment of hematologic disorders. Here, we provide an overview of the current landscape of biological data resources and the application of machine learning frameworks pertinent to hematology research.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143560398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Single-cell Atlas of Developing Mouse Palates Reveals Cellular and Molecular Transitions in Periderm Cell Fate.","authors":"Wenbin Huang, Zhenwei Qian, Jieni Zhang, Yi Ding, Bin Wang, Jiuxiang Lin, Xiannian Zhang, Huaxiang Zhao, Feng Chen","doi":"10.1093/gpbjnl/qzaf013","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf013","url":null,"abstract":"<p><p>Cleft palate is one of the most common congenital craniofacial disorders that affects children's appearance and oral functions. Investigating the transcriptomics during palatogenesis is crucial for comprehending the etiology of this disorder and facilitating prenatal molecular diagnosis. However, there is limited knowledge about the single-cell differentiation dynamics during mid-palatogenesis and late-palatogenesis, specifically regarding the subpopulations and developmental trajectories of periderm, a rare but critical cell population. Here we explored the single-cell landscape of mouse developing palates from embryonic day (E) 10.5 to E16.5. We systematically depicted the single-cell transcriptomics of mesenchymal and epithelial cells during palatogenesis, including subpopulations and differentiation dynamics. Additionally, we identified four subclusters of palatal periderm and constructed two distinct trajectories of cell fates for periderm cells. Our findings reveal that claudin-family coding genes and Arhgap29 play a role in the non-stick function of the periderm before the palatal shelves contact, and Pitx2 mediates the adhesion of periderm during the contact of opposing palatal shelves. Furthermore, we demonstrated that epithelial-mesenchymal transition (EMT), apoptosis, and migration collectively contribute to the degeneration of periderm cells in the medial epithelial seam. 
Taken together, our study suggests a novel model of periderm development during palatogenesis and delineates the cellular and molecular transitions in periderm cell determination.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143560327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PhaSeDis: A Manually Curated Database of Phase Separation-Disease Associations and Corresponding Small Molecules.","authors":"Taoyu Chen, Guoguo Tang, Tianhao Li, Zhining Yanghong, Chao Hou, Zezhou Du, Kaiqiang You, Liwei Ma, Tingting Li","doi":"10.1093/gpbjnl/qzaf014","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf014","url":null,"abstract":"<p><p>Biomacromolecules form membraneless organelles through liquid-liquid phase separation in order to regulate the efficiency of particular biochemical reactions. Dysregulation of phase separation might result in pathological condensation or sequestration of biomolecules, leading to diseases. Thus, phase separation and phase separating factors may serve as drug targets for disease treatment. Nevertheless, such associations have not yet been integrated into phase separation related databases. Therefore, based on MloDisDB, a database for membraneless organelle factor-disease association previously developed by our lab, we constructed PhaSeDis, the phase separation-disease association database. We increased the number of phase separation entries from 52 to 185, and supplemented the evidence provided by the original article verifying the phase separation nature of the factors. Moreover, we included the information of interacting small molecules with low-throughput or high-throughput evidence that might serve as potential drugs for phase separation entries. PhaSeDis strives to offer comprehensive descriptions of each entry, elucidating how phase separating factors induce pathological conditions via phase separation and the mechanisms by which small molecules intervene. We believe that PhaSeDis would be very important in the application of phase separation regulation in treating related diseases. 
PhaSeDis is available at http://mlodis.phasep.pro.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143560320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Developmental Gene Expression Atlas Reveals Novel Biological Basis of Complex Phenotypes in Sheep.","authors":"Bingru Zhao, Hanpeng Luo, Xuefeng Fu, Guoming Zhang, Emily L Clark, Feng Wang, Brian Paul Dalrymple, V Hutton Oddy, Philip E Vercoe, Cuiling Wu, George E Liu, Cong-Jun Li, Ruidong Xiang, Kechuan Tian, Yanli Zhang, Lingzhao Fang","doi":"10.1093/gpbjnl/qzaf020","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf020","url":null,"abstract":"<p><p>Sheep (Ovis aries) represents one of the most important livestock species for animal protein and wool production worldwide. However, little is known about the genetic and biological basis of ovine phenotypes, particularly for those of high economic value and environmental impact. Here, by integrating 1413 RNA-seq samples from 51 distinct tissues across 14 developmental time points, representing early prenatal, late prenatal, neonate, lamb, juvenile, adult, and elderly stages, we built a high-resolution developmental Gene Expression Atlas (dGEA) in sheep. We observed dynamic patterns of gene expression and regulatory networks across tissues and developmental stages. When harnessing this resource for interpreting genetic associations of 48 monogenetic and 12 complex traits in sheep, we found that genes upregulated at prenatal developmental stages played more important roles in shaping these phenotypes than those upregulated at postnatal stages. For instance, genetic associations of crimp number, mean staple length (MSL), and individual birth weight were significantly enriched in the prenatal rather than postnatal skin and immune tissues. By comprehensively integrating GWAS fine-mapping results and the sheep dGEA, we proposed several candidate genes for complex traits in sheep, such as SOX9 for MSL, GNRHR for litter size at birth, and PRKDC for live weight. These results provide novel insights into the developmental and molecular architecture underlying ovine phenotypes. 
The dGEA (https://sheepdgea.njau.edu.cn/) will serve as an invaluable resource for sheep developmental biology, genetics, genomics, and selective breeding.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143560395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PIGOME: An Integrated and Comprehensive Multi-omics Database for Pig Functional Genomics Studies.","authors":"Guohao Han, Peng Yang, Yongjin Zhang, Qiaowei Li, Xinhao Fan, Ruipu Chen, Chao Yan, Mu Zeng, Yalan Yang, Zhonglin Tang","doi":"10.1093/gpbjnl/qzaf016","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf016","url":null,"abstract":"<p><p>In addition to being a major source of animal protein, pigs are an important model for the study of development and diseases in humans. During the past two decades, thousands of high-throughput sequencing studies in pigs have been performed using a variety of tissues from different breeds and developmental stages. However, the multi-omics database specifically used for pig functional genomic research is still limited. Here, we present a user-friendly database of pig multi-omics named PIGOME. PIGOME currently contains seven types of pig omics datasets, including whole-genome sequencing (WGS), RNA sequencing (RNA-seq), microRNA sequencing (miRNA-seq), chromatin immunoprecipitation sequencing (ChIP-seq), assay for transposase-accessible chromatin sequencing (ATAC-seq), bisulfite sequencing (BS-seq), and methylated RNA immunoprecipitation sequencing (MeRIP-seq), from 6901 samples and 392 projects with manually curated metadata, integrated gene annotation, and quantitative trait locus information. Furthermore, various \"Explore\" and \"Browse\" functions have been established for user-friendly access to omics information. PIGOME implemented several tools to visualize genomic variants, gene expression, and epigenetic signals of a given gene in the pig genome, enabling efficient exploration of spatial-temporal gene expression/epigenetic pattern, function, regulatory mechanism, and associated economic traits. Collectively, PIGOME provides valuable resources for pig breeding and is helpful for human biomedical research. 
PIGOME is available at https://pigome.com.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143560323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LigExtract: Large-scale Automated Identification of Ligands from Protein Structures in the Protein Data Bank.","authors":"Natália Aniceto, Nuno Martinho, Ismael Rufino, Rita C Guedes","doi":"10.1093/gpbjnl/qzaf018","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf018","url":null,"abstract":"<p><p>The Protein Data Bank is an ever-growing database of 3D macromolecular structures that has become a crucial resource for the drug discovery process. Exploring complexed proteins and accessing the ligands in these proteins is paramount to help researchers understand biological processes and design new compounds of pharmaceutical interest. However, currently available tools to perform large-scale ligand identification do not address many of the more complex ways in which ligands are stored and represented in PDB structures. Therefore, a new tool called LigExtract was specifically developed for the large-scale processing of PDB structures and the identification of their ligands. This is a fully open-source tool available to the scientific community, designed to provide end-to-end processing whereby the user simply provides a list of UniProt IDs and LigExtract returns a list of ligands, their individual PDB files, a PDB file of the protein chains engaged with the ligand and a series of log files that inform the user of the decisions made during the ligand extraction process as well as potential flagging of additional scenarios that might have to be considered during any follow-up use of the processed files (e.g., ligands covalently bound to the protein). 
LigExtract is available, open-source, on GitHub (https://github.com/comp-medchem/LigExtract).</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143560407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges in AI-driven Biomedical Multimodal Data Fusion and Analysis.","authors":"Junwei Liu, Xiaoping Cen, Chenxin Yi, Feng-Ao Wang, Junxiang Ding, Jinyu Cheng, Qinhua Wu, Baowen Gai, Yiwen Zhou, Ruikun He, Feng Gao, Yixue Li","doi":"10.1093/gpbjnl/qzaf011","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf011","url":null,"abstract":"<p><p>The rapid development of biological and medical examination methods has vastly expanded personal biomedical information, including molecular, cellular, image, and electronic health record datasets. Integrating this wealth of information enables precise disease diagnosis, biomarker identification, and treatment design in clinical settings. Artificial intelligence (AI) techniques, particularly deep learning models, have been extensively employed in biomedical applications, demonstrating increased precision, efficiency, and generalization. The success of the large language and vision models further significantly extends their biomedical applications. However, challenges remain in learning these multimodal biomedical datasets, such as data privacy, fusion, and model interpretation. In this review, we provided a comprehensive overview of various biomedical data modalities, multi-modal representation learning methods, and the applications of AI in biomedical data integrative analysis. Additionally, we discussed the challenges in applying these deep learning methods and how to better integrate them into biomedical scenarios. 
We then proposed future directions for adapting deep learning methods with model pre-training and knowledge integration to advance biomedical research and benefit their clinical applications.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143560401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}