GigaScience Pub Date: 2025-01-06 DOI: 10.1093/gigascience/giae112
Yang Zhou, Jiazheng Jin, Xuemei Li, Gregory Gedman, Sarah Pelan, Arang Rhie, Chuan Jiang, Olivier Fedrigo, Kerstin Howe, Adam M Phillippy, Erich D Jarvis, Frank Grutzner, Qi Zhou, Guojie Zhang
{"title":"Chromosome-level echidna genome illuminates evolution of multiple sex chromosome system in monotremes.","authors":"Yang Zhou, Jiazheng Jin, Xuemei Li, Gregory Gedman, Sarah Pelan, Arang Rhie, Chuan Jiang, Olivier Fedrigo, Kerstin Howe, Adam M Phillippy, Erich D Jarvis, Frank Grutzner, Qi Zhou, Guojie Zhang","doi":"10.1093/gigascience/giae112","DOIUrl":"10.1093/gigascience/giae112","url":null,"abstract":"<p><strong>Background: </strong>A thorough analysis of genome evolution is fundamental for biodiversity understanding. The iconic monotremes (platypus and echidna) feature extraordinary biology. However, they also exhibit rearrangements in several chromosomes, especially in the sex chromosome chain. Therefore, the lack of a chromosome-level echidna genome has limited insights into genome evolution in monotremes, in particular the multiple sex chromosomes complex.</p><p><strong>Results: </strong>Here, we present a new long reads-based chromosome-level short-beaked echidna (Tachyglossus aculeatus) genome, which allowed the inference of chromosomal rearrangements in the monotreme ancestor (2n = 64) and each extant species. Analysis of the more complete sex chromosomes uncovered homology between 1 Y chromosome and multiple X chromosomes, suggesting that it is the ancestral X that has undergone reciprocal translocation with ancestral autosomes to form the complex. 
We also identified dozens of ampliconic genes on the sex chromosomes, with several ancestral ones expressed during male meiosis, suggesting selective constraints in pairing the multiple sex chromosomes.</p><p><strong>Conclusion: </strong>The new echidna genome provides an important basis for further study of the unique biology and conservation of this species.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11710854/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142947512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience Pub Date: 2025-01-06 DOI: 10.1093/gigascience/giaf054
Eugenie C Yen, James D Gilbert, Alice Balard, Albert Taxonera, Kirsten Fairweather, Heather L Ford, Doko-Miles J Thorburn, Stephen J Rossiter, José M Martín-Durán, Christophe Eizaguirre
{"title":"Chromosome-level genome assembly and methylome profile yield insights for the conservation of endangered loggerhead sea turtles.","authors":"Eugenie C Yen, James D Gilbert, Alice Balard, Albert Taxonera, Kirsten Fairweather, Heather L Ford, Doko-Miles J Thorburn, Stephen J Rossiter, José M Martín-Durán, Christophe Eizaguirre","doi":"10.1093/gigascience/giaf054","DOIUrl":"https://doi.org/10.1093/gigascience/giaf054","url":null,"abstract":"<p><strong>Background: </strong>Characterizing genetic and epigenetic diversity is crucial for assessing the adaptive potential of threatened populations and species in the face of climate change. Sea turtles are particularly vulnerable due to their temperature-dependent sex determination (TSD) system, which heightens the risk of extreme sex ratio bias and extinction under future climate scenarios. High-quality genomic and epigenomic resources will therefore support conservation efforts for these endangered flagship species with such plastic traits.</p><p><strong>Findings: </strong>We generated a chromosome-level genome assembly for the loggerhead sea turtle (Caretta caretta) from the globally important Cabo Verde rookery. Using Oxford Nanopore Technology (ONT) and Illumina reads followed by homology-guided scaffolding to the same species, we achieved a contiguous (N50: 129.7 Mbp) and complete (BUSCO: 97.1%) assembly, with 98.9% of the genome scaffolded into 28 chromosomes and 33,887 annotated genes. We also extracted the blood methylome profile from our ONT reads, which was confirmed to be representative of the reference population via whole-genome bisulfite sequencing of 10 additional loggerheads from the same population. Applying our novel resources, we revealed high conservation of synteny between sea turtle species, reconstructed population size fluctuations in line with major climatic events, and identified microchromosomes as key regions for monitoring genetic diversity and epigenetic flexibility. 
Isolating 199 TSD-linked genes, we further built a large network of functional protein associations and blood-based methylation patterns.</p><p><strong>Conclusions: </strong>We present a high-quality loggerhead sea turtle genome and methylome from the globally significant East Atlantic population. By leveraging ONT sequencing, we generate genomic and epigenomic resources simultaneously and showcase the potential of this approach for driving molecular insights for conservation of endangered sea turtles.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12143204/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144247442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unveiling patterns in spatial transcriptomics data: a novel approach utilizing graph attention autoencoder and multiscale deep subspace clustering network.","authors":"Liqian Zhou, Xinhuai Peng, Min Chen, Xianzhi He, Geng Tian, Jialiang Yang, Lihong Peng","doi":"10.1093/gigascience/giae103","DOIUrl":"10.1093/gigascience/giae103","url":null,"abstract":"<p><strong>Background: </strong>The accurate deciphering of spatial domains, along with the identification of differentially expressed genes and the inference of cellular trajectory based on spatial transcriptomic (ST) data, holds significant potential for enhancing our understanding of tissue organization and biological functions. However, most spatial clustering methods can neither decipher complex structures in ST data nor entirely employ features embedded in different layers.</p><p><strong>Results: </strong>This article introduces STMSGAL, a novel framework for analyzing ST data by incorporating graph attention autoencoder and multiscale deep subspace clustering. First, STMSGAL constructs ctaSNN, a cell type-aware shared nearest neighbor graph, using Louvain clustering exclusively based on gene expression profiles. Subsequently, it integrates expression profiles and ctaSNN to generate spot latent representations using a graph attention autoencoder and multiscale deep subspace clustering. Lastly, STMSGAL implements spatial clustering, differential expression analysis, and trajectory inference, providing comprehensive capabilities for thorough data exploration and interpretation. STMSGAL was evaluated against 7 methods, including SCANPY, SEDR, CCST, DeepST, GraphST, STAGATE, and SiGra, using four 10x Genomics Visium datasets, 1 mouse visual cortex STARmap dataset, and 2 Stereo-seq mouse embryo datasets. The comparison showcased STMSGAL's remarkable performance across Davies-Bouldin, Calinski-Harabasz, S_Dbw, and ARI values. 
STMSGAL significantly enhanced the identification of layer structures across ST data with different spatial resolutions and accurately delineated spatial domains in 2 breast cancer tissues, adult mouse brain (FFPE), and mouse embryos.</p><p><strong>Conclusions: </strong>STMSGAL can serve as an essential tool for bridging the analysis of cellular spatial organization and disease pathology, offering valuable insights for researchers in the field.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11727722/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142978066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience Pub Date: 2025-01-06 DOI: 10.1093/gigascience/giaf063
Zhi Song, Dehan Cai, Yanni Sun, Lusheng Wang
{"title":"PVGA: a precise viral genome assembler using an iterative alignment graph.","authors":"Zhi Song, Dehan Cai, Yanni Sun, Lusheng Wang","doi":"10.1093/gigascience/giaf063","DOIUrl":"10.1093/gigascience/giaf063","url":null,"abstract":"<p><strong>Background: </strong>Viral genome analysis is crucial for understanding virus evolution and mutation. Investigations into viral evolutionary dynamics and mutation patterns have garnered significant research attention since the outbreak of COVID-19. The basic structure of many virus genomes is highly conserved [1]. RNA viruses have high mutation rates, and single-nucleotide variations may induce substantial phenotypic alterations in terms of viral function and pathogenicity. Thus, special assembly methods are required for viral genome analysis.</p><p><strong>Result: </strong>PVGA starts with a reference genome and the sequencing reads. The first step in PVGA involves constructing an alignment graph based on a reference genome and the set of input sequencing reads. Then the optimal genomic path is determined through dynamic programming, maximizing the cumulative edge weights that reflect read support density across the alignment graph. The obtained path corresponds to a refined genome. Finally, we repeat the process by using the new reference genomes until no further improvement is possible. We evaluate PVGA's performance across both assembly and polishing tasks using simulated and real datasets, including both long reads and short reads. The experiments demonstrate that PVGA always outperforms popular existing programs in terms of the quality of assembly results, while the running time of our method is comparable to others. In particular, simulated Nanopore datasets show that our method can correctly report the true genomes with 0 mismatches and 0 indels.</p><p><strong>Conclusions: </strong>PVGA is a novel viral genome assembler that seamlessly integrates assembly and polishing into a unified workflow. 
Its design prioritizes high accuracy, enabling the detection of subtle genomic variations that can impact viral function and pathogenicity. By addressing the unique challenges of viral genome assembly, PVGA provides a reliable and precise solution for advancing our understanding of viral evolution and behavior.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12206156/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144474818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience Pub Date: 2025-01-06 DOI: 10.1093/gigascience/giaf062
Ilya B Slizovskiy, Tara N Gaire, Peter M Ferm, Carissa A Odland, Scott A Dee, Joel Nerem, Jonathan E Bravo, Alejandro D Kimball, Christina Boucher, Noelle R Noyes
{"title":"Reducing skin microbiome exposure impacts through swine farm biosecurity.","authors":"Ilya B Slizovskiy, Tara N Gaire, Peter M Ferm, Carissa A Odland, Scott A Dee, Joel Nerem, Jonathan E Bravo, Alejandro D Kimball, Christina Boucher, Noelle R Noyes","doi":"10.1093/gigascience/giaf062","DOIUrl":"10.1093/gigascience/giaf062","url":null,"abstract":"<p><strong>Background: </strong>Livestock work is unique due to worker exposure to animal-associated microbiomes within the workplace. Swine workers are a unique cohort within the US livestock labor force, as they have direct daily contact with pigs and undertake mandatory biosecurity interventions. However, investigating this occupational cohort is challenging, particularly within tightly regulated commercial swine operations. Thus, little is known about the impacts of animal exposure and biosecurity protocols on the swine worker microbiome. We obtained unique samples from US swine workers, using a longitudinal study design to investigate temporal microbiome dynamics.</p><p><strong>Results: </strong>We observed a significant increase in bacterial DNA load on worker skin during the workday, with concurrent changes in the composition and abundance of microbial taxa, resistance genes, and mobile genetic elements. 
However, mandatory showering at the end of the workday partially returned the skin's microbiome and resistome to their original state.</p><p><strong>Conclusions: </strong>These novel results from a human cohort demonstrate that existing biosecurity practices can ameliorate work-associated microbiome impacts.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144729616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience Pub Date: 2025-01-06 DOI: 10.1093/gigascience/giaf073
Chao Bian, Rujingwen Huan, Qiong Shi
{"title":"Telomere-to-telomere chromosome-scale genome assemblies of black and golden koi carp variants support construction of an ancient karyotype of Cypriniformes.","authors":"Chao Bian, Rujingwen Huan, Qiong Shi","doi":"10.1093/gigascience/giaf073","DOIUrl":"10.1093/gigascience/giaf073","url":null,"abstract":"<p><strong>Background: </strong>Koi carp, a variant of the common carp, is one of the most popular ornamental fish. Its genomic resources can help us better understand chromosome evolution and color phenotypes in cyprinid fish.</p><p><strong>Results: </strong>We constructed telomere-to-telomere chromosome-level genome assemblies for 2 koi carp variants (black and golden) by integrating MGI, PacBio HiFi, ONT, and Hi-C sequencing technologies. Haplotypic genomes comprised 50 chromosomes with 100 and 99 telomeres, respectively, with BUSCO results showing at least 98.8% completeness. We annotated a total of 55,023 and 54,569 protein-coding genes for black and golden koi carps, respectively, with over 96% assigned functional roles. Repetitive sequences occupy an estimated 636 Mb (41%) of the genomes. 
With phylogenetic analysis, we predict the koi carp variants to have split 5.3 million years ago, and we constructed an ancient karyotype of 25 ancestral chromosomes to reveal 9 major chromosomal rearrangements.</p><p><strong>Conclusions: </strong>Our study offers genome assemblies capable of predicting an ancient karyotype of Cypriniformes, with genomic resources available for in-depth investigations into diverse skin coloration in koi and other cypriniforms.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144729617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience Pub Date: 2024-04-22 DOI: 10.1093/gigascience/giae018
Hengchuang Yin, Shufang Wu, Jie Tan, Qian Guo, Mo Li, Jinyuan Guo, Yaqi Wang, Xiaoqing Jiang, Huaiqiu Zhu
{"title":"IPEV: identification of prokaryotic and eukaryotic virus-derived sequences in virome using deep learning","authors":"Hengchuang Yin, Shufang Wu, Jie Tan, Qian Guo, Mo Li, Jinyuan Guo, Yaqi Wang, Xiaoqing Jiang, Huaiqiu Zhu","doi":"10.1093/gigascience/giae018","DOIUrl":"https://doi.org/10.1093/gigascience/giae018","url":null,"abstract":"Background The virome obtained through virus-like particle enrichment contains a mixture of prokaryotic and eukaryotic virus-derived fragments. Accurate identification and classification of these elements are crucial to understanding their roles and functions in microbial communities. However, the rapid mutation rates of viral genomes pose challenges in developing high-performance tools for classification, potentially limiting downstream analyses. Findings We present IPEV, a novel method to distinguish prokaryotic and eukaryotic viruses in viromes, with a 2-dimensional convolutional neural network combining trinucleotide pair relative distance and frequency. Cross-validation assessments of IPEV demonstrate its state-of-the-art precision, significantly improving the F1-score by approximately 22% on an independent test set compared to existing methods when query viruses share less than 30% sequence similarity with known viruses. Furthermore, IPEV outperforms other methods in accuracy on marine and gut virome samples based on annotations by sequence alignments. IPEV reduces runtime by at most 1,225 times compared to existing methods under the same computing configuration. We also utilized IPEV to analyze longitudinal samples and found that the gut virome exhibits a higher degree of temporal stability than previously observed in persistent personal viromes, providing novel insights into the resilience of the gut virome in individuals. Conclusions IPEV is a high-performance, user-friendly tool that assists biologists in identifying and classifying prokaryotic and eukaryotic viruses within viromes. 
The tool is available at https://github.com/basehc/IPEV.","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"46 1","pages":""},"PeriodicalIF":9.2,"publicationDate":"2024-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140804753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience Pub Date: 2024-04-22 DOI: 10.1093/gigascience/giae017
Yiyan Yang, Keith Dufault-Thompson, Wei Yan, Tian Cai, Lei Xie, Xiaofang Jiang
{"title":"Large-scale genomic survey with deep learning-based method reveals strain-level phage specificity determinants","authors":"Yiyan Yang, Keith Dufault-Thompson, Wei Yan, Tian Cai, Lei Xie, Xiaofang Jiang","doi":"10.1093/gigascience/giae017","DOIUrl":"https://doi.org/10.1093/gigascience/giae017","url":null,"abstract":"Background Phage therapy, reemerging as a promising approach to counter antimicrobial-resistant infections, relies on a comprehensive understanding of the specificity of individual phages. Yet the significant diversity within phage populations presents a considerable challenge. Currently, there is a notable lack of tools designed for large-scale characterization of phage receptor-binding proteins, which are crucial in determining the phage host range. Results In this study, we present SpikeHunter, a deep learning method based on the ESM-2 protein language model. With SpikeHunter, we identified 231,965 diverse phage-encoded tailspike proteins, a crucial determinant of phage specificity that targets bacterial polysaccharide receptors, across 787,566 bacterial genomes from 5 virulent, antibiotic-resistant pathogens. Notably, 86.60% (143,200) of these proteins exhibited strong associations with specific bacterial polysaccharides. We discovered that phages with identical tailspike proteins can infect different bacterial species with similar polysaccharide receptors, underscoring the pivotal role of tailspike proteins in determining host range. The specificity is mainly attributed to the protein’s C-terminal domain, which strictly correlates with host specificity during domain swapping in tailspike proteins. Importantly, our dataset-driven predictions of phage–host specificity closely match the phage–host pairs observed in real-world phage therapy cases we studied. Conclusions Our research provides a rich resource, including both the method and a database derived from a large-scale genomics survey. 
This substantially enhances understanding of phage specificity determinants at the strain level and offers a valuable framework for guiding phage selection in therapeutic applications.","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"41 1","pages":""},"PeriodicalIF":9.2,"publicationDate":"2024-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140804820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An effective strategy for assembling the sex-limited chromosome","authors":"Xiao-Bo Wang, Hong-Wei Lu, Qing-You Liu, A-Lun Li, Hong-Ling Zhou, Yong Zhang, Tian-Qi Zhu, Jue Ruan","doi":"10.1093/gigascience/giae015","DOIUrl":"https://doi.org/10.1093/gigascience/giae015","url":null,"abstract":"Background Most currently available reference genomes lack the sequence map of sex-limited (such as Y and W) chromosomes, which results in incomplete assemblies that hinder further research on sex chromosomes. Recent advancements in long-read sequencing and population sequencing have provided the opportunity to assemble sex-limited chromosomes without the traditional complicated experimental efforts. Findings We introduce the first computational method, Sorting long Reads of Y or other sex-limited chromosome (SRY), which achieves improved assembly results compared to flow sorting. Specifically, SRY outperforms in the heterochromatic region and demonstrates comparable performance in other regions. Furthermore, SRY enhances the capabilities of the hybrid assembly software, resulting in improved continuity and accuracy. Conclusions Our method enables true complete genome assembly and facilitates downstream research of sex-limited chromosomes.","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"21 1","pages":""},"PeriodicalIF":9.2,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140613961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience Pub Date: 2024-04-16 DOI: 10.1093/gigascience/giae019
Hamid Beiki, Brenda M Murdoch, Carissa A Park, Chandlar Kern, Denise Kontechy, Gabrielle Becker, Gonzalo Rincon, Honglin Jiang, Huaijun Zhou, Jacob Thorne, James E Koltes, Jennifer J Michal, Kimberly Davenport, Monique Rijnkels, Pablo J Ross, Rui Hu, Sarah Corum, Stephanie McKay, Timothy P L Smith, Wansheng Liu, Wenzhi Ma, Xiaohui Zhang, Xiaoqing Xu, Xuelei Han, Zhihua Jiang, Zhi-Liang Hu, James M Reecy
{"title":"Enhanced bovine genome annotation through integration of transcriptomics and epi-transcriptomics datasets facilitates genomic biology","authors":"Hamid Beiki, Brenda M Murdoch, Carissa A Park, Chandlar Kern, Denise Kontechy, Gabrielle Becker, Gonzalo Rincon, Honglin Jiang, Huaijun Zhou, Jacob Thorne, James E Koltes, Jennifer J Michal, Kimberly Davenport, Monique Rijnkels, Pablo J Ross, Rui Hu, Sarah Corum, Stephanie McKay, Timothy P L Smith, Wansheng Liu, Wenzhi Ma, Xiaohui Zhang, Xiaoqing Xu, Xuelei Han, Zhihua Jiang, Zhi-Liang Hu, James M Reecy","doi":"10.1093/gigascience/giae019","DOIUrl":"https://doi.org/10.1093/gigascience/giae019","url":null,"abstract":"Background The accurate identification of the functional elements in the bovine genome is a fundamental requirement for high-quality analysis of data informing both genome biology and genomic selection. Functional annotation of the bovine genome was performed to identify a more complete catalog of transcript isoforms across bovine tissues. Results A total of 160,820 unique transcripts (50% protein coding) representing 34,882 unique genes (60% protein coding) were identified across tissues. Among them, 118,563 transcripts (73% of the total) were structurally validated by independent datasets (PacBio isoform sequencing data, Oxford Nanopore Technologies sequencing data, de novo assembled transcripts from RNA sequencing data) and comparison with Ensembl and NCBI gene sets. In addition, all transcripts were supported by extensive data from different technologies such as whole transcriptome termini site sequencing, RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression, chromatin immunoprecipitation sequencing, and assay for transposase-accessible chromatin using sequencing. A large proportion of identified transcripts (69%) were unannotated, of which 86% were produced by annotated genes and 14% by unannotated genes. A median of two 5′ untranslated regions were expressed per gene. 
Around 50% of protein-coding genes in each tissue were bifunctional and transcribed both coding and noncoding isoforms. Furthermore, we identified 3,744 genes that functioned as noncoding genes in fetal tissues but as protein-coding genes in adult tissues. Our new bovine genome annotation extended more than 11,000 annotated gene borders compared to Ensembl or NCBI annotations. The resulting bovine transcriptome was integrated with publicly available quantitative trait loci data to study tissue–tissue interconnection involved in different traits and construct the first bovine trait similarity network. Conclusions These validated results show significant improvement over current bovine genome annotations.","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"19 1","pages":""},"PeriodicalIF":9.2,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140614176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}