Cinta Pegueroles, Carles Galià-Camps, Marta Pascual, Marta Bassitta, Didac González, Carola Greve, Enrique Macpherson, Núria Raventós, Tilman Schell, Héctor Torrado, Carlos Carreras
Scientific Data, vol. 12, no. 1, p. 576. Published 2025-04-04. DOI: 10.1038/s41597-025-04902-3 (https://doi.org/10.1038/s41597-025-04902-3)
Chromosome-level genome assembly and annotation of the sharpsnout seabream (Diplodus puntazzo).
Diplodus puntazzo is a demersal fish that inhabits the Mediterranean Sea and the eastern Atlantic and plays an important ecological role in coastal areas. Here, we present the first nuclear genome assembly and annotation for this species and genus. We combined PacBio CLR long reads, Illumina short reads, and chromatin conformation capture reads (Omni-C) to generate a chromosome-level assembly. The nuclear genome assembly spans 788 Mb and contains 24 chromosome-scale scaffolds (98.76% of the total length), matching the known karyotype. Using RNA-Seq data from D. puntazzo and gene models from closely related species, we also generated a high-quality nuclear annotation. We predicted 87,572 transcripts from the nuclear genome: 26,838 coding and 60,734 non-coding, the latter including lncRNAs, snoRNAs, and tRNAs. We also assembled and annotated the mitochondrial genome, a circular sequence of 16,642 bp comprising 13 protein-coding genes, 2 rRNAs, and 22 tRNAs. This high-quality reference genome will enrich the genomic resources available to the large fish research community.
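The figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a reader-side sanity check, not part of the authors' pipeline: the constant names and the `summarize` helper are hypothetical, and the derived spans are approximate because the 788 Mb total is itself a rounded value.

```python
# Reader-side consistency check of the numbers quoted in the abstract.
# All names here are illustrative; none come from the paper or its tooling.

TOTAL_SPAN_MB = 788            # nuclear assembly span (rounded in the abstract)
CHROMOSOME_FRACTION = 0.9876   # share of the span in chromosome-scale scaffolds
N_CHROMOSOMES = 24             # chromosome-scale scaffolds

TOTAL_TRANSCRIPTS = 87_572
CODING_TRANSCRIPTS = 26_838
NONCODING_TRANSCRIPTS = 60_734

MITO_PROTEIN_CODING = 13
MITO_RRNA = 2
MITO_TRNA = 22


def summarize() -> dict:
    """Derive approximate spans and verify that the reported counts add up."""
    chromosome_span = TOTAL_SPAN_MB * CHROMOSOME_FRACTION
    return {
        # Approximate span anchored to chromosomes, in Mb.
        "chromosome_span_mb": round(chromosome_span, 1),
        # Approximate span left in unplaced scaffolds, in Mb.
        "unplaced_span_mb": round(TOTAL_SPAN_MB - chromosome_span, 1),
        # Mean chromosome-scale scaffold length, in Mb.
        "mean_chromosome_mb": round(chromosome_span / N_CHROMOSOMES, 1),
        # Coding plus non-coding should equal the reported total.
        "transcripts_consistent": (
            CODING_TRANSCRIPTS + NONCODING_TRANSCRIPTS == TOTAL_TRANSCRIPTS
        ),
        # Total annotated mitochondrial genes.
        "mito_genes": MITO_PROTEIN_CODING + MITO_RRNA + MITO_TRNA,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

The coding and non-coding counts sum exactly to the reported total, and the mitochondrial annotation amounts to 37 genes (13 protein-coding, 2 rRNA, 22 tRNA).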
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than to test hypotheses or present new interpretations, methods, or in-depth analyses.