Wellcome Open Research. Pub Date: 2024-10-15. eCollection Date: 2024-01-01. DOI: 10.12688/wellcomeopenres.22949.1
Mark Blaxter, Joana Pauperio, Conrad Schoch, Kerstin Howe
{"title":"Taxonomy Identifiers (TaxId) for Biodiversity Genomics: a guide to getting TaxId for submission of data to public databases.","authors":"Mark Blaxter, Joana Pauperio, Conrad Schoch, Kerstin Howe","doi":"10.12688/wellcomeopenres.22949.1","DOIUrl":"https://doi.org/10.12688/wellcomeopenres.22949.1","url":null,"abstract":"<p><p>Biodiversity genomics critically depends on correct taxonomic identification of the sample from which data are derived, and on tracking that taxonomic information through the systems that archive data and report on genome sequencing efforts. For submission of data to the International Nucleotide Sequence Database Collaboration (INSDC) databases (DNA DataBank of Japan [DDBJ], European Nucleotide Archive [ENA] and National Center for Biotechnology Information [NCBI]), samples and data derived from them must be assigned a species-level NCBI Taxonomy taxonomic identifier (TaxId, sometimes referred to as taxId or txid). We thus need to be able to identify the TaxId for a target species efficiently. Because the NCBI Taxonomy does not include all known species and cannot preemptively represent unknown taxa, we also need an efficient process for generating new TaxIds for species not yet listed. This document provides workflows for different kinds of TaxId acquisition scenarios and was created to guide users in these processes. Although developed for European projects such as Darwin Tree of Life and the European Reference Genome Atlas, the workflows are universally applicable and describe the use of ENA in resolving taxonomic issues. 
Too Long: Didn't Read (TL;DR): Use the ENA REST API programmatically to retrieve TaxIds for target species and confirm that sequence data can be submitted to those TaxIds. Use the NCBI Web interface to NCBI Taxonomy to identify potential homotypic synonyms. Request a new TaxId from ENA for a species not yet in NCBI Taxonomy, and for species-like entries for which the full Linnaean binomen is not determined (see https://ena-docs.readthedocs.io/en/latest/faq/taxonomy_requests.html#creating-taxon-requests). Discuss directly with the NCBI Taxonomy curators or the curators at ENA and NCBI whenever you think there is an opportunity to improve their database.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"9 ","pages":"591"},"PeriodicalIF":0.0,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11544195/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142628811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
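The first TL;DR recommendation in the abstract above (retrieving a TaxId through the ENA REST API and checking that data can be submitted to it) might be sketched as follows. This is a minimal illustration, not the authors' workflow: the endpoint path and the response field names (`taxId`, `scientificName`, `submittable`) are assumptions based on the public ENA taxonomy service and should be checked against the current ENA documentation, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the ENA taxonomy lookup-by-scientific-name endpoint.
ENA_TAXONOMY_BASE = "https://www.ebi.ac.uk/ena/taxonomy/rest/scientific-name/"


def build_lookup_url(scientific_name):
    """Return the lookup URL for a scientific name, percent-encoding spaces."""
    return ENA_TAXONOMY_BASE + quote(scientific_name.strip())


def parse_taxonomy_response(payload):
    """Extract (tax_id, scientific_name, submittable) tuples from a JSON payload.

    The field names are assumptions; 'submittable' is accepted either as a
    JSON boolean or as the string "true"/"false".
    """
    results = []
    for record in json.loads(payload):
        tax_id = record.get("taxId")
        submittable = str(record.get("submittable", "")).lower() == "true"
        results.append((
            None if tax_id is None else str(tax_id),
            record.get("scientificName"),
            submittable,
        ))
    return results


def lookup_taxid(scientific_name, timeout=30):
    """Query the service over the network and return the parsed records."""
    with urlopen(build_lookup_url(scientific_name), timeout=timeout) as response:
        return parse_taxonomy_response(response.read().decode("utf-8"))
```

Calling `lookup_taxid("Noctua janthina")` would issue one HTTP request. An empty or failed result is the cue to check for synonyms or to request a new TaxId, as the abstract describes. The parsing is kept separate from the network call so that it can be checked offline.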
Wellcome Open Research. Pub Date: 2024-10-15. eCollection Date: 2024-01-01. DOI: 10.12688/wellcomeopenres.23192.1
Gavin R Broad
{"title":"The genome sequence of Langmaid's Yellow Underwing moth, <i>Noctua janthina</i> (Denis & Schiffermüller) 1775.","authors":"Gavin R Broad","doi":"10.12688/wellcomeopenres.23192.1","DOIUrl":"10.12688/wellcomeopenres.23192.1","url":null,"abstract":"<p><p>We present a genome assembly from an individual male <i>Noctua janthina</i> (Langmaid's Yellow Underwing; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence has a total length of 539.70 megabases. Most of the assembly (99.99%) is scaffolded into 31 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 15.36 kilobases in length. Gene annotation of this assembly on Ensembl identified 12,089 protein-coding genes.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"9 ","pages":"592"},"PeriodicalIF":0.0,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569387/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142650535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wellcome Open Research. Pub Date: 2024-10-15. eCollection Date: 2024-01-01. DOI: 10.12688/wellcomeopenres.23160.1
Maarten J M Christenhusz, Claudia A Martin
{"title":"The genome sequence of lesser burdock, <i>Arctium minus</i> (Hill) Bernh. (Asteraceae).","authors":"Maarten J M Christenhusz, Claudia A Martin","doi":"10.12688/wellcomeopenres.23160.1","DOIUrl":"10.12688/wellcomeopenres.23160.1","url":null,"abstract":"<p><p>We present a genome assembly of a diploid specimen of <i>Arctium minus</i> (lesser burdock; Tracheophyta; Magnoliopsida; Asterales; Asteraceae). The genome sequence is 1,903.1 megabases in span. Most of the assembly is scaffolded into 18 chromosomal pseudomolecules. The mitochondrial and plastid genome assemblies have lengths of 312.58 kilobases and 152.71 kilobases, respectively. Gene annotation of this assembly on Ensembl identified 27,734 protein-coding genes.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"9 ","pages":"589"},"PeriodicalIF":0.0,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11574341/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142677221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wellcome Open Research. Pub Date: 2024-10-15. eCollection Date: 2023-01-01. DOI: 10.12688/wellcomeopenres.19480.2
Jamie C Weir, Douglas Boyes
{"title":"The genome sequence of the Vapourer moth, <i>Orgyia antiqua</i> (Linnaeus, 1758).","authors":"Jamie C Weir, Douglas Boyes","doi":"10.12688/wellcomeopenres.19480.2","DOIUrl":"10.12688/wellcomeopenres.19480.2","url":null,"abstract":"<p><p>We present a genome assembly from an individual male <i>Orgyia antiqua</i> specimen (the Vapourer moth; Arthropoda; Insecta; Lepidoptera; Erebidae). The genome sequence is 480.1 megabases in span. Most of the assembly is scaffolded into 14 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 15.4 kilobases in length. Gene annotation of this assembly on Ensembl identified 12,475 protein-coding genes.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"8 ","pages":"314"},"PeriodicalIF":0.0,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11502997/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142509134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wellcome Open Research. Pub Date: 2024-10-15. eCollection Date: 2024-01-01. DOI: 10.12688/wellcomeopenres.22779.2
Michelle F O'Brien, Rosa Lopez Colom
{"title":"The genome sequence of the Long-tailed duck, <i>Clangula hyemalis</i> (Linnaeus, 1758).","authors":"Michelle F O'Brien, Rosa Lopez Colom","doi":"10.12688/wellcomeopenres.22779.2","DOIUrl":"10.12688/wellcomeopenres.22779.2","url":null,"abstract":"<p><p>We present a genome assembly from an individual male <i>Clangula hyemalis</i> (the Long-tailed duck; Chordata; Aves; Anseriformes; Anatidae). The genome sequence spans 1,206.10 megabases. Most of the assembly is scaffolded into 41 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 16.63 kilobases in length.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"9 ","pages":"475"},"PeriodicalIF":0.0,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11599803/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142740602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wellcome Open Research. Pub Date: 2024-10-15. eCollection Date: 2024-01-01. DOI: 10.12688/wellcomeopenres.23195.1
Roger Booth
{"title":"The genome sequence of a leaf beetle, <i>Galeruca laticollis</i> Sahlberg, C.R., 1838.","authors":"Roger Booth","doi":"10.12688/wellcomeopenres.23195.1","DOIUrl":"10.12688/wellcomeopenres.23195.1","url":null,"abstract":"<p><p>We present a genome assembly from an individual leaf beetle, <i>Galeruca laticollis</i> (Arthropoda; Insecta; Coleoptera; Chrysomelidae). The genome sequence has a total length of 2,154.60 megabases. Most of the assembly (99.92%) is scaffolded into 12 chromosomal pseudomolecules, including the X and Y sex chromosomes. The mitochondrial genome has also been assembled and is 19.98 kilobases in length. Gene annotation of this assembly on Ensembl identified 32,229 protein-coding genes.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"9 ","pages":"594"},"PeriodicalIF":0.0,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11579587/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142688575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wellcome Open Research. Pub Date: 2024-10-15. eCollection Date: 2024-01-01. DOI: 10.12688/wellcomeopenres.23099.1
Steven Falk
{"title":"The genome sequence of a lauxaniid fly, <i>Tricholauxania praeusta</i> (Fallén, 1820).","authors":"Steven Falk","doi":"10.12688/wellcomeopenres.23099.1","DOIUrl":"10.12688/wellcomeopenres.23099.1","url":null,"abstract":"<p><p>We present a genome assembly from an individual female <i>Tricholauxania praeusta</i> (a lauxaniid fly; Arthropoda; Insecta; Diptera; Lauxaniidae). The genome sequence has a total length of 661.30 megabases. Most of the assembly is scaffolded into 5 chromosomal pseudomolecules. The mitochondrial genome has also been assembled and is 16.31 kilobases in length. Gene annotation of this assembly on Ensembl identified 25,606 protein-coding genes.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"9 ","pages":"586"},"PeriodicalIF":0.0,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11574340/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142677220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wellcome Open Research. Pub Date: 2024-10-11. eCollection Date: 2024-01-01. DOI: 10.12688/wellcomeopenres.23143.1
Maarten J M Christenhusz, Michael F Fay
{"title":"The genome sequence of common reed, <i>Phragmites australis</i> (Cav.) Steud. (Poaceae).","authors":"Maarten J M Christenhusz, Michael F Fay","doi":"10.12688/wellcomeopenres.23143.1","DOIUrl":"https://doi.org/10.12688/wellcomeopenres.23143.1","url":null,"abstract":"<p><p>We present a genome assembly from an individual <i>Phragmites australis</i> (the common reed; Streptophyta; Magnoliopsida; Poales; Poaceae). The genome sequence has a total length of 848.70 megabases. Most of the assembly is scaffolded into 24 chromosomal pseudomolecules, supporting the specimen being an allotetraploid (2<i>n</i> = 4<i>x</i> = 48). The three mitochondrial assemblies had lengths of 304.58, 92.24, and 76.54 kilobases and the plastid genome assembly had a length of 137.67 kilobases. Gene annotation of this assembly on Ensembl identified 47,513 protein-coding genes.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"9 ","pages":"577"},"PeriodicalIF":0.0,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11549543/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142628839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wellcome Open Research. Pub Date: 2024-10-11. eCollection Date: 2024-01-01. DOI: 10.12688/wellcomeopenres.22920.2
James Hammond
{"title":"The genome sequence of the Sprawler moth, <i>Asteroscopus sphinx</i> Hufnagel, 1766.","authors":"James Hammond","doi":"10.12688/wellcomeopenres.22920.2","DOIUrl":"10.12688/wellcomeopenres.22920.2","url":null,"abstract":"<p><p>We present a genome assembly from an individual male <i>Asteroscopus sphinx</i> (the Sprawler moth; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence has a total length of 857.30 megabases. Most of the assembly is scaffolded into 32 chromosomal pseudomolecules, including the Z sex chromosome and a putative B chromosome. The mitochondrial genome has also been assembled and is 15.35 kilobases in length.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"9 ","pages":"505"},"PeriodicalIF":0.0,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11589413/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142733094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wellcome Open Research. Pub Date: 2024-10-11. eCollection Date: 2023-01-01. DOI: 10.12688/wellcomeopenres.18639.1
Jean Golding, Iain Bickerstaffe, Yasmin Iles-Caven, Kate Northstone
{"title":"Paternal health in the first 12-13 years of the ALSPAC study.","authors":"Jean Golding, Iain Bickerstaffe, Yasmin Iles-Caven, Kate Northstone","doi":"10.12688/wellcomeopenres.18639.1","DOIUrl":"10.12688/wellcomeopenres.18639.1","url":null,"abstract":"<p><p>The Avon Longitudinal Study of Parents and Children (ALSPAC) collected information from the enrolled pregnancy onwards to identify features of the environment in which the study child was brought up. Among data collected were features concerning the health of the mothers' partners - generally the study father. This was an important feature since the father's physical and mental health can have a long-term effect on the family. In this Data Note we describe the data available on the father's health from pregnancy until 12 years after the offspring was born. Not only is this a valuable addition to the environmental information available for studies of the child's development and the mental health of the mother over time, but it will provide a useful description of the father himself during adulthood.</p>","PeriodicalId":23677,"journal":{"name":"Wellcome Open Research","volume":"8 ","pages":"8"},"PeriodicalIF":0.0,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354460/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10227098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}