A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera.

Impact factor: 1.4; JCR quartile: Q3; category: Multidisciplinary Sciences

Data in Brief. Publication date: 2024-11-01. eCollection date: 2024-12-01. DOI: 10.1016/j.dib.2024.111094

Patrik Cangren, Yann J K Bertrand, John M Braverman, Gregor Duncan Gilfillan, Matthew B Hamilton, Bengt Oxelman

Data in Brief, volume 57, article 111094. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11615531/pdf/


Abstract

This dataset comprises 40 assembled and annotated transcriptomes from 34 species sampled from phylogenetically diverse parts of the flowering plant genus Silene (Caryophyllaceae) and the related genera Agrostemma, Atocion, Eudianthe, Heliosperma, Petrocoptis and Viscaria. RNA extracted from roots, stems, leaves, buds and flowers was sequenced using paired-end reads on the Illumina HiSeq platform. A total of 716 million raw reads were produced and assembled into 2.67 million isogroups ("genes"). Contigs from all samples were annotated using UniProt/SwissProt and assigned GO terms. In total, 974,274 annotations were made (per-sample mean 24,357, SD 7,034), giving an annotation proportion of 37% (per-sample mean 39%, SD 9.75%). Of these, 741,087 annotations had taxonomic identities within Magnoliopsida (per-sample mean 18,527, SD 3,931), resulting in the assignment of 4,519,488 GO terms (per-sample mean 112,987, SD 22,536). The dataset can be used for further biological research, including phylogenetic and evolutionary studies, functional analyses of genes, studies of polyploidy, and marker development.
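The summary statistics in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline: the variable and function names are hypothetical, and it assumes the per-sample averages are simple means over the 40 transcriptomes. Because the isogroup count is given only as a rounded 2.67 million, the overall annotation proportion can be reproduced only approximately (about 36.5% against the reported 37%).

```python
# Totals copied from the abstract; all names here are hypothetical.
N_SAMPLES = 40
TOTALS = {
    "annotations": 974_274,
    "magnoliopsida_annotations": 741_087,
    "go_terms": 4_519_488,
}
# The abstract gives this figure rounded to three significant digits.
ISOGROUPS_APPROX = 2.67e6


def per_sample_mean(total: int, n_samples: int = N_SAMPLES) -> float:
    """Arithmetic mean per sample, assuming equal weighting of all samples."""
    return total / n_samples


# Rounded per-sample means, to compare against the values in the abstract.
means = {name: round(per_sample_mean(total)) for name, total in TOTALS.items()}

# Overall share of isogroups that received an annotation (approximate,
# because the denominator is rounded).
annotation_proportion = TOTALS["annotations"] / ISOGROUPS_APPROX

# Share of all annotations whose taxonomic identity falls within Magnoliopsida.
magnoliopsida_share = TOTALS["magnoliopsida_annotations"] / TOTALS["annotations"]

if __name__ == "__main__":
    for name, value in means.items():
        print(f"{name}: {value} per sample")
    print(f"annotation proportion: {annotation_proportion:.1%}")
    print(f"Magnoliopsida share of annotations: {magnoliopsida_share:.1%}")
```

The three rounded means match the per-sample averages quoted in the abstract, which supports reading them as totals divided by 40. The standard deviations cannot be checked this way, since they require the per-sample counts from the dataset itself.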




Source journal: Data in Brief (Multidisciplinary Sciences)

CiteScore: 3.10

Self-citation rate: 0.00%

Articles published: 996

Review time: 70 days

About the journal: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.