Identification and validation of EST-SSRs in transcriptome sequences of cumin (Cuminum cyminum L.) by RNA-Seq

IF 1.6 Q3 GENETICS & HEREDITY

Plant Gene, Pub Date: 2024-11-16, DOI: 10.1016/j.plgene.2024.100477

Seyed Mohammad Mahdi Mortazavian, Mahdieh Arshadi-Bidgoli, Dariush Sadeghi, Mohammad Reza Bakhtiarizadeh

Plant Gene, Volume 40, Article 100477. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2352407324000325

Citations: 0

Abstract


Identified and validation of EST-SSR in the transcriptome sequences by RNA-Seq in cumin (Cuminum Cyminum L.)

Cumin (Cuminum cyminum L.), a member of the Apiaceae family, exhibits a wide range of native ecotypes from the Eastern Mediterranean to India. Despite its significant culinary and medicinal applications, the availability of transcriptomic and genomic data for cumin remains limited, hindering advances in molecular genetics and breeding research. This study presents the first sequencing of the cumin transcriptome using RNA sequencing technology, generating 34,711,979, 48,649,265, 127,370,622, and 52,990,923 reads from the flowers of cumin plants. In total, 51,777 transcripts were de novo assembled, with an average length of 717.09 bp and an N50 value of 1110 bp. Approximately 70 % (36,166) of these transcripts were annotated in at least one public database (UniprotKB, Nr, Pfam, GO, and KEGG). Furthermore, 1556 simple sequence repeats (SSRs) were identified, distributed across 1465 transcripts. The most prevalent SSR motifs were di-nucleotide (70.05 %) and tri-nucleotide (26.16 %) repeats, followed by tetra-nucleotide (2.18 %), penta-nucleotide (0.90 %), and hexanucleotide repeats (0.71 %). The most frequent di-nucleotide and tri-nucleotide repeats were GA/TC (33.58 %) and CAG/CTG (10.32 %), respectively. Functional enrichment analysis indicated that transcripts containing SSRs play significant roles in metabolic processes, DNA/nucleotide binding, protein modification processes, and biosynthetic/developmental processes. For marker validation, 10 EST-SSR primer pairs were tested across 31 cumin genotypes, identifying 34 alleles with polymorphism information content (PIC) values ranging from 0.32 to 0.46. The mean genetic diversity index (MI) and effective multiplex ratio (EMR) were 1.22 and 2.98, respectively. Additionally, two clusters were identified through UPGMA analysis. 
The SSR markers identified in this study hold potential for applications in genetic mapping, population genetic analysis, genetic diversity studies, and marker-assisted breeding in cumin and related species.
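The assembly and marker statistics reported in the abstract (N50, SSR motif classes, PIC) follow standard definitions. The Python sketch below illustrates them under stated assumptions: the minimum repeat counts in `MIN_REPEATS` are hypothetical, since the abstract does not give the thresholds or the SSR-detection tool used, and `pic` uses one common co-dominant formula, which may differ from the one the authors applied. The function names `n50`, `find_ssrs`, and `pic` are illustrative only.

```python
import re

# Assumed minimum number of tandem copies per motif length (di- to
# hexa-nucleotide). Hypothetical values: the abstract does not state them.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect SSRs found in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AAA" or "GAGA"), so each locus is counted once.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


def pic(freqs):
    """Polymorphism information content for one locus, given allele
    frequencies that sum to 1 (co-dominant form; an assumption here)."""
    homo = sum(p * p for p in freqs)
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs))
                for j in range(i + 1, len(freqs)))
    return 1 - homo - cross


if __name__ == "__main__":
    demo = "TTT" + "GA" * 7 + "CCC" + "CAG" * 5 + "T"
    print(find_ssrs(demo))
    print(n50([2, 3, 4, 5, 6]))
    print(pic([0.5, 0.5]))
```

On the toy sequence in the demo, `find_ssrs` reports one GA repeat and one CAG repeat, the two motif types the abstract lists as most frequent. Tallying the motif lengths of such hits across all transcripts would give class percentages of the kind reported (di-, tri-, tetra-, penta-, and hexa-nucleotide).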


Source journal

Plant Gene (Agricultural and Biological Sciences: Plant Science)

CiteScore: 4.50

Self-citation rate: 0.00%

Articles published: (not listed)

Review time: 51 days

Journal description: Plant Gene publishes papers that focus on the regulation, expression, function and evolution of genes in plants, algae and other photosynthesizing organisms (e.g., cyanobacteria), and plant-associated microorganisms. Plant Gene strives to be a diverse plant journal, and topics in multiple fields will be considered for publication. Although not limited to the following, general topics include: gene discovery and characterization; gene regulation in response to environmental stress (e.g., salinity, drought); genetic effects of transposable elements; genetic control of secondary metabolic pathways and metabolic enzymes; herbal medicine (regulation and medicinal properties of plant products); plant hormonal signaling; plant evolutionary genetics, molecular evolution, population genetics, and phylogenetics; profiling of plant gene expression and genetic variation; plant-microbe interactions (e.g., influence of endophytes on gene expression, horizontal gene transfer studies); and agricultural genetics (biotechnology and crop improvement).