Seyed Mohammad Mahdi Mortazavian , Mahdieh Arshadi-Bidgoli , Dariush Sadeghi , Mohammad Reza Bakhtiarizadeh
Plant Gene, Volume 40, Article 100477. Published 2024-11-16. DOI: 10.1016/j.plgene.2024.100477
https://www.sciencedirect.com/science/article/pii/S2352407324000325
Impact factor: 2.2. JCR: Q3 (Genetics & Heredity).
Identified and validation of EST-SSR in the transcriptome sequences by RNA-Seq in cumin (Cuminum Cyminum L.)
Cumin (Cuminum cyminum L.), a member of the Apiaceae family, exhibits a wide range of native ecotypes from the Eastern Mediterranean to India. Despite its significant culinary and medicinal applications, the availability of transcriptomic and genomic data for cumin remains limited, hindering advances in molecular genetics and breeding research. This study presents the first sequencing of the cumin transcriptome using RNA sequencing technology, generating 34,711,979, 48,649,265, 127,370,622, and 52,990,923 reads from the flowers of cumin plants. In total, 51,777 transcripts were de novo assembled, with an average length of 717.09 bp and an N50 value of 1110 bp. Approximately 70 % (36,166) of these transcripts were annotated in at least one public database (UniProtKB, Nr, Pfam, GO, and KEGG). Furthermore, 1556 simple sequence repeats (SSRs) were identified, distributed across 1465 transcripts. The most prevalent SSR motifs were di-nucleotide (70.05 %) and tri-nucleotide (26.16 %) repeats, followed by tetra-nucleotide (2.18 %), penta-nucleotide (0.90 %), and hexa-nucleotide (0.71 %) repeats. The most frequent di-nucleotide and tri-nucleotide repeats were GA/TC (33.58 %) and CAG/CTG (10.32 %), respectively. Functional enrichment analysis indicated that transcripts containing SSRs play significant roles in metabolic processes, DNA/nucleotide binding, protein modification processes, and biosynthetic/developmental processes. For marker validation, 10 EST-SSR primer pairs were tested across 31 cumin genotypes, identifying 34 alleles with polymorphism information content (PIC) values ranging from 0.32 to 0.46. The mean genetic diversity index (MI) and effective multiplex ratio (EMR) were 1.22 and 2.98, respectively. Additionally, two clusters were identified through UPGMA analysis. The SSR markers identified in this study hold potential for applications in genetic mapping, population genetic analysis, genetic diversity studies, and marker-assisted breeding in cumin and related species.
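The abstract reports SSR counts by motif length but not the detection settings. As a rough illustration of how perfect SSRs can be located in a transcript sequence, the sketch below uses regular-expression backreferences. The thresholds in `MIN_REPEATS` and the function names `is_primitive`, `find_ssrs`, and `motif_class_counts` are assumptions made for this example, not the authors' pipeline or parameters, and the demo sequence is invented.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (di- to hexa-nucleotide).
# The abstract does not state the thresholds that were actually used.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_count in min_repeats.items():
        # One motif of unit_len bases, followed by at least min_count - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip runs such as 'AAAA' that are really a shorter unit repeated.
            if is_primitive(motif):
                count = len(match.group(0)) // unit_len
                hits.append((match.start(), motif, count))
    return sorted(hits)


def motif_class_counts(hits):
    """Count SSRs by motif length (2 = di-nucleotide, 3 = tri-nucleotide, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


# Invented demo sequence: a (GA)8 run and a (CAG)5 run with short flanks.
demo = "TTGC" + "GA" * 8 + "CCT" + "CAG" * 5 + "AT"
hits = find_ssrs(demo)
print(hits)
print(motif_class_counts(hits))
```

This sketch reports only perfect repeats and does not merge a motif with its reverse complement (the abstract groups GA with TC and CAG with CTG), so class frequencies from it would need that extra step before being compared with the reported percentages.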
Plant Gene (Agricultural and Biological Sciences - Plant Science)
CiteScore: 4.50
Self-citation rate: 0.00%
Articles published: 42
Review time: 51 days
About the journal:
Plant Gene publishes papers that focus on the regulation, expression, function and evolution of genes in plants, algae and other photosynthesizing organisms (e.g., cyanobacteria), and plant-associated microorganisms. Plant Gene strives to be a diverse plant journal, and topics in multiple fields will be considered for publication. Although not limited to the following, some general topics include: Gene discovery and characterization, Gene regulation in response to environmental stress (e.g., salinity, drought, etc.), Genetic effects of transposable elements, Genetic control of secondary metabolic pathways and metabolic enzymes, Herbal Medicine - regulation and medicinal properties of plant products, Plant hormonal signaling, Plant evolutionary genetics, molecular evolution, population genetics, and phylogenetics, Profiling of plant gene expression and genetic variation, Plant-microbe interactions (e.g., influence of endophytes on gene expression; horizontal gene transfer studies; etc.), Agricultural genetics - biotechnology and crop improvement.