Maria-Anna Trapotsi, Jasper van Lopik, Gregory J Hannon, Benjamin Czech Nicholson, Susanne Bornelöv
{"title":"FlaHMM:利用隐马尔可夫模型预测果蝇物种中的单链弗拉门戈式 piRNA 簇。","authors":"Maria-Anna Trapotsi, Jasper van Lopik, Gregory J Hannon, Benjamin Czech Nicholson, Susanne Bornelöv","doi":"10.1093/nargab/lqae119","DOIUrl":null,"url":null,"abstract":"<p><p>PIWI-interacting RNAs (piRNAs) are a class of small non-coding RNAs that are essential for transposon control in animal gonads. In <i>Drosophila</i> ovarian somatic cells, piRNAs are transcribed from large genomic regions called piRNA clusters, which are enriched for transposon fragments and act as a memory of past invasions. Despite being widely present across <i>Drosophila</i> species, somatic piRNA clusters are difficult to identify and study due to their lack of sequence conservation and limited synteny. Current identification methods rely on either extensive manual curation or availability of high-throughput small RNA sequencing data, limiting large-scale comparative studies. We now present FlaHMM, a hidden Markov model developed to automate genomic annotation of <i>flamenco</i>-like unistrand piRNA clusters in <i>Drosophila</i> species, requiring only a genome assembly and transposon annotations. FlaHMM uses transposable element content across 5- or 10-kb bins, which can be calculated from genome sequence alone, and is thus able to detect candidate piRNA clusters without the need to obtain flies and experimentally perform small RNA sequencing. We show that FlaHMM performs on par with piRNA-guided or manual methods, and thus provides a scalable and efficient approach to piRNA cluster annotation in new genome assemblies. FlaHMM is freely available at https://github.com/Hannon-lab/FlaHMM under an MIT licence.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 3","pages":"lqae119"},"PeriodicalIF":4.0000,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11400887/pdf/","citationCount":"0","resultStr":"{\"title\":\"FlaHMM: unistrand <i>flamenco</i>-like piRNA cluster prediction in <i>Drosophila</i> species using hidden Markov models.\",\"authors\":\"Maria-Anna Trapotsi, Jasper van Lopik, Gregory J Hannon, Benjamin Czech Nicholson, Susanne Bornelöv\",\"doi\":\"10.1093/nargab/lqae119\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>PIWI-interacting RNAs (piRNAs) are a class of small non-coding RNAs that are essential for transposon control in animal gonads. In <i>Drosophila</i> ovarian somatic cells, piRNAs are transcribed from large genomic regions called piRNA clusters, which are enriched for transposon fragments and act as a memory of past invasions. Despite being widely present across <i>Drosophila</i> species, somatic piRNA clusters are difficult to identify and study due to their lack of sequence conservation and limited synteny. Current identification methods rely on either extensive manual curation or availability of high-throughput small RNA sequencing data, limiting large-scale comparative studies. We now present FlaHMM, a hidden Markov model developed to automate genomic annotation of <i>flamenco</i>-like unistrand piRNA clusters in <i>Drosophila</i> species, requiring only a genome assembly and transposon annotations. FlaHMM uses transposable element content across 5- or 10-kb bins, which can be calculated from genome sequence alone, and is thus able to detect candidate piRNA clusters without the need to obtain flies and experimentally perform small RNA sequencing. We show that FlaHMM performs on par with piRNA-guided or manual methods, and thus provides a scalable and efficient approach to piRNA cluster annotation in new genome assemblies. FlaHMM is freely available at https://github.com/Hannon-lab/FlaHMM under an MIT licence.</p>\",\"PeriodicalId\":33994,\"journal\":{\"name\":\"NAR Genomics and Bioinformatics\",\"volume\":\"6 3\",\"pages\":\"lqae119\"},\"PeriodicalIF\":4.0000,\"publicationDate\":\"2024-09-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11400887/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"NAR Genomics and Bioinformatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/nargab/lqae119\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/9/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q1\",\"JCRName\":\"GENETICS & HEREDITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"NAR Genomics and Bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/nargab/lqae119","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/9/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
FlaHMM: unistrand flamenco-like piRNA cluster prediction in Drosophila species using hidden Markov models.
PIWI-interacting RNAs (piRNAs) are a class of small non-coding RNAs that are essential for transposon control in animal gonads. In Drosophila ovarian somatic cells, piRNAs are transcribed from large genomic regions called piRNA clusters, which are enriched for transposon fragments and act as a memory of past invasions. Despite being widely present across Drosophila species, somatic piRNA clusters are difficult to identify and study due to their lack of sequence conservation and limited synteny. Current identification methods rely on either extensive manual curation or availability of high-throughput small RNA sequencing data, limiting large-scale comparative studies. We now present FlaHMM, a hidden Markov model developed to automate genomic annotation of flamenco-like unistrand piRNA clusters in Drosophila species, requiring only a genome assembly and transposon annotations. FlaHMM uses transposable element content across 5- or 10-kb bins, which can be calculated from genome sequence alone, and is thus able to detect candidate piRNA clusters without the need to obtain flies and experimentally perform small RNA sequencing. We show that FlaHMM performs on par with piRNA-guided or manual methods, and thus provides a scalable and efficient approach to piRNA cluster annotation in new genome assemblies. FlaHMM is freely available at https://github.com/Hannon-lab/FlaHMM under an MIT licence.