{"title":"Improving prediction of bacterial sRNA regulatory targets with expression data.","authors":"Yildiz Derinkok, Haiqi Wang, Brian Tjaden","doi":"10.1093/nargab/lqaf055","DOIUrl":null,"url":null,"abstract":"<p><p>Small regulatory RNAs (sRNAs) are widespread in bacteria. However, characterizing the targets of sRNA regulation in a way that scales with the increasing number of identified sRNAs has proven challenging. Computational methods offer one means for efficient characterization of sRNA targets, but the sensitivity and precision of such computational methods is limited. Here, we investigate whether publicly available expression data from RNA-seq experiments can improve the accuracy of computational prediction of sRNA regulatory targets. Using compendia of 2143 <i>Escherichia coli</i> RNA-seq samples and 177 <i>Salmonella</i> RNA-seq samples, we identify groups of co-expressed genes in each organism and incorporate this expression information into computational prediction of sRNA targets based on machine learning methods. We find that integrating expression information significantly improves the accuracy of computational results. 
Further, we observe that computational methods perform better when trained on smaller, higher quality sets of targets rather than on larger, noisier sets of targets identified by high-throughput methods.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"7 2","pages":"lqaf055"},"PeriodicalIF":2.8000,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12060007/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"NAR Genomics and Bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/nargab/lqaf055","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/6/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
Abstract
Small regulatory RNAs (sRNAs) are widespread in bacteria. However, characterizing the targets of sRNA regulation in a way that scales with the increasing number of identified sRNAs has proven challenging. Computational methods offer one means of characterizing sRNA targets efficiently, but the sensitivity and precision of such methods are limited. Here, we investigate whether publicly available expression data from RNA-seq experiments can improve the accuracy of computational prediction of sRNA regulatory targets. Using compendia of 2143 Escherichia coli RNA-seq samples and 177 Salmonella RNA-seq samples, we identify groups of co-expressed genes in each organism and incorporate this expression information into machine-learning-based prediction of sRNA targets. We find that integrating expression information significantly improves the accuracy of computational results. Further, we observe that computational methods perform better when trained on smaller, higher-quality sets of targets than on larger, noisier sets of targets identified by high-throughput methods.
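The abstract does not specify how co-expressed gene groups are identified or how they enter the model, so the following is only a minimal sketch of one way the idea could work, not the authors' method. It assumes that genes are grouped by thresholding pairwise Pearson correlation across samples, and that the expression feature for a candidate gene is the fraction of an sRNA's known targets that fall in the candidate's group. The function names, the 0.9 threshold, and the toy expression values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(vx * vy)


def coexpression_groups(expr, threshold=0.9):
    """Group genes into connected components of the graph that links
    every gene pair whose correlation is at least `threshold`.
    The threshold is an illustrative assumption."""
    genes = list(expr)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(expr[a], expr[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), set()).add(g)
    return list(groups.values())


def coexpression_feature(candidate, known_targets, groups):
    """Fraction of the sRNA's known targets (excluding the candidate
    itself) that share a co-expression group with the candidate."""
    others = set(known_targets) - {candidate}
    if not others:
        return 0.0
    for group in groups:
        if candidate in group:
            return len(others & group) / len(others)
    return 0.0


# Toy expression matrix: gene -> expression across four samples (made up).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.5, 2.5, 3.5, 4.6],
    "geneD": [4.0, 1.0, 3.0, 2.0],
}
groups = coexpression_groups(expr)
feature = coexpression_feature("geneC", ["geneA", "geneB"], groups)
```

In a full pipeline, a value like `feature` would be appended to the sequence-derived features of each sRNA-candidate pair before training a classifier. The all-pairs correlation loop is quadratic in the number of genes, so it suits only a toy example like this one.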