CheRRI: Accurate classification of the biological relevance of putative RNA-RNA interaction sites
Teresa Müller, Stefan Mautner, Pavankumar Videm, Florian Eggenhofer, Martin Raden, Rolf Backofen
GigaScience, volume 13. Journal article, published 2024-01-02. DOI: https://doi.org/10.1093/gigascience/giae022. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11152173/pdf/
Abstract
Background: RNA-RNA interactions are key to a wide range of cellular functions. The detection of potential interactions helps to understand the underlying processes. However, potential interactions identified via in silico or experimental high-throughput methods can lack precision because of a high false-positive rate.
Results: We present CheRRI, the first tool to evaluate the biological relevance of putative RNA-RNA interaction sites. CheRRI filters candidates via a machine learning-based model trained on experimental RNA-RNA interactome data. Its unique setup combines interactome data and an established thermodynamic prediction tool to integrate experimental data with state-of-the-art computational models. Applying these data to an automated machine learning approach provides the opportunity to not only filter data for potential false positives but also tailor the underlying interaction site model to specific needs.
Conclusions: CheRRI is a stand-alone postprocessing tool to filter either predicted or experimentally identified potential RNA-RNA interactions on a genomic level to enhance the quality of interaction candidates. It is easy to install (via conda, pip packages), use (via Galaxy), and integrate into existing RNA-RNA interaction pipelines.
About the journal:
GigaScience seeks to transform data dissemination and utilization in the life and biomedical sciences. As an online open-access open-data journal, it specializes in publishing "big-data" studies encompassing various fields. Its scope includes not only "omic" type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale shareable data.