CRISPR: Ensemble Model
Mohammad Rostami, Amin Ghariyazi, Hamed Dashti, Mohammad Hossein Rohban, Hamid R. Rabiee
arXiv - QuanBio - Genomics, published 2024-03-05, https://doi.org/arxiv-2403.03018
Abstract
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a gene-editing
technology that has revolutionized the fields of biology and medicine.
However, one of the challenges of using CRISPR is predicting the on-target
efficacy and off-target sensitivity of single-guide RNAs (sgRNAs). This is
because most existing methods are trained on separate datasets with different
genes and cells, which limits their generalizability. In this paper, we propose
a novel ensemble learning method for sgRNA design that is accurate and
generalizable. Our method combines the predictions of multiple machine learning
models to produce a single, more robust prediction. This approach allows us to
learn from a wider range of data, which improves the generalizability of our
model. We evaluated our method on a benchmark dataset of sgRNA designs and
found that it outperformed existing methods in terms of both accuracy and
generalizability. Our results suggest that our method can be used to design
sgRNAs with high sensitivity and specificity, even for new genes or cells. This
could have important implications for the clinical use of CRISPR, as it would
allow researchers to design more effective and safer treatments for a variety
of diseases.
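
The abstract says that the method combines the predictions of multiple machine learning models into a single prediction, but it does not say how they are combined. As a minimal sketch, assuming a plain (optionally weighted) average of per-model scores, the idea could look like the following. The three base models are toy placeholders, and every name here (`ensemble_predict`, `gc_content_model`, and so on) is hypothetical rather than taken from the paper.

```python
from statistics import fmean
from typing import Callable, Optional, Sequence

# A base model maps an sgRNA sequence to a predicted efficacy score in [0, 1].
Model = Callable[[str], float]


def gc_content_model(sgrna: str) -> float:
    """Toy stand-in for a trained model: fraction of G/C bases."""
    return sum(base in "GC" for base in sgrna) / len(sgrna)


def no_poly_t_model(sgrna: str) -> float:
    """Toy stand-in for a trained model: 0.0 if the guide contains TTTT, else 1.0."""
    return 0.0 if "TTTT" in sgrna else 1.0


def tail_purine_model(sgrna: str) -> float:
    """Toy stand-in for a trained model: fraction of A/G in the last five bases."""
    tail = sgrna[-5:]
    return sum(base in "AG" for base in tail) / len(tail)


def ensemble_predict(
    models: Sequence[Model],
    sgrna: str,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine base-model scores by averaging (an assumed combination rule)."""
    scores = [model(sgrna) for model in models]
    if weights is None:
        return fmean(scores)
    if len(weights) != len(scores):
        raise ValueError("need exactly one weight per model")
    # Weighted average, normalised so the weights need not sum to 1.
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


models = [gc_content_model, no_poly_t_model, tail_purine_model]
guide = "ACGTACGTACGTACGTACGA"  # made-up 20-nt sequence, for illustration only

unweighted = ensemble_predict(models, guide)
weighted = ensemble_predict(models, guide, weights=[2.0, 1.0, 1.0])
print(f"unweighted={unweighted:.3f} weighted={weighted:.3f}")
```

In practice the base models would be predictors trained on different datasets, and the weights could be tuned on held-out data. Averaging is only one option; stacking or voting are common alternatives.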