A Long-Short Term Memory Network for Detecting CRISPR Arrays

Shantanu Deshmukh, P. Heller, Natalia Khuri
Published in: 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
Publication date: 2019-12-01
DOI: 10.1109/ICMLA.2019.00114 (https://doi.org/10.1109/ICMLA.2019.00114)
Citation count: 1

Abstract

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a pattern found in the DNA sequences of some archaeal and bacterial organisms. Together with CRISPR-associated genes, CRISPR arrays provide immunity against phages and other mobile exogenous elements. The CRISPR-based immunity mechanism can be manipulated to perform genome editing at low cost. To improve the specificity of CRISPR-based genome editing, better software and experimental tools are needed, and accurate detection of CRISPR arrays in DNA sequences is the first step toward this goal. In this work, a CRISPR array detection pipeline, CRISPRLstm, is presented that uses Long-Short Term Memory models to discriminate between valid and invalid arrays. The predictions by CRISPRLstm are better than or in good agreement with those of other freely available tools, and CRISPRLstm outperforms a Random Forest classifier in identifying valid repeat sequences. The CRISPRLstm predictor is publicly available as a web-based application with an interactive user interface.
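The abstract does not say how DNA sequences are presented to the Long-Short Term Memory models. As a purely illustrative sketch, one common way to prepare a candidate repeat sequence for a sequence model is one-hot encoding with padding to a fixed length. The function name `one_hot_encode`, the base order, and the handling of unknown symbols are assumptions made for this example and are not taken from CRISPRLstm.

```python
# Illustrative only: the encoding scheme and all names below are assumptions,
# not the method described in the CRISPRLstm paper.
BASES = "ACGT"


def one_hot_encode(sequence, length):
    """Encode a DNA string as `length` rows of 4 values (A, C, G, T).

    Unknown symbols (for example N) and padding positions become all-zero
    rows. Sequences longer than `length` are truncated.
    """
    rows = []
    for base in sequence.upper()[:length]:
        rows.append([1 if base == b else 0 for b in BASES])
    # Pad short sequences so every input has the same shape.
    while len(rows) < length:
        rows.append([0, 0, 0, 0])
    return rows


# A short, made-up sequence with mixed case and one ambiguous base.
encoded = one_hot_encode("GTTNac", 8)
```

Each row of `encoded` corresponds to one time step that a recurrent model could consume in order; a fixed `length` keeps all inputs the same shape when batching.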