CRISPR-OTE: Prediction of CRISPR On-Target Efficiency Based on Multi-Dimensional Feature Fusion

IF 5.6 | Zone 4, Medicine | JCR Q1, Engineering, Biomedical
IRBM | Pub Date: 2023-02-01 | DOI: 10.1016/j.irbm.2022.07.003
J. Xie, M. Liu, L. Zhou
IRBM, vol. 44, no. 1, Article 100732. https://www.sciencedirect.com/science/article/pii/S195903182200080X
Citations: 2

Abstract

Objective

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a powerful genome editing technology. Guide RNA (gRNA) directs the CRISPR system to its target through complementary base pairing with the target DNA. Because the CRISPR targeting mechanism is not yet fully understood, predicting gRNA on-target efficiency remains a challenge. Current gRNA design tools often extract sequence information inefficiently and fail to learn on-target efficiency patterns thoroughly.

Material and methods

This study proposes CRISPR-OTE, a simple but effective framework that combines multi-dimensional sequence information with important complementary prior knowledge. CRISPR-OTE consists of a local-contextual information branch and a prior knowledge branch. The local-contextual information branch extracts multi-dimensional sequence features from the primary DNA sequence using Convolutional Neural Networks (CNN) and bidirectional Long Short-Term Memory networks (biLSTM) arranged in parallel. The prior knowledge branch selects the optimal subset of physicochemical features to supply the neural network with complementary knowledge, such as complex secondary structures. A simple feature fusion strategy then combines the multi-modal data from the two branches.
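
The two-branch layout described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the weights are random and untrained, a plain tanh recurrence stands in for the LSTM cell, fusion is assumed to be simple concatenation followed by a linear readout, and every name, size, and value here (`init_params`, `fuse_and_score`, the example target, the two prior feature values) is hypothetical.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv_branch(x, kernels):
    """1D convolution, ReLU, then global max pooling: one local feature per kernel."""
    feats = []
    for kernel in kernels:                      # kernel shape: [width][4]
        width = len(kernel)
        best = 0.0                              # starting at 0 applies the ReLU floor
        for start in range(len(x) - width + 1):
            act = sum(kernel[i][j] * x[start + i][j]
                      for i in range(width) for j in range(4))
            best = max(best, act)
        feats.append(best)
    return feats

def recurrent_pass(x, w_in, w_rec):
    """Plain tanh recurrence (a stand-in for an LSTM cell); returns the last state."""
    h = [0.0] * len(w_rec)
    for row in x:
        h = [math.tanh(sum(w_in[u][j] * row[j] for j in range(4))
                       + sum(w_rec[u][v] * h[v] for v in range(len(h))))
             for u in range(len(h))]
    return h

def contextual_branch(x, fwd, bwd):
    """Bidirectional summary: read the sequence left-to-right and right-to-left."""
    return recurrent_pass(x, *fwd) + recurrent_pass(x[::-1], *bwd)

def init_params(n_kernels=4, width=5, hidden=3, n_prior=2, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)
    def r():
        return rng.uniform(-0.5, 0.5)
    def rnn():
        return ([[r() for _ in range(4)] for _ in range(hidden)],
                [[r() for _ in range(hidden)] for _ in range(hidden)])
    n_fused = n_kernels + 2 * hidden + n_prior
    return {
        "kernels": [[[r() for _ in range(4)] for _ in range(width)]
                    for _ in range(n_kernels)],
        "fwd": rnn(),
        "bwd": rnn(),
        "w_out": [r() for _ in range(n_fused)],
        "b_out": 0.0,
    }

def fuse_and_score(seq, prior_features, params):
    """Concatenate both sequence views with the prior features, then linear readout."""
    x = one_hot(seq)
    fused = (conv_branch(x, params["kernels"])
             + contextual_branch(x, params["fwd"], params["bwd"])
             + list(prior_features))
    score = sum(w * f for w, f in zip(params["w_out"], fused)) + params["b_out"]
    return fused, score

# Hypothetical 34-nt target and placeholder prior features
# (a folding-energy-like value and a melting temperature).
target = "TTTAACGTTGCATCGGATCCGTTAGCATGCAAGC"
fused, score = fuse_and_score(target, [-3.2, 61.5], init_params())
```

Concatenation keeps each branch independent until the final readout, which is the simplest way to let a model weigh learned sequence features against hand-computed physicochemical ones. In practice the prior features would be scaled before fusion so that no single input dominates.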

Results

The experimental results show that the optimal subset of physicochemical features (RNA secondary structure and the melting temperature of the 34-nt target) effectively improves prediction performance. Additionally, combining multi-dimensional sequence features with multi-modal features extracts information more comprehensively. Through transfer learning, CRISPR-OTE trained on the CRISPR-Cpf1 system can also be applied successfully to the CRISPR-Cas9 system.
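
To make the melting temperature feature concrete, the following sketch computes GC content and a rough melting temperature for a 34-nt target. The abstract does not state how the authors computed melting temperature, so the formula below (a common GC-based rule of thumb for oligonucleotides longer than about 13 nt) is an assumption, as are the function names and the example sequence. The RNA secondary structure feature is not covered here, since it would normally come from a dedicated folding tool.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def basic_tm(seq):
    """Rough melting temperature in degrees Celsius.

    Uses Tm = 64.9 + 41 * (G + C - 16.4) / N, a GC-based approximation;
    it ignores salt concentration and nearest-neighbor effects.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)

# Hypothetical 34-nt target sequence, used only for illustration.
target = "TTTAACGTTGCATCGGATCCGTTAGCATGCAAGC"
features = {"gc_fraction": gc_fraction(target), "tm_celsius": basic_tm(target)}
```

A nearest-neighbor thermodynamic model would give more accurate values; the simple formula is used here only because it is easy to verify by hand.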

Conclusion

CRISPR-OTE outperforms other methods across different CRISPR systems and species. CRISPR-OTE is therefore a simple on-target efficiency prediction framework with better accuracy and generalization performance than the compared methods.

Source journal
IRBM (Engineering, Biomedical)
CiteScore: 10.30
Self-citation rate: 4.20%
Articles published: 81
Review time: 57 days
Journal description: IRBM is the journal of the AGBM (Alliance for engineering in Biology and Medicine / Alliance pour le génie biologique et médical), the SFGBM (BioMedical Engineering French Society / Société française de génie biologique médical), and the AFIB (French Association of Biomedical Engineers / Association française des ingénieurs biomédicaux). As a vehicle of information and knowledge in the field of biomedical technologies, IRBM is devoted to fundamental as well as clinical research. Biomedical engineering and the use of new technologies are the cornerstones of IRBM, providing authors and users with the latest information. Its six issues per year publish reviews (state of the art and current knowledge), original articles directed at fundamental research, and articles focusing on biomedical engineering. All articles are submitted to peer reviewers, who act as guarantors of IRBM's scientific and medical content. IRBM covers all disciplines of biomedical engineering. Published papers include those covering technological and methodological developments in: physiological and biological signal processing (EEG, MEG, ECG…); medical image processing; biomechanics; biomaterials; medical physics; biophysics; physiological and biological sensors; information technologies in healthcare; disability research; computational physiology; …