Detection and Quantification of 5moU RNA Modification from Direct RNA Sequencing Data

Impact factor 16.4 · Zone 1 (Chemistry) · JCR Q1 (Chemistry, Multidisciplinary)
Jiayi Li, Feiyang Sun, Kunyang He, Lin Zhang, Jia Meng, Daiyun Huang, Yuxin Zhang
DOI: 10.2174/0113892029288843240402042529 (https://doi.org/10.2174/0113892029288843240402042529)
Published: 2024-04-17 · Journal Article

Abstract

Background: Chemically modified therapeutic mRNAs have gained momentum recently. In addition to commonly used modifications (e.g., pseudouridine), 5-methoxyuridine (5moU) is considered a promising substitute for uridine in therapeutic mRNAs. Accurate identification of 5moU is crucial for the study and quality control of in vitro-transcribed (IVT) mRNAs that carry it. However, current methods fall short of quantifying this modification. In this study, we use Oxford Nanopore direct RNA sequencing and present NanoML-5moU, a machine-learning framework designed specifically for read-level detection and quantification of 5moU modification in IVT data.

Method: Nanopore direct RNA sequencing data were collected from both 5moU-modified and unmodified control samples. Signal event characteristics (mean and median current intensity, standard deviation, and dwell time) were then analyzed and modeled. Classical machine-learning algorithms, namely Support Vector Machine (SVM), Random Forest (RF), and XGBoost, were used to detect 5moU modifications within NNUNN 5-mers (where N represents A, C, U, or G).

Result: Signal event features from each constituent base of the NNUNN 5-mers, combined with the XGBoost algorithm, performed strongly (maximum AUROC of 0.9567 on the "AGTTC" reference 5-mer dataset and minimum AUROC of 0.8113 on the "TGTGC" reference 5-mer dataset). This markedly exceeded the prevailing background error comparison model (ELIGOS, AUC 0.751 for site-level prediction). The model's performance was further validated on a series of curated datasets with customized modification ratios designed to emulate broader data patterns, demonstrating its general applicability to quality control of IVT mRNA vaccines. The NanoML-5moU framework is publicly available on GitHub (https://github.com/JiayiLi21/NanoML-5moU).

Conclusion: NanoML-5moU enables accurate read-level profiling of 5moU modification with nanopore direct RNA sequencing, making it a powerful tool for revealing signal patterns in IVT mRNAs.
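
The feature and evaluation steps described in the abstract can be illustrated with a small standard-library Python sketch. It is not the NanoML-5moU implementation: the function names, the 3012 Hz default sampling rate, and the 0.5 probability threshold are assumptions made for illustration, and the trained classifier (XGBoost in the abstract) is deliberately left out. The sketch shows how four per-base signal features across five bases give a 20-value vector, how AUROC can be computed from read-level scores, and how read-level calls could be aggregated into a modification ratio.

```python
from statistics import mean, median, pstdev


def base_features(current_samples, sample_rate_hz):
    """Summarise the raw current samples assigned to one base.

    Returns [mean, median, standard deviation, dwell time in seconds].
    """
    dwell_time = len(current_samples) / sample_rate_hz
    return [
        mean(current_samples),
        median(current_samples),
        pstdev(current_samples),
        dwell_time,
    ]


def kmer_feature_vector(events, sample_rate_hz=3012.0):
    """Concatenate the features of the five bases of one NNUNN 5-mer.

    `events` is a list of five lists of current samples, one per base.
    The default sampling rate is an assumption, not taken from the paper.
    """
    if len(events) != 5:
        raise ValueError("expected one event per base of the 5-mer")
    vector = []
    for samples in events:
        vector.extend(base_features(samples, sample_rate_hz))
    return vector  # 5 bases x 4 features = 20 values


def auroc(labels, scores):
    """AUROC as the probability that a modified read (label 1)
    scores higher than an unmodified read (label 0); ties count 0.5."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one read of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def modification_ratio(read_probabilities, threshold=0.5):
    """Fraction of reads called modified at a hypothetical threshold."""
    if not read_probabilities:
        raise ValueError("no reads supplied")
    called = sum(1 for p in read_probabilities if p >= threshold)
    return called / len(read_probabilities)
```

In use, each read covering a given 5-mer would contribute one 20-value vector to a classifier, and the classifier's per-read probabilities would feed `auroc` (against labels from the modified and control samples) and `modification_ratio`. The pairwise AUROC loop is quadratic in the number of reads, which is acceptable for a sketch but would normally be replaced by a rank-based computation on large datasets.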
Source journal: Accounts of Chemical Research (Chemistry, Multidisciplinary)
CiteScore: 31.40
Self-citation rate: 1.10%
Articles published: 312
Review time: 2 months
Journal description: Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author's own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance. Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research, affording the reader a closer look at the content and significance of an article. By providing a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure for the research.