Understanding the Impact of Individual Nucleotide on Oxford Nanopore Current Signals With Interpretable Prediction Models.

IF 2.4 Q3 BIOCHEMICAL RESEARCH METHODS
Bioinformatics and Biology Insights Pub Date : 2025-09-22 eCollection Date: 2025-01-01 DOI:10.1177/11779322251378620
Yenan Wang, Zhixing Wu, Jia Meng
Bioinformatics and Biology Insights, volume 19, article 11779322251378620. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12457769/pdf/
Citations: 0

Abstract

Understanding the Impact of Individual Nucleotide on Oxford Nanopore Current Signals With Interpretable Prediction Models.

Oxford Nanopore sequencing enables real-time, long-read analysis of DNA by detecting ionic current signals associated with k-mer sequences. Although many studies have addressed sequence and modification detection, how the multiple nucleotides of a k-mer jointly determine the nanopore signal remains poorly understood. In this study, we seek to reveal the positional impact of individual nucleotides through interpretable prediction models. Multiple machine learning models were trained and optimized. To increase model interpretability and explore underlying mechanisms, SHapley Additive exPlanations (SHAP) was applied to assess both nucleotides and positions. Previously unseen Oxford Nanopore signals were predicted accurately, and results were consistent across two modes (R² = 0.9984 for 260 bps, R² = 0.9983 for 400 bps; R10.4 flow cell, XGBoost). Thymine bases (T) at positions 6 and 7 were the most influential, while nucleotides at positions 1, 2, 3, 4, and 9 had minimal impact on signals. In addition, heatmap analysis of base transitions revealed the impact of individual nucleotides on signal changes in a position-specific manner. In brief, our work provides predictive and interpretable modeling of nanopore signals, concentrating on the most influential bases and positions among all available features, which enhances understanding of nanopore sequencing mechanisms and of nucleotide- and position-related signal variation.
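The abstract does not say how k-mers were turned into model features, so the following is only a minimal sketch under stated assumptions: 9-mers (suggested by the reference to positions 1 to 9), a plain A/C/G/T alphabet, and a position-by-base one-hot encoding. The helper names `one_hot_kmer`, `feature_names`, and `r_squared` are hypothetical and are not taken from the paper. The sketch shows how a feature such as "T at position 6" could be represented for a tree model and SHAP-style attribution, and how the reported R² metric is computed.

```python
from typing import List, Sequence

BASES = "ACGT"  # assumed alphabet; modified bases are out of scope here


def one_hot_kmer(kmer: str) -> List[int]:
    """Encode a k-mer as a flat position-by-base indicator vector."""
    features: List[int] = []
    for base in kmer.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        # Four indicator slots per position, in A, C, G, T order.
        features.extend(1 if base == b else 0 for b in BASES)
    return features


def feature_names(k: int) -> List[str]:
    """Names such as 'pos6_T', with 1-based positions as in the abstract."""
    return [f"pos{i}_{b}" for i in range(1, k + 1) for b in BASES]


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    vec = one_hot_kmer("ACGTACGTA")
    names = feature_names(9)
    # List the indicator features that are switched on for this 9-mer.
    print([n for n, v in zip(names, vec) if v])
```

With this layout each of the 36 columns corresponds to one base at one position, so per-feature attributions from an explainer can be read directly as base-at-position effects, or summed over the four bases to give a per-position score. Whether the authors used this exact encoding is not stated in the abstract.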

Source journal
Bioinformatics and Biology Insights (Biochemical Research Methods)
CiteScore: 6.80
Self-citation rate: 1.70%
Articles published: 36
Review time: 8 weeks
Journal description: Bioinformatics and Biology Insights is an open access, peer-reviewed journal that considers articles on bioinformatics methods and their applications, which must pertain to biological insights. All papers should be easily accessible to biologists and as such help bridge the gap between theories and applications.