Understanding the Impact of Individual Nucleotide on Oxford Nanopore Current Signals With Interpretable Prediction Models
Authors: Yenan Wang, Zhixing Wu, Jia Meng
Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251378620 (eCollection)
Published: 2025-09-22
DOI: 10.1177/11779322251378620 (https://doi.org/10.1177/11779322251378620)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12457769/pdf/
Impact factor: 2.4; JCR Q3 (Biochemical Research Methods)
Abstract
Oxford Nanopore sequencing enables real-time, long-read analysis of DNA by detecting ionic current signals associated with K-mer sequences. Although many studies have addressed sequence and modification detection, our understanding of how the multiple nucleotides of a K-mer jointly determine the nanopore signal is still limited. In this study, we seek to unveil the positional impact of individual nucleotides through interpretable prediction models. Multiple machine learning models were trained and optimized. To increase model interpretability and explore underlying mechanisms, SHapley Additive exPlanations (SHAP) was applied to assess both nucleotides and positions. Our results show that previously unseen Oxford Nanopore signals were accurately predicted, and results were consistent across two different modes (R² = 0.9984 for 260 bps, R² = 0.9983 for 400 bps; R10.4 flow cell, XGBoost). Thymine bases (T) at positions 6 and 7 were the most influential, while nucleotides at positions 1, 2, 3, 4, and 9 had minimal impact on signals. In addition, heatmap analysis of base transitions revealed the impact of individual nucleotides on signal changes in a position-specific manner. In brief, our work provides predictive and interpretable modeling of nanopore signals, concentrating on influential bases and positions among all obtainable features, which enhances understanding of nanopore sequencing mechanisms and nucleotide- and position-related signal variations.
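To make the idea of position-level Shapley attribution in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the abstract reports XGBoost models explained with SHAP, whereas this example swaps in a hand-written toy signal function and computes exact Shapley values by enumerating subsets of positions. The names `toy_signal`, `POSITION_SCALE`, `BASE_LEVEL`, and `shapley_by_position`, the all-A baseline, and every numeric weight are hypothetical, chosen only to mimic the qualitative pattern described (positions 6 and 7 dominant).

```python
from itertools import combinations
from math import factorial

# Hypothetical per-position weights for a 9-mer (index 0 = position 1).
# These numbers are made up for illustration; they are NOT from the paper.
POSITION_SCALE = [0.05, 0.05, 0.05, 0.05, 0.5, 2.0, 2.0, 0.5, 0.05]
# Hypothetical per-base contribution levels, also made up.
BASE_LEVEL = {"A": 0.0, "C": 0.3, "G": 0.6, "T": 1.0}


def toy_signal(kmer):
    """Toy stand-in for a trained signal predictor (assumption, not real data)."""
    total = sum(POSITION_SCALE[i] * BASE_LEVEL[b] for i, b in enumerate(kmer))
    # A small made-up interaction between positions 6 and 7.
    if kmer[5] == "T" and kmer[6] == "T":
        total += 0.5
    return total


def shapley_by_position(model, target, baseline):
    """Exact Shapley value of each position, relative to a baseline k-mer.

    A position that is 'present' takes its base from `target`;
    an 'absent' position takes its base from `baseline`.
    """
    n = len(target)

    def value(subset):
        mixed = "".join(target[i] if i in subset else baseline[i] for i in range(n))
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                members = set(coalition)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


attributions = shapley_by_position(toy_signal, "ACGTATTGC", "AAAAAAAAA")
for position, phi in enumerate(attributions, start=1):
    print(f"position {position}: {phi:+.3f}")
```

The attributions sum to the difference between the toy prediction for the target k-mer and for the baseline, which is the additivity property that SHAP values are built on. Exact enumeration is feasible here only because a 9-mer has 512 position subsets; for a real tree model, a dedicated explainer would be the practical choice.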
About the journal:
Bioinformatics and Biology Insights is an open access, peer-reviewed journal that considers articles on bioinformatics methods and their applications which must pertain to biological insights. All papers should be easily amenable to biologists and as such help bridge the gap between theories and applications.