Reply to: Optimizing external validation of deep neural networks for interictal discharge detection

IF 6.6 · Q1 CLINICAL NEUROLOGY (CAS Tier 1, Medicine)
Epilepsia · Pub Date: 2025-05-23 · DOI: 10.1111/epi.18411
Marleen C. Tjepkema-Cloostermans, Michel J. A. M. van Putten
{"title":"深度神经网络间歇放电检测的外部验证优化。","authors":"Marleen C. Tjepkema-Cloostermans,&nbsp;Michel J. A. M. van Putten","doi":"10.1111/epi.18411","DOIUrl":null,"url":null,"abstract":"<p>We appreciate the opportunity to respond to the letter regarding our study.<span><sup>1</sup></span> Although we welcome constructive discussion, the concerns raised reflect a misunderstanding of our methodology and the principles of external validation.</p><p>First, external validation assesses a model's generalizability in an independent clinical setting, and our study followed best practices by evaluating the deep learning model on an external dataset. The suggestion that our process introduced confirmation bias is misleading. Expert review of artificial intelligence (AI)-detected events is a widely accepted practice in clinical AI validation, particularly in the absence of a universally accepted gold standard for interictal epileptiform discharge (IED) detection.<span><sup>2</sup></span> Our inclusion of a multiexpert panel further strengthens the validation process and mitigates individual bias.</p><p>Second, although one of the original model developers participated in reviewing IEDs flagged by the model, final adjudication was conducted by a panel of five experts. This is a standard and accepted approach in electroencephalographic (EEG) studies, where interrater agreement naturally varies.<span><sup>3</sup></span> The assertion that this process introduces bias disregards that clinical neurophysiology often relies on expert consensus in the absence of a definitive ground truth.</p><p>Third, the claim that two authors who achieved perfect agreement (Cohen <i>κ</i> = 1.0) were involved in both training and external validation, indicating a lack of “assessor independence,” misrepresents our study. The interrater variability in the internal validation set ranged from .71 to 1. The one pair of experts who achieved perfect agreement were from different institutions, each with &gt;20 years of experience in EEG interpretation. Their high <i>κ</i> value reflects expertise, not a lack of independence. Furthermore, only one of them was involved in data labeling for training, internal validation, and the external validation panel, whereas the other was involved solely in internal validation.</p><p>Fourth, the suggestion that our validation process is susceptible to overfitting seems to stem from an essential misunderstanding of the concept, as overfitting pertains to the training phase rather than validation. Overfitting is a phenomenon occurring during training when a model is excessively tailored to a specific dataset, reducing its ability to generalize. External validation, by definition, does not involve retraining, making such concerns both misplaced and irrelevant.<span><sup>4</sup></span></p><p>Finally, speculation about potential commercial influence is both unsubstantiated and misleading. Our study was conducted independently, with all affiliations transparently disclosed. Although commercial entities contribute to AI development in medicine, this does not inherently compromise scientific integrity when conflicts of interest are properly managed,<span><sup>5</sup></span> as was the case in our study. 
The insinuation of bias lacks supporting evidence and disregards the fundamental principles of independent scientific inquiry.</p><p>In conclusion, our study offers a rigorous and clinically meaningful external validation of AI-based IED detection, adhering to best practices in clinical neurophysiology and AI validation.<span><sup>1</sup></span> We appreciate the opportunity to further substantiate these points. We welcome others to evaluate our AI system using their own external datasets.</p><p>M.J.A.M.v.P. is cofounder of Clinical Science Systems, a supplier of EEG systems for Medisch Spectrum Twente. Clinical Science Systems offered no funding and was not involved in the design, execution, analysis, interpretation, or publication of the study. M.C.T.-C. has no conflict of interest. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.</p>","PeriodicalId":11768,"journal":{"name":"Epilepsia","volume":"66 7","pages":"2598-2599"},"PeriodicalIF":6.6000,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/epi.18411","citationCount":"0","resultStr":"{\"title\":\"Reply to: Optimizing external validation of deep neural networks for interictal discharge detection\",\"authors\":\"Marleen C. Tjepkema-Cloostermans,&nbsp;Michel J. A. M. van Putten\",\"doi\":\"10.1111/epi.18411\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>We appreciate the opportunity to respond to the letter regarding our study.<span><sup>1</sup></span> Although we welcome constructive discussion, the concerns raised reflect a misunderstanding of our methodology and the principles of external validation.</p><p>First, external validation assesses a model's generalizability in an independent clinical setting, and our study followed best practices by evaluating the deep learning model on an external dataset. The suggestion that our process introduced confirmation bias is misleading. Expert review of artificial intelligence (AI)-detected events is a widely accepted practice in clinical AI validation, particularly in the absence of a universally accepted gold standard for interictal epileptiform discharge (IED) detection.<span><sup>2</sup></span> Our inclusion of a multiexpert panel further strengthens the validation process and mitigates individual bias.</p><p>Second, although one of the original model developers participated in reviewing IEDs flagged by the model, final adjudication was conducted by a panel of five experts. This is a standard and accepted approach in electroencephalographic (EEG) studies, where interrater agreement naturally varies.<span><sup>3</sup></span> The assertion that this process introduces bias disregards that clinical neurophysiology often relies on expert consensus in the absence of a definitive ground truth.</p><p>Third, the claim that two authors who achieved perfect agreement (Cohen <i>κ</i> = 1.0) were involved in both training and external validation, indicating a lack of “assessor independence,” misrepresents our study. The interrater variability in the internal validation set ranged from .71 to 1. The one pair of experts who achieved perfect agreement were from different institutions, each with &gt;20 years of experience in EEG interpretation. Their high <i>κ</i> value reflects expertise, not a lack of independence. 
Furthermore, only one of them was involved in data labeling for training, internal validation, and the external validation panel, whereas the other was involved solely in internal validation.</p><p>Fourth, the suggestion that our validation process is susceptible to overfitting seems to stem from an essential misunderstanding of the concept, as overfitting pertains to the training phase rather than validation. Overfitting is a phenomenon occurring during training when a model is excessively tailored to a specific dataset, reducing its ability to generalize. External validation, by definition, does not involve retraining, making such concerns both misplaced and irrelevant.<span><sup>4</sup></span></p><p>Finally, speculation about potential commercial influence is both unsubstantiated and misleading. Our study was conducted independently, with all affiliations transparently disclosed. Although commercial entities contribute to AI development in medicine, this does not inherently compromise scientific integrity when conflicts of interest are properly managed,<span><sup>5</sup></span> as was the case in our study. The insinuation of bias lacks supporting evidence and disregards the fundamental principles of independent scientific inquiry.</p><p>In conclusion, our study offers a rigorous and clinically meaningful external validation of AI-based IED detection, adhering to best practices in clinical neurophysiology and AI validation.<span><sup>1</sup></span> We appreciate the opportunity to further substantiate these points. We welcome others to evaluate our AI system using their own external datasets.</p><p>M.J.A.M.v.P. is cofounder of Clinical Science Systems, a supplier of EEG systems for Medisch Spectrum Twente. Clinical Science Systems offered no funding and was not involved in the design, execution, analysis, interpretation, or publication of the study. M.C.T.-C. has no conflict of interest. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.</p>\",\"PeriodicalId\":11768,\"journal\":{\"name\":\"Epilepsia\",\"volume\":\"66 7\",\"pages\":\"2598-2599\"},\"PeriodicalIF\":6.6000,\"publicationDate\":\"2025-05-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1111/epi.18411\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Epilepsia\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/epi.18411\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"CLINICAL NEUROLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Epilepsia","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/epi.18411","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CLINICAL NEUROLOGY","Score":null,"Total":0}

We appreciate the opportunity to respond to the letter regarding our study.1 Although we welcome constructive discussion, the concerns raised reflect a misunderstanding of our methodology and the principles of external validation.

First, external validation assesses a model's generalizability in an independent clinical setting, and our study followed best practices by evaluating the deep learning model on an external dataset. The suggestion that our process introduced confirmation bias is misleading. Expert review of artificial intelligence (AI)-detected events is a widely accepted practice in clinical AI validation, particularly in the absence of a universally accepted gold standard for interictal epileptiform discharge (IED) detection.2 Our inclusion of a multiexpert panel further strengthens the validation process and mitigates individual bias.
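In practical terms, external validation amounts to running the already-trained detector on an independent dataset and scoring its output against expert labels. The following is a minimal sketch of such an evaluation; the model interface, threshold, and data structures are illustrative assumptions, not our actual pipeline:

```python
import numpy as np

def score_external_dataset(model, recordings, expert_labels, threshold=0.5):
    """Score a frozen IED detector on an independent external dataset.

    No training occurs here: the model is used for inference only, and its
    detections are compared against expert-adjudicated labels. `model`,
    `recordings`, `expert_labels` (boolean arrays per recording), and
    `threshold` are illustrative placeholders.
    """
    tp = fp = fn = 0
    for eeg, labels in zip(recordings, expert_labels):
        scores = model.predict(eeg)          # per-epoch IED probabilities
        detections = scores >= threshold     # binarize at a fixed threshold
        tp += np.sum(detections & labels)
        fp += np.sum(detections & ~labels)
        fn += np.sum(~detections & labels)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision
```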

Second, although one of the original model developers participated in reviewing IEDs flagged by the model, final adjudication was conducted by a panel of five experts. This is a standard and accepted approach in electroencephalographic (EEG) studies, where interrater agreement naturally varies.3 The assertion that this process introduces bias disregards that clinical neurophysiology often relies on expert consensus in the absence of a definitive ground truth.
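To illustrate why panel adjudication limits any single reviewer's influence, consider a simple majority-vote rule over five independent ratings. This is a hypothetical sketch, not a description of our exact adjudication protocol:

```python
def adjudicate(votes_per_event):
    """Final label per candidate IED from a five-expert panel.

    Each inner list holds five independent 0/1 votes for one event; the
    event counts as an IED only if at least three experts agree. Purely
    illustrative: real panels may also discuss discordant events.
    """
    return [sum(votes) >= 3 for votes in votes_per_event]

# Three of five experts mark the first event, only one marks the second.
print(adjudicate([[1, 1, 1, 0, 0], [0, 1, 0, 0, 0]]))  # [True, False]
```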

Third, the claim that two authors who achieved perfect agreement (Cohen κ = 1.0) were involved in both training and external validation, indicating a lack of “assessor independence,” misrepresents our study. Interrater agreement (κ) in the internal validation set ranged from .71 to 1.0. The pair of experts who achieved perfect agreement were from different institutions, each with >20 years of experience in EEG interpretation. Their high κ value reflects expertise, not a lack of independence. Furthermore, only one of them was involved in data labeling for training, internal validation, and the external validation panel, whereas the other was involved solely in internal validation.
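For readers unfamiliar with the statistic: Cohen's κ = (p_o − p_e)/(1 − p_e) corrects observed agreement p_o for chance agreement p_e, so κ = 1.0 simply means two raters never disagreed on the events they reviewed. A short, self-contained illustration with invented ratings (scikit-learn's `cohen_kappa_score` implements the statistic):

```python
from sklearn.metrics import cohen_kappa_score

# Two experts independently label eight candidate events (1 = IED, 0 = not).
# The ratings below are invented solely for illustration.
rater_a = [1, 0, 1, 1, 0, 0, 1, 0]
rater_b = [1, 0, 1, 1, 0, 0, 1, 0]  # identical ratings
rater_c = [1, 0, 0, 1, 0, 1, 1, 0]  # disagrees on two of eight events

print(cohen_kappa_score(rater_a, rater_b))  # 1.0: perfect agreement
print(cohen_kappa_score(rater_a, rater_c))  # 0.5: agreement beyond chance
```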

Fourth, the suggestion that our validation process is susceptible to overfitting seems to stem from an essential misunderstanding of the concept, as overfitting pertains to the training phase rather than validation. Overfitting is a phenomenon occurring during training when a model is excessively tailored to a specific dataset, reducing its ability to generalize. External validation, by definition, does not involve retraining, making such concerns both misplaced and irrelevant.4
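The distinction is easy to state in code: overfitting requires optimization steps on the data in question, whereas external validation touches no parameters. A minimal PyTorch-style sketch, in which the model and data loader are assumed placeholders for illustration:

```python
import torch

def external_inference(model, external_loader):
    """External validation pass: inference only, no parameter updates.

    Overfitting can only arise where gradients update weights; here the
    network is frozen, so the external data cannot be 'fit' at all.
    `model` and `external_loader` are illustrative placeholders.
    """
    model.eval()                    # fix dropout/batch-norm behavior
    outputs = []
    with torch.no_grad():           # no gradients, hence no weight changes
        for eeg_batch, _ in external_loader:
            outputs.append(model(eeg_batch))
    return torch.cat(outputs)
```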

Finally, speculation about potential commercial influence is both unsubstantiated and misleading. Our study was conducted independently, with all affiliations transparently disclosed. Although commercial entities contribute to AI development in medicine, this does not inherently compromise scientific integrity when conflicts of interest are properly managed,5 as was the case in our study. The insinuation of bias lacks supporting evidence and disregards the fundamental principles of independent scientific inquiry.

In conclusion, our study offers a rigorous and clinically meaningful external validation of AI-based IED detection, adhering to best practices in clinical neurophysiology and AI validation.1 We appreciate the opportunity to further substantiate these points. We welcome others to evaluate our AI system using their own external datasets.

M.J.A.M.v.P. is cofounder of Clinical Science Systems, a supplier of EEG systems for Medisch Spectrum Twente. Clinical Science Systems offered no funding and was not involved in the design, execution, analysis, interpretation, or publication of the study. M.C.T.-C. has no conflict of interest. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.
