Reply to: Optimizing external validation of deep neural networks for interictal discharge detection

Marleen C. Tjepkema-Cloostermans, Michel J. A. M. van Putten
We appreciate the opportunity to respond to the letter regarding our study.1 Although we welcome constructive discussion, the concerns raised reflect a misunderstanding of our methodology and the principles of external validation.
First, external validation assesses a model's generalizability in an independent clinical setting, and our study followed best practices by evaluating the deep learning model on an external dataset. The suggestion that our process introduced confirmation bias is misleading. Expert review of artificial intelligence (AI)-detected events is a widely accepted practice in clinical AI validation, particularly in the absence of a universally accepted gold standard for interictal epileptiform discharge (IED) detection.2 Our inclusion of a multiexpert panel further strengthens the validation process and mitigates individual bias.
Second, although one of the original model developers participated in reviewing IEDs flagged by the model, final adjudication was conducted by a panel of five experts. This is a standard and accepted approach in electroencephalographic (EEG) studies, where interrater agreement naturally varies.3 The assertion that this process introduces bias disregards that clinical neurophysiology often relies on expert consensus in the absence of a definitive ground truth.
Third, the claim that two authors who achieved perfect agreement (Cohen κ = 1.0) were involved in both training and external validation, indicating a lack of “assessor independence,” misrepresents our study. Interrater agreement (Cohen κ) in the internal validation set ranged from .71 to 1. The one pair of experts who achieved perfect agreement came from different institutions, each with >20 years of experience in EEG interpretation. Their high κ value reflects expertise, not a lack of independence. Furthermore, only one of them was involved in data labeling for training, internal validation, and the external validation panel, whereas the other was involved solely in internal validation.
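For readers less familiar with the statistic, the sketch below shows what a pairwise Cohen κ of 1.0 does and does not mean: it simply indicates that two raters assigned identical labels to the same epochs. The label vectors are hypothetical illustrations and are not drawn from our dataset.

```python
# Minimal illustration of pairwise Cohen's kappa for IED annotation.
# The label vectors are hypothetical stand-ins, not study data:
# 1 = epoch marked as containing an IED, 0 = no IED.
from sklearn.metrics import cohen_kappa_score

rater_a = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
rater_b = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]  # identical labels
rater_c = [1, 0, 0, 1, 0, 0, 1, 0, 1, 0]  # two disagreements

print(cohen_kappa_score(rater_a, rater_b))  # 1.0: perfect agreement
print(cohen_kappa_score(rater_a, rater_c))  # ~0.58: agreement beyond chance, but imperfect
```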
Fourth, the suggestion that our validation process is susceptible to overfitting seems to stem from a fundamental misunderstanding of the concept, as overfitting pertains to the training phase rather than validation. Overfitting occurs during training when a model is excessively tailored to a specific dataset, reducing its ability to generalize to new data. External validation, by definition, does not involve retraining, making such concerns both misplaced and irrelevant.4
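To make the distinction concrete, the sketch below illustrates what external validation entails computationally: a previously trained model is frozen and only scored on the new data, so its parameters cannot adapt to, let alone overfit, the external set. The tiny network, random tensors, and file name are placeholders for illustration, not the actual IED detector or external EEG recordings.

```python
# External validation sketch: the trained model is only *evaluated*;
# no gradient steps are taken, so the external data cannot change its weights.
# All objects below are toy placeholders, not the study's detector or data.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 1))  # stand-in detector
# In practice the frozen weights would be loaded, e.g.:
# model.load_state_dict(torch.load("ied_detector.pt"))  # hypothetical file name
model.eval()  # inference mode only

external_epochs = torch.randn(200, 128)                   # placeholder external EEG epochs
external_labels = torch.randint(0, 2, (200, 1)).float()   # placeholder expert labels

with torch.no_grad():  # no gradients -> no parameter updates -> no overfitting to this set
    probs = torch.sigmoid(model(external_epochs))
    preds = (probs > 0.5).float()
    sensitivity = (preds * external_labels).sum() / external_labels.sum()
    specificity = ((1 - preds) * (1 - external_labels)).sum() / (1 - external_labels).sum()

print(f"external sensitivity: {sensitivity.item():.2f}, specificity: {specificity.item():.2f}")
```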
Finally, speculation about potential commercial influence is both unsubstantiated and misleading. Our study was conducted independently, with all affiliations transparently disclosed. Although commercial entities contribute to AI development in medicine, this does not inherently compromise scientific integrity when conflicts of interest are properly managed,5 as was the case in our study. The insinuation of bias lacks supporting evidence and disregards the fundamental principles of independent scientific inquiry.
In conclusion, our study offers a rigorous and clinically meaningful external validation of AI-based IED detection, adhering to best practices in clinical neurophysiology and AI validation.1 We appreciate the opportunity to further substantiate these points. We welcome others to evaluate our AI system using their own external datasets.
M.J.A.M.v.P. is cofounder of Clinical Science Systems, a supplier of EEG systems for Medisch Spectrum Twente. Clinical Science Systems offered no funding and was not involved in the design, execution, analysis, interpretation, or publication of the study. M.C.T.-C. has no conflict of interest. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.