{"title":"深度学习模型在客观听觉脑干反应检测中的比较:一项多中心验证研究。","authors":"Yin Liu, Lingjie Xiang, Qiang Li, Kangkang Li, Yihan Yang, Tiantian Wang, Yuting Qin, Xinxing Fu, Yu Zhao, Chenqiang Gao","doi":"10.1177/23312165251347773","DOIUrl":null,"url":null,"abstract":"<p><p>Auditory brainstem response (ABR) interpretation in clinical practice often relies on visual inspection by audiologists, which is prone to inter-practitioner variability. While deep learning (DL) algorithms have shown promise in objectifying ABR detection in controlled settings, their applicability to real-world clinical data is hindered by small datasets and insufficient heterogeneity. This study evaluates the generalizability of nine DL models for ABR detection using large, multicenter datasets. The primary dataset analyzed, Clinical Dataset I, comprises 128,123 labeled ABRs from 13,813 participants across a wide range of ages and hearing levels, and was divided into a training set (90%) and a held-out test set (10%). The models included convolutional neural networks (CNNs; AlexNet, VGG, ResNet), transformer-based architectures (Transformer, Patch Time Series Transformer [PatchTST], Differential Transformer, and Differential PatchTST), and hybrid CNN-transformer models (ResTransformer, ResPatchTST). Performance was assessed on the held-out test set and four external datasets (Clinical II, Southampton, PhysioNet, Mendeley) using accuracy and area under the receiver operating characteristic curve (AUC). ResPatchTST achieved the highest performance on the held-out test set (accuracy: 91.90%, AUC: 0.976). Transformer-based models, particularly PatchTST, showed superior generalization to external datasets, maintaining robust accuracy across diverse clinical settings. Additional experiments highlighted the critical role of dataset size and diversity in enhancing model robustness. 
We also observed that incorporating acquisition parameters and demographic features as auxiliary inputs yielded performance gains in cross-center generalization. These findings underscore the potential of DL models-especially transformer-based architectures-for accurate and generalizable ABR detection, and highlight the necessity of large, diverse datasets in developing clinically reliable systems.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251347773"},"PeriodicalIF":3.0000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12134522/pdf/","citationCount":"0","resultStr":"{\"title\":\"Comparison of Deep Learning Models for Objective Auditory Brainstem Response Detection: A Multicenter Validation Study.\",\"authors\":\"Yin Liu, Lingjie Xiang, Qiang Li, Kangkang Li, Yihan Yang, Tiantian Wang, Yuting Qin, Xinxing Fu, Yu Zhao, Chenqiang Gao\",\"doi\":\"10.1177/23312165251347773\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Auditory brainstem response (ABR) interpretation in clinical practice often relies on visual inspection by audiologists, which is prone to inter-practitioner variability. While deep learning (DL) algorithms have shown promise in objectifying ABR detection in controlled settings, their applicability to real-world clinical data is hindered by small datasets and insufficient heterogeneity. This study evaluates the generalizability of nine DL models for ABR detection using large, multicenter datasets. The primary dataset analyzed, Clinical Dataset I, comprises 128,123 labeled ABRs from 13,813 participants across a wide range of ages and hearing levels, and was divided into a training set (90%) and a held-out test set (10%). 
The models included convolutional neural networks (CNNs; AlexNet, VGG, ResNet), transformer-based architectures (Transformer, Patch Time Series Transformer [PatchTST], Differential Transformer, and Differential PatchTST), and hybrid CNN-transformer models (ResTransformer, ResPatchTST). Performance was assessed on the held-out test set and four external datasets (Clinical II, Southampton, PhysioNet, Mendeley) using accuracy and area under the receiver operating characteristic curve (AUC). ResPatchTST achieved the highest performance on the held-out test set (accuracy: 91.90%, AUC: 0.976). Transformer-based models, particularly PatchTST, showed superior generalization to external datasets, maintaining robust accuracy across diverse clinical settings. Additional experiments highlighted the critical role of dataset size and diversity in enhancing model robustness. We also observed that incorporating acquisition parameters and demographic features as auxiliary inputs yielded performance gains in cross-center generalization. 
These findings underscore the potential of DL models-especially transformer-based architectures-for accurate and generalizable ABR detection, and highlight the necessity of large, diverse datasets in developing clinically reliable systems.</p>\",\"PeriodicalId\":48678,\"journal\":{\"name\":\"Trends in Hearing\",\"volume\":\"29 \",\"pages\":\"23312165251347773\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2025-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12134522/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Trends in Hearing\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1177/23312165251347773\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/6/3 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Trends in Hearing","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1177/23312165251347773","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/6/3 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY","Score":null,"Total":0}
Comparison of Deep Learning Models for Objective Auditory Brainstem Response Detection: A Multicenter Validation Study.
Auditory brainstem response (ABR) interpretation in clinical practice often relies on visual inspection by audiologists, which is prone to inter-practitioner variability. While deep learning (DL) algorithms have shown promise in objectifying ABR detection in controlled settings, their applicability to real-world clinical data is hindered by small datasets and insufficient heterogeneity. This study evaluates the generalizability of nine DL models for ABR detection using large, multicenter datasets. The primary dataset analyzed, Clinical Dataset I, comprises 128,123 labeled ABRs from 13,813 participants across a wide range of ages and hearing levels, and was divided into a training set (90%) and a held-out test set (10%). The models included convolutional neural networks (CNNs; AlexNet, VGG, ResNet), transformer-based architectures (Transformer, Patch Time Series Transformer [PatchTST], Differential Transformer, and Differential PatchTST), and hybrid CNN-transformer models (ResTransformer, ResPatchTST). Performance was assessed on the held-out test set and four external datasets (Clinical II, Southampton, PhysioNet, Mendeley) using accuracy and area under the receiver operating characteristic curve (AUC). ResPatchTST achieved the highest performance on the held-out test set (accuracy: 91.90%, AUC: 0.976). Transformer-based models, particularly PatchTST, showed superior generalization to external datasets, maintaining robust accuracy across diverse clinical settings. Additional experiments highlighted the critical role of dataset size and diversity in enhancing model robustness. We also observed that incorporating acquisition parameters and demographic features as auxiliary inputs yielded performance gains in cross-center generalization. 
These findings underscore the potential of DL models, especially transformer-based architectures, for accurate and generalizable ABR detection, and highlight the necessity of large, diverse datasets in developing clinically reliable systems.
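The two metrics reported in the abstract, accuracy and area under the ROC curve (AUC), are standard for binary detectors like these ABR-present/ABR-absent classifiers. As an illustrative sketch (not code from the paper; the data and threshold are hypothetical), both can be computed directly from model output probabilities, with AUC obtained via its Mann-Whitney interpretation:

```python
# Hedged sketch: computing accuracy and AUC for a binary
# ABR-present (1) / ABR-absent (0) detector. Example data are
# made up for illustration; the paper's pipeline is not shown.

def accuracy(labels, scores, threshold=0.5):
    """Fraction of waveforms classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic:
    the probability that a randomly chosen response-present trace
    scores higher than a response-absent one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]              # ground-truth presence labels
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]  # model output probabilities
```

Because AUC is threshold-free, it is the more robust of the two for comparing models across datasets with different class balances, which is relevant to the cross-center evaluation described above.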
Trends in Hearing · Audiology & Speech-Language Pathology · Otorhinolaryngology
CiteScore: 4.50
Self-citation rate: 11.10%
Articles published per year: 44
Review turnaround: 12 weeks
About the journal:
Trends in Hearing is an open access journal completely dedicated to publishing original research and reviews focusing on human hearing, hearing loss, hearing aids, auditory implants, and aural rehabilitation. Under its former name, Trends in Amplification, the journal established itself as a forum for concise explorations of all areas of translational hearing research by leaders in the field. Trends in Hearing has now expanded its focus to include original research articles, with the goal of becoming the premier venue for research related to human hearing and hearing loss.