{"title":"Derivative-Guided Dual-Attention Mechanisms in Patch Transformer for Efficient Automated Recognition of Auditory Brainstem Response Latency.","authors":"Yin Liu, Huanghong Sun, Qiang Li, Kangkang Li, Xinxing Fu, Hao Zhu, Tiecheng Song, Yue Zhao, Tiantian Wang, Chenqiang Gao","doi":"10.1109/TNSRE.2025.3558730","DOIUrl":null,"url":null,"abstract":"<p><p>Accurate recognition of auditory brainstem response (ABR) wave latencies is essential for clinical practice but remains a subjective and time-consuming process. Existing AI approaches face challenges in generalization, complexity, and semantic sparsity due to single sampling-point analysis. This study introduces the Derivative-Guided Patch Dual-Attention Transformer (Patch-DAT), a novel, lightweight, and generalizable deep learning (DL) model for the automated recognition of latencies for waves I, III, and V. Patch-DAT divides the ABR time series into overlapping patches to aggregate semantic information, better capturing local temporal patterns. Meanwhile, leveraging the fact that ABR waves occur at the zero crossing of the first derivative, Patch-DAT incorporates a first derivative-guided dual-attention mechanism to model global dependencies. Trained and validated on large-scale, diverse datasets from two hospitals, Patch-DAT(with a size of 0.36 MB) achieves accuracies of 92.29% and 98.07% at 0.1 ms and 0.2 ms error scales, respectively, on a held-out test set. It also performs well on an independent dataset with accuracies of 88.50% and 95.14%, demonstrating strong generalization across clinical settings. Ablation studies highlight the contributions of the patching strategy and dual-attention mechanisms. Compared to previous state-of-the-art DL models, Patch-DAT shows superior accuracy and reduced complexity, making it a promising solution for object recognition of ABR latencies. 
Additionally, we systematically investigate how sample size and data heterogeneity affect model generalization, indicating the importance of large, diverse datasets in training robust DL models. Future work will focus on expanding dataset diversity and improving model interpretability to further improve clinical relevance.</p>","PeriodicalId":13419,"journal":{"name":"IEEE Transactions on Neural Systems and Rehabilitation Engineering","volume":"PP ","pages":""},"PeriodicalIF":4.8000,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Neural Systems and Rehabilitation Engineering","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1109/TNSRE.2025.3558730","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
Citations: 0
Abstract
Accurate recognition of auditory brainstem response (ABR) wave latencies is essential for clinical practice but remains a subjective and time-consuming process. Existing AI approaches face challenges in generalization, complexity, and semantic sparsity due to single sampling-point analysis. This study introduces the Derivative-Guided Patch Dual-Attention Transformer (Patch-DAT), a novel, lightweight, and generalizable deep learning (DL) model for the automated recognition of latencies for waves I, III, and V. Patch-DAT divides the ABR time series into overlapping patches to aggregate semantic information, better capturing local temporal patterns. Meanwhile, leveraging the fact that ABR waves occur at the zero crossing of the first derivative, Patch-DAT incorporates a first derivative-guided dual-attention mechanism to model global dependencies. Trained and validated on large-scale, diverse datasets from two hospitals, Patch-DAT (with a size of 0.36 MB) achieves accuracies of 92.29% and 98.07% at 0.1 ms and 0.2 ms error scales, respectively, on a held-out test set. It also performs well on an independent dataset with accuracies of 88.50% and 95.14%, demonstrating strong generalization across clinical settings. Ablation studies highlight the contributions of the patching strategy and dual-attention mechanisms. Compared to previous state-of-the-art DL models, Patch-DAT shows superior accuracy and reduced complexity, making it a promising solution for objective recognition of ABR latencies. Additionally, we systematically investigate how sample size and data heterogeneity affect model generalization, underscoring the importance of large, diverse datasets in training robust DL models. Future work will focus on expanding dataset diversity and improving model interpretability to further enhance clinical relevance.
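The patching strategy described in the abstract can be illustrated with a minimal sketch. The patch length and stride below are illustrative assumptions, not values reported in the paper:

```python
import numpy as np

def make_patches(signal: np.ndarray, patch_len: int, stride: int) -> np.ndarray:
    """Split a 1-D time series into overlapping patches.

    Each patch aggregates patch_len consecutive samples; adjacent
    patches overlap whenever stride < patch_len.
    """
    n_patches = (len(signal) - patch_len) // stride + 1
    return np.stack([signal[i * stride : i * stride + patch_len]
                     for i in range(n_patches)])

# Illustrative example: a 1024-sample ABR trace split into
# 16-sample patches with a stride of 8 (50% overlap).
trace = np.arange(1024, dtype=float)
patches = make_patches(trace, patch_len=16, stride=8)
print(patches.shape)  # → (127, 16)
```

Each patch would then be embedded as one token, so attention operates over patch-level semantic units rather than single sampling points, which is how a patch Transformer mitigates the semantic sparsity of point-wise analysis.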
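The derivative cue can likewise be sketched: a wave peak coincides with a positive-to-negative zero crossing of the first difference. The sampling rate, amplitude threshold, and synthetic trace below are hypothetical and only demonstrate the zero-crossing principle, not the paper's derivative-guided attention mechanism itself:

```python
import numpy as np

def peak_latencies_ms(signal: np.ndarray, fs: float, min_amp: float = 0.1) -> np.ndarray:
    """Latencies (in ms) of candidate wave peaks, located where the
    first difference crosses zero from positive to negative."""
    d = np.diff(signal)
    idx = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    idx = idx[signal[idx] >= min_amp]  # discard tiny numerical crossings
    return idx / fs * 1000.0

# Synthetic trace: two Gaussian bumps centred at 1.5 ms and 5.5 ms,
# sampled at a hypothetical fs = 20 kHz over a 10 ms window.
fs = 20_000.0
t = np.arange(0.0, 0.010, 1.0 / fs)
y = (np.exp(-(t - 0.0015) ** 2 / 2e-8)
     + np.exp(-(t - 0.0055) ** 2 / 2e-8))
print(peak_latencies_ms(y, fs))  # two latencies, near 1.5 and 5.5 ms
```

In Patch-DAT this derivative information guides the dual-attention mechanism rather than serving as a stand-alone detector, but the sketch shows why the first derivative is an informative prior for locating wave I, III, and V latencies.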
Journal overview:
Rehabilitative and neural aspects of biomedical engineering, including functional electrical stimulation, acoustic dynamics, human performance measurement and analysis, nerve stimulation, electromyography, motor control and stimulation; and hardware and software applications for rehabilitation engineering and assistive devices.