Data Augmentation for Seizure Prediction With Generative Diffusion Model

IF 4.9 · CAS Zone 3 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Cognitive and Developmental Systems Pub Date : 2024-10-31 DOI:10.1109/TCDS.2024.3489357

Kai Shu;Le Wu;Yuchang Zhao;Aiping Liu;Ruobing Qian;Xun Chen

IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 3, pp. 577-591. https://ieeexplore.ieee.org/document/10740033/

Citations: 0

Abstract

Data augmentation (DA) can significantly strengthen electroencephalogram (EEG)-based seizure prediction methods. However, existing DA approaches are merely linear transformations of the original data and cannot effectively explore the feature space to increase diversity. We therefore propose a novel diffusion-based DA method called DiffEEG. DiffEEG can fully explore the data distribution and generate highly diverse samples, offering extra information to classifiers. It involves two processes: the diffusion process and the denoising process. In the diffusion process, the model incrementally adds noise at different scales to the EEG input until it becomes random noise; in this way, a representation of the data is learned. In the denoising process, the model uses the learned knowledge to sample synthetic data from a random-noise input by gradually removing noise. The randomness of the input noise and the precise representation give the synthetic samples diversity while keeping them consistent in feature space. We compared DiffEEG with the original data and with down-sampling, sliding-window, and recombination methods, integrating each into five representative classifiers. The experiments demonstrate the effectiveness and generality of our method. With the contribution of DiffEEG, the multiscale CNN achieves state-of-the-art performance, with an average sensitivity, FPR, and AUC of 95.4%, 0.051/h, and 0.932 on the CHB-MIT database and 93.6%, 0.121/h, and 0.822 on the Kaggle database.
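The two processes named in the abstract can be illustrated with a minimal sketch of a generic denoising diffusion model in NumPy. This is not the DiffEEG implementation: the abstract does not give the noise schedule, the network, or the sampler, so the linear schedule, the step count, the `(channels, samples)` segment shape, and every function name below are assumptions made for illustration. In particular, `toy_noise_predictor` stands in for the trained network; it is the exact noise predictor only if the data were standard Gaussian, which real EEG is not.

```python
import numpy as np


def make_schedule(num_steps=200, beta_start=1e-4, beta_end=0.05):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative signal retention up to step t
    return betas, alphas, alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Diffusion process: jump from clean x0 directly to the noisy x_t.

    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    """
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps


def toy_noise_predictor(alpha_bars):
    """Placeholder for a trained network eps_theta(x_t, t).

    For data drawn from N(0, I), E[eps | x_t] = sqrt(1 - alpha_bar_t) * x_t,
    so this closure is exact only under that toy assumption.
    """
    def predict(xt, t):
        return np.sqrt(1.0 - alpha_bars[t]) * xt
    return predict


def reverse_sample(predict_noise, shape, betas, alphas, alpha_bars, rng):
    """Denoising process: start from pure noise and remove it step by step."""
    x = rng.standard_normal(shape)
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = predict_noise(x, t)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Fresh noise is injected at every step except the last one.
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    betas, alphas, alpha_bars = make_schedule()

    # Hypothetical EEG segment: 18 channels x 1024 samples of stand-in data.
    segment = rng.standard_normal((18, 1024))
    noisy, _ = forward_diffuse(segment, len(betas) - 1, alpha_bars, rng)

    synthetic = reverse_sample(
        toy_noise_predictor(alpha_bars), segment.shape, betas, alphas, alpha_bars, rng
    )
    print(noisy.shape, synthetic.shape)
```

In an actual augmentation pipeline, the placeholder predictor would be replaced by a network trained to predict `eps` from `(x_t, t)` on real EEG segments, and the outputs of `reverse_sample` would be added to the classifier's training set. Because each call starts from a different random input, repeated sampling yields different synthetic segments, which is the source of diversity the abstract refers to.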

Source journal

IEEE Transactions on Cognitive and Developmental Systems (Computer Science - Software)

CiteScore

7.20

Self-citation rate

10.00%

Articles published

170

About the journal: The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.