Select for better learning: identifying high-quality training data for a multimodal cyclic transformer
Jingwei Zhang, Zhaoyi Liu, Christos Chatzichristos, Sam Michiels, Wim Van Paesschen, Danny Hughes, Maarten De Vos
Journal of Neural Engineering. Published 2025-03-25. DOI: 10.1088/1741-2552/adbec0 (https://doi.org/10.1088/1741-2552/adbec0)
Abstract
Objective. Tonic-clonic seizures (TCSs), which present a significant risk for sudden unexpected death in epilepsy, require accurate detection to enable effective long-term monitoring. Previous studies have demonstrated the advantages of multimodal seizure detection systems in reliably detecting TCSs over extended periods. However, the effectiveness of these data-driven systems depends heavily on the availability of reliable training data. Approach. To address this need, we propose an innovative data selection method designed to identify high-quality training samples. Our approach evaluates sample quality based on learning difficulty, classifying samples with lower learning difficulty as higher quality. We then introduce a confidence-based method to quantify the proportion of high-quality samples within the dataset. Main results. Experimental results show that our method improves the performance of a state-of-the-art TCS detection model by 11%. Significance. Using this data selection method, we develop a training pipeline that enhances the training process of multimodal seizure detection models.
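
The abstract does not specify how learning difficulty or confidence is computed, so the following is only an illustrative sketch and not the authors' method. It assumes that difficulty can be approximated as one minus the mean probability a model assigns to the correct label across training epochs, and that a fixed confidence threshold separates high-quality samples from the rest. The function names, the threshold of 0.7, and the toy numbers are all hypothetical.

```python
from statistics import mean


def learning_difficulty(true_class_probs):
    """Assumed proxy: 1 minus the mean probability the model gave the
    correct label over the recorded training epochs."""
    return 1.0 - mean(true_class_probs)


def select_high_quality(prob_history, confidence_threshold=0.7):
    """Return (indices of selected samples, fraction of dataset selected).

    prob_history[i] holds the per-epoch true-class probabilities of sample i.
    A sample is kept when its difficulty is at most 1 - confidence_threshold
    (hypothetical rule, chosen only for illustration).
    """
    max_difficulty = 1.0 - confidence_threshold
    selected = [
        i for i, probs in enumerate(prob_history)
        if learning_difficulty(probs) <= max_difficulty
    ]
    fraction = len(selected) / len(prob_history) if prob_history else 0.0
    return selected, fraction


# Toy history with made-up numbers: four samples, three epochs each.
history = [
    [0.6, 0.8, 0.9],  # learned quickly: low difficulty
    [0.2, 0.3, 0.3],  # never learned: high difficulty, possibly unreliable
    [0.5, 0.8, 0.9],  # learned steadily: low difficulty
    [0.4, 0.5, 0.6],  # stays uncertain: moderate difficulty
]
selected, fraction = select_high_quality(history)
print(selected, fraction)
```

In this toy run, the two samples whose mean confidence exceeds the threshold are kept, and the returned fraction plays the role of an estimated proportion of high-quality samples. A real pipeline would take the confidence values from the detection model's own training logs and would need a principled way to choose the threshold.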