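Enhancing cross-domain robustness in phonocardiogram signal classification using domain-invariant preprocessing and transfer learning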

IF 4.9 | Zone 2 (Medicine) | JCR Q1 | COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Arnab Maity, Goutam Saha
Computer Methods and Programs in Biomedicine, vol. 257, Article 108462. Published 2024-10-19. Journal Article.
DOI: 10.1016/j.cmpb.2024.108462
https://www.sciencedirect.com/science/article/pii/S0169260724004553
Citations: 0

Abstract

Enhancing cross-domain robustness in phonocardiogram signal classification using domain-invariant preprocessing and transfer learning

Background and objective:

Phonocardiogram (PCG) signal analysis is a non-invasive and cost-efficient approach for diagnosing cardiovascular diseases. Existing PCG-based approaches employ signal processing and machine learning (ML) for automatic disease detection. However, machine learning techniques are known to underperform in cross-corpus settings. Disease detection performance degrades drastically when the training and testing sets come from different PCG databases with varying data acquisition settings. This study investigates the impact of data acquisition parameter variations in PCG data across different databases and develops methods that are robust to these variations.

Methods:

To alleviate the effect of dataset-induced variations, the study combines three strategies: domain-invariant preprocessing, transfer learning, and domain-balanced variable hop fragment selection (DBVHFS). Domain-invariant preprocessing normalizes the PCG to reduce stethoscope- and environment-induced variations. Transfer learning uses a model pre-trained on diverse audio data, whose generalized feature representations reduce the impact of data variability. DBVHFS facilitates unbiased fine-tuning of the pre-trained model by balancing the training fragments across all domains, ensuring equal distribution from each class.
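The abstract does not spell out how DBVHFS works, so the following is only a minimal sketch of one plausible reading: each (domain, class) group is cut into fixed-length fragments, and the hop between fragments is shrunk for groups with little audio until every group can supply the same number of fragments. All names (`fragment_starts`, `choose_hop`, `balanced_fragments`), the database labels, the fragment length, and the target count are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Group = Tuple[str, str]  # (domain, class label); hypothetical grouping key


def fragment_starts(n_samples: int, frag_len: int, hop: int) -> List[int]:
    """Start indices of fixed-length fragments taken every `hop` samples."""
    if n_samples < frag_len:
        return []
    return list(range(0, n_samples - frag_len + 1, hop))


def candidates(lengths: List[int], frag_len: int, hop: int) -> List[Tuple[int, int]]:
    """All (recording index, start sample) pairs for one group at a given hop."""
    return [(rec, s) for rec, n in enumerate(lengths)
            for s in fragment_starts(n, frag_len, hop)]


def choose_hop(lengths: List[int], frag_len: int, target: int) -> int:
    """Largest hop (at most frag_len) that yields at least `target` fragments."""
    for hop in range(frag_len, 0, -1):
        if len(candidates(lengths, frag_len, hop)) >= target:
            return hop
    return 1  # group is too small even with maximal overlap


def balanced_fragments(groups: Dict[Group, List[int]], frag_len: int,
                       target: int) -> Dict[Group, Tuple[int, List[Tuple[int, int]]]]:
    """Pick up to `target` fragments per group, using a per-group hop."""
    selection = {}
    for key, lengths in groups.items():
        hop = choose_hop(lengths, frag_len, target)
        cands = candidates(lengths, frag_len, hop)
        if len(cands) > target:
            # Subsample evenly so the picks span all recordings in the group.
            cands = [cands[i * len(cands) // target] for i in range(target)]
        selection[key] = (hop, cands)
    return selection


# Illustrative recording lengths in samples (made-up numbers).
groups = {
    ("db_a", "normal"): [10000, 12000],
    ("db_a", "abnormal"): [5000],
    ("db_b", "normal"): [30000],
    ("db_b", "abnormal"): [8000, 3000],
}
sel = balanced_fragments(groups, frag_len=1000, target=20)
for key, (hop, frags) in sel.items():
    print(key, "hop:", hop, "fragments:", len(frags))
```

In this sketch, groups with less audio receive a smaller hop (more overlap) so that every domain and class contributes the same number of fragments to fine-tuning. Whether the authors balance by overlap, by subsampling, or by both is not stated in the abstract.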

Results:

The proposed method is evaluated on six independent PhysioNet/CinC Challenge 2016 PCG databases using leave-one-dataset-out cross-validation. Results indicate that our system outperforms the existing study with a relative improvement of 5.92% in unweighted average recall and 17.71% in sensitivity.
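As a rough illustration of the evaluation protocol named above, the sketch below builds leave-one-dataset-out folds and computes unweighted average recall (the mean of per-class recalls), sensitivity (recall of the abnormal class), and a relative improvement. The function names, database labels, and all numbers are illustrative assumptions; none of them are results or code from the paper.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_dataset_out(
    dataset_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held-out dataset, train indices, test indices) for each dataset."""
    for held_out in sorted(set(dataset_ids)):
        train = [i for i, d in enumerate(dataset_ids) if d != held_out]
        test = [i for i, d in enumerate(dataset_ids) if d == held_out]
        yield held_out, train, test


def recall_per_class(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """Recall for each class that occurs in y_true."""
    recalls = {}
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls[c] = hits / len(idx)
    return recalls


def unweighted_average_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recalls; each class counts equally regardless of size."""
    r = recall_per_class(y_true, y_pred)
    return sum(r.values()) / len(r)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Made-up labels: 1 = abnormal, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
recalls = recall_per_class(y_true, y_pred)
sensitivity = recalls[1]  # recall of the abnormal class
uar = unweighted_average_recall(y_true, y_pred)

# Made-up database label for each recording.
folds = list(leave_one_dataset_out(["a", "a", "b", "c"]))
```

For a two-class problem, unweighted average recall is the mean of sensitivity and specificity, so it is not inflated when one class dominates a held-out database.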

Conclusions:

The methods proposed in this study address variations in PCG data originating from different sources, potentially making automated cardiac screening systems more feasible to deploy in real-life scenarios.
Source journal
Computer Methods and Programs in Biomedicine (Engineering & Technology: Biomedical Engineering)
CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days
Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.