Self-supervised learning for intelligent disease diagnosis using audio signals: beyond COPD to a spectrum of diseases

IF 3.5 | Zone 2, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Intelligence Pub Date : 2025-02-27 DOI:10.1007/s10489-024-06028-2

Wenchao Sun, Gang Wu, Ming Ming, Jiameng Zhang, Chun Shi, Linlin Qin

Volume 55, Issue 6 | Article page: https://link.springer.com/article/10.1007/s10489-024-06028-2

Citations: 0

Abstract



Given the widespread prevalence of COPD (chronic obstructive pulmonary disease) and its large patient base, developing simple and rapid diagnostic methods has become a key research focus. Pathological studies have led the medical community to recognize the potential of cough sounds for diagnosing COPD, and this has sparked interest in using deep learning to analyze disease-related sounds, including those associated with COVID-19 and cardiac conditions. Yet research specifically targeting COPD remains scarce, primarily because of two challenges. First, stringent data-privacy and collection requirements in healthcare keep medical datasets small, and traditional models trained on such datasets often fall short of expectations. Second, publicly accessible COPD datasets are scarce, particularly ones that can be collected without medical equipment. To address these challenges, our paper introduces a novel dataset of smartphone-recorded cough sounds, termed the CC (COPD-Cough) dataset. It comprises 221 recordings from COPD patients and 632 from healthy individuals and is the first dataset explicitly curated for COPD cough sound analysis. Endorsed by clinical professionals and collected without medical devices, the dataset is intended to advance simple COPD diagnostics. Furthermore, we propose a self-supervised learning model enhanced by unique data augmentation techniques and an efficient sound feature extractor; it demonstrates superior performance across three distinct disease datasets and achieves state-of-the-art results. Comprehensive ablation studies affirm the model's efficacy, and sensitivity analyses tune its applicability to various tasks. The framework's source code and the dataset are available at https://github.com/auto-chao/COPD_Diagnosis and https://zenodo.org/records/10209837, respectively.
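The recording counts quoted in the abstract imply a noticeable class imbalance, which the short Python sketch below makes concrete. It only does arithmetic on the two reported numbers. The inverse-frequency weighting it prints is a common general heuristic, shown here as an assumption; the abstract does not say whether the authors weight classes or handle the imbalance in some other way. The names `CC_COUNTS`, `class_fractions`, and `balanced_class_weights` are hypothetical and chosen for illustration.

```python
# Recording counts reported in the abstract for the CC (COPD-Cough) dataset.
CC_COUNTS = {"copd": 221, "healthy": 632}


def class_fractions(counts):
    """Return each class's share of the total number of recordings."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * class_count).

    This is a generic heuristic, not a method taken from the paper.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


if __name__ == "__main__":
    print("total recordings:", sum(CC_COUNTS.values()))
    for label, frac in class_fractions(CC_COUNTS).items():
        print(f"{label}: {frac:.1%} of recordings")
    for label, weight in balanced_class_weights(CC_COUNTS).items():
        print(f"{label}: weight {weight:.3f}")
```

With these counts, COPD recordings make up roughly a quarter of the 853 total, so a classifier that always predicts "healthy" would already reach about 74% accuracy. Metrics such as sensitivity and specificity are therefore more informative than raw accuracy on this data.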


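Source journal

Applied Intelligence (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 6.60

Self-citation rate: 20.80%

Articles published: 1361

Review time: 5.9 months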

Journal description: With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.