Speech as a Biomarker for Disease Detection

arXiv - EE - Audio and Speech Processing | Pub Date: 2024-09-16 | DOI: arxiv-2409.10230

Catarina Botelho, Alberto Abad, Tanja Schultz, Isabel Trancoso

Citations: 0

Abstract

Speech as a Biomarker for Disease Detection

Speech is a rich biomarker that encodes substantial information about the health of a speaker, and thus it has been proposed for the detection of numerous diseases, achieving promising results. However, questions remain about what the models trained for the automatic detection of these diseases are actually learning and the basis for their predictions, which can significantly impact patients' lives. This work advocates for an interpretable health model, suitable for detecting several diseases, motivated by the observation that speech-affecting disorders often have overlapping effects on speech signals. A framework is presented that first defines "reference speech" and then leverages this definition for disease detection. Reference speech is characterized through reference intervals, i.e., the typical values of clinically meaningful acoustic and linguistic features derived from a reference population. This novel approach in the field of speech as a biomarker is inspired by the use of reference intervals in clinical laboratory science. Deviations of new speakers from this reference model are quantified and used as input to detect Alzheimer's and Parkinson's disease. The classification strategy explored is based on Neural Additive Models, a type of glass-box neural network, which enables interpretability. The proposed framework for reference speech characterization and disease detection is designed to support the medical community by providing clinically meaningful explanations that can serve as a valuable second opinion.
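The pipeline described in the abstract (reference intervals from a reference population, deviations of a new speaker, an additive glass-box classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the central 95% percentile range (a common convention in clinical laboratory practice), the width-normalised deviation score, the two feature names, the synthetic data, and the untrained random per-feature networks that only show the additive structure of a Neural Additive Model.

```python
import numpy as np


def reference_intervals(ref_features, lower_pct=2.5, upper_pct=97.5):
    """Per-feature (lower, upper) limits from a reference population.

    ref_features has shape (n_speakers, n_features). The central 95% range
    is an assumed convention, not a value stated in the abstract.
    """
    ref = np.asarray(ref_features, dtype=float)
    lower = np.percentile(ref, lower_pct, axis=0)
    upper = np.percentile(ref, upper_pct, axis=0)
    return lower, upper


def deviation_from_reference(x, lower, upper):
    """Signed distance outside the interval, in units of interval width.

    Zero inside the interval, negative below it, positive above it.
    This scoring rule is hypothetical; the paper may define it differently.
    """
    x = np.asarray(x, dtype=float)
    width = np.where(upper > lower, upper - lower, 1.0)  # guard zero width
    below = np.minimum(x - lower, 0.0)
    above = np.maximum(x - upper, 0.0)
    return (below + above) / width


def make_feature_net(rng, hidden=8):
    """One tiny UNTRAINED network f_i: R -> R (random weights, structure only)."""
    w1 = rng.normal(size=hidden)
    b1 = rng.normal(size=hidden)
    w2 = rng.normal(size=hidden)

    def f(d):
        h = np.maximum(w1 * d + b1, 0.0)  # ReLU hidden layer
        return float(h @ w2)

    return f


def additive_logit(deviations, feature_nets, bias=0.0):
    """Additive model: logit = bias + sum_i f_i(deviation_i).

    Returns the logit and the per-feature contributions, which are what
    make this kind of model inspectable feature by feature.
    """
    contributions = np.array([f(d) for f, d in zip(feature_nets, deviations)])
    return bias + contributions.sum(), contributions


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic reference population; the feature names are hypothetical.
    names = ["mean_pitch_hz", "speaking_rate_syll_per_s"]
    ref = rng.normal(loc=[120.0, 4.5], scale=[20.0, 0.8], size=(200, 2))
    lower, upper = reference_intervals(ref)

    new_speaker = np.array([185.0, 2.1])
    dev = deviation_from_reference(new_speaker, lower, upper)

    nets = [make_feature_net(rng) for _ in names]
    logit, contrib = additive_logit(dev, nets)
    for name, d, c in zip(names, dev, contrib):
        print(f"{name}: deviation={d:+.3f}, contribution={c:+.3f}")
    print(f"logit={logit:+.3f}")
```

Because the logit is a plain sum of per-feature terms, each feature's contribution can be read off or plotted on its own, which is the property the abstract refers to as glass-box interpretability. In a real system the per-feature networks would be trained on labelled data rather than drawn at random.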
