Speech as a Biomarker for Disease Detection
Catarina Botelho, Alberto Abad, Tanja Schultz, Isabel Trancoso
arXiv - EE - Audio and Speech Processing, 2024-09-16 (arxiv-2409.10230)
Abstract
Speech is a rich biomarker that encodes substantial information about the
health of a speaker, and thus it has been proposed for the detection of
numerous diseases, achieving promising results. However, questions remain about
what the models trained to detect these diseases automatically are actually
learning and what their predictions are based on, and those predictions can
significantly impact patients' lives. This work advocates for an interpretable health model,
suitable for detecting several diseases, motivated by the observation that
speech-affecting disorders often have overlapping effects on speech signals. A
framework is presented that first defines "reference speech" and then leverages
this definition for disease detection. Reference speech is characterized
through reference intervals, i.e., the typical values of clinically meaningful
acoustic and linguistic features derived from a reference population. This
novel approach in the field of speech as a biomarker is inspired by the use of
reference intervals in clinical laboratory science. Deviations of new speakers
from this reference model are quantified and used as input to detect
Alzheimer's and Parkinson's disease. The classification strategy explored is
based on Neural Additive Models, a type of glass-box neural network, which
enables interpretability. The proposed framework for reference speech
characterization and disease detection is designed to support the medical
community by providing clinically meaningful explanations that can serve as a
valuable second opinion.
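
As a rough illustration of the reference-interval idea described above, the following Python sketch estimates an interval for a single feature from a reference population and scores how far a new speaker falls outside it. It is not the authors' implementation. The central 95% percentile range follows the usual clinical laboratory convention, while the deviation measure (signed distance to the nearest bound, in units of interval width), the function names, and the example feature values are all assumptions made for illustration.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile of pre-sorted values, with q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def reference_interval(values, coverage=0.95):
    """Return (low, high) bounds enclosing the central `coverage` share of values."""
    s = sorted(values)
    tail = (1 - coverage) / 2
    return percentile(s, tail), percentile(s, 1 - tail)


def deviation(value, interval):
    """Assumed deviation measure: 0.0 inside the interval, otherwise the signed
    distance to the nearest bound divided by the interval width."""
    low, high = interval
    width = high - low
    if width <= 0:
        raise ValueError("reference interval must have positive width")
    if value < low:
        return (value - low) / width
    if value > high:
        return (value - high) / width
    return 0.0


# Hypothetical feature values from a reference population (not real data).
reference_values = list(range(0, 101))
interval = reference_interval(reference_values)

# One deviation per new speaker; a vector of such deviations over many
# features is the kind of input a downstream classifier could consume.
inside = deviation(50, interval)
above = deviation(107, interval)
below = deviation(-7, interval)
```

A deviation of zero means the speaker is within the reference range for that feature, so each nonzero entry points to a specific, nameable feature that departs from reference speech. That property is what makes this kind of input a natural fit for an additive, per-feature classifier.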