Multi-Task Estimation of Age and Cognitive Decline from Speech

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2021-06-06 DOI:10.1109/ICASSP39728.2021.9414642

Yilin Pan, Venkata Srikanth Nallanthighal, D. Blackburn, H. Christensen, Aki Härmä

{"title":"Multi-Task Estimation of Age and Cognitive Decline from Speech","authors":"Yilin Pan, Venkata Srikanth Nallanthighal, D. Blackburn, H. Christensen, Aki Härmä","doi":"10.1109/ICASSP39728.2021.9414642","DOIUrl":null,"url":null,"abstract":"Speech is a common physiological signal that can be affected by both ageing and cognitive decline. Often the effect can be confounding, as would be the case for people at, e.g., very early stages of cognitive decline due to dementia. Despite this, the automatic predictions of age and cognitive decline based on cues found in the speech signal are generally treated as two separate tasks. In this paper, multi-task learning is applied for the joint estimation of age and the Mini-Mental Status Evaluation criteria (MMSE) commonly used to assess cognitive decline. To explore the relationship between age and MMSE, two neural network architectures are evaluated: a SincNet-based end-to-end architecture, and a system comprising of a feature extractor followed by a shallow neural network. Both are trained with single-task or multi-task targets. To compare, an SVM-based regressor is trained in a single-task setup. i-vector, x-vector and ComParE features are explored. Results are obtained on systems trained on the DementiaBank dataset and tested on an in-house dataset as well as the ADReSS dataset. The results show that both the age and MMSE estimation is improved by applying multitask learning, with state-of-the-art results achieved on the ADReSS dataset acoustic-only task.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP39728.2021.9414642","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

Abstract

Speech is a common physiological signal that can be affected by both ageing and cognitive decline. Often the effect can be confounding, as would be the case for people at, e.g., very early stages of cognitive decline due to dementia. Despite this, the automatic predictions of age and cognitive decline based on cues found in the speech signal are generally treated as two separate tasks. In this paper, multi-task learning is applied for the joint estimation of age and the Mini-Mental Status Evaluation criteria (MMSE) commonly used to assess cognitive decline. To explore the relationship between age and MMSE, two neural network architectures are evaluated: a SincNet-based end-to-end architecture, and a system comprising of a feature extractor followed by a shallow neural network. Both are trained with single-task or multi-task targets. To compare, an SVM-based regressor is trained in a single-task setup. i-vector, x-vector and ComParE features are explored. Results are obtained on systems trained on the DementiaBank dataset and tested on an in-house dataset as well as the ADReSS dataset. The results show that both the age and MMSE estimation is improved by applying multitask learning, with state-of-the-art results achieved on the ADReSS dataset acoustic-only task.

查看原文本刊更多论文

多任务估计年龄和言语认知衰退

语言是一种常见的生理信号，会受到衰老和认知能力下降的影响。通常情况下，这种影响是令人困惑的，例如，对于那些由于痴呆症而认知能力下降的非常早期阶段的人来说。尽管如此，基于语音信号中发现的线索对年龄和认知能力下降的自动预测通常被视为两个独立的任务。本文将多任务学习应用于年龄的联合估计和常用的认知衰退评估标准——最小心智状态评估标准(MMSE)。为了探索年龄和MMSE之间的关系，我们评估了两种神经网络架构:基于sincnet的端到端架构，以及由特征提取器和浅神经网络组成的系统。两者都接受过单任务或多任务目标的训练。为了进行比较，在单任务设置中训练基于svm的回归器。探索了i向量、x向量和ComParE特征。在DementiaBank数据集上训练的系统上获得结果，并在内部数据集和address数据集上进行测试。结果表明，通过应用多任务学习，年龄和MMSE估计都得到了改善，在address数据集声学任务上取得了最先进的结果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

自引率

0.00%

发文量