Adversarial-Free Speaker Identity-Invariant Representation Learning for Automatic Dysarthric Speech Classification

Parvaneh Janbakhshi, I. Kodrasi
{"title":"Adversarial-Free Speaker Identity-Invariant Representation Learning for Automatic Dysarthric Speech Classification","authors":"Parvaneh Janbakhshi, I. Kodrasi","doi":"10.21437/interspeech.2022-402","DOIUrl":null,"url":null,"abstract":"Speech representations which are robust to pathology-unrelated cues such as speaker identity information have been shown to be advantageous for automatic dysarthric speech classification. A recently proposed technique to learn speaker identity-invariant representations for dysarthric speech classification is based on adversarial training. However, adversarial training can be challenging, unstable, and sensitive to training parameters. To avoid adversarial training, in this paper we propose to learn speaker-identity invariant representations exploiting a feature separation framework relying on mutual information minimization. Experimental results on a database of neurotypical and dysarthric speech show that the proposed adversarial-free framework successfully learns speaker identity-invariant representations. Further, it is shown that such representations result in a similar dysarthric speech classification performance as the representations obtained using adversarial training, while the training procedure is more stable and less sensitive to training parameters.","PeriodicalId":73500,"journal":{"name":"Interspeech","volume":"1 1","pages":"2138-2142"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Interspeech","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/interspeech.2022-402","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Speech representations that are robust to pathology-unrelated cues, such as speaker identity, have been shown to be advantageous for automatic dysarthric speech classification. A recently proposed technique for learning speaker identity-invariant representations for dysarthric speech classification is based on adversarial training. However, adversarial training can be challenging, unstable, and sensitive to training parameters. To avoid adversarial training, in this paper we propose to learn speaker identity-invariant representations by exploiting a feature separation framework relying on mutual information minimization. Experimental results on a database of neurotypical and dysarthric speech show that the proposed adversarial-free framework successfully learns speaker identity-invariant representations. Further, such representations yield dysarthric speech classification performance similar to that of representations obtained using adversarial training, while the training procedure is more stable and less sensitive to training parameters.
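The abstract only outlines the approach, so the sketch below is a rough, hypothetical illustration of feature separation with a mutual information (MI) penalty in PyTorch, not the authors' implementation. The encoder architecture, the CLUB-style variational MI upper bound (Cheng et al., 2020), the speaker/pathology head layout, and all hyperparameters are assumptions introduced here for illustration.

```python
# Minimal sketch (assumed, not the paper's method): an encoder splits speech
# features into a pathology-related code z_p and a speaker-related code z_s,
# and a CLUB-style upper bound on I(z_p; z_s) is minimized so that z_p carries
# as little speaker information as possible -- without adversarial training.
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Splits an input feature vector into z_p (pathology) and z_s (speaker)."""

    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.to_zp = nn.Linear(256, latent_dim)
        self.to_zs = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = self.backbone(x)
        return self.to_zp(h), self.to_zs(h)


class CLUBEstimator(nn.Module):
    """Sampled CLUB upper bound on I(z_p; z_s); q(z_s|z_p) is a diagonal Gaussian."""

    def __init__(self, latent_dim: int):
        super().__init__()
        self.mu = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                nn.Linear(128, latent_dim))
        self.logvar = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                    nn.Linear(128, latent_dim))

    def log_likelihood(self, zp, zs):
        # Fits q(z_s|z_p) by plain maximum likelihood (no min-max game).
        mu, logvar = self.mu(zp), self.logvar(zp)
        return (-0.5 * (logvar + (zs - mu) ** 2 / logvar.exp())).sum(-1).mean()

    def mi_upper_bound(self, zp, zs):
        # E[log q(zs_i|zp_i)] - E[log q(zs_j|zp_i)]; shared log-var terms cancel.
        mu, logvar = self.mu(zp), self.logvar(zp)
        pos = -0.5 * ((zs - mu) ** 2 / logvar.exp()).sum(-1)        # matched pairs
        neg = -0.5 * ((zs.unsqueeze(0) - mu.unsqueeze(1)) ** 2
                      / logvar.exp().unsqueeze(1)).sum(-1).mean(1)  # shuffled pairs
        return (pos - neg).mean()


if __name__ == "__main__":
    B, IN_DIM, LAT, N_SPK = 16, 40, 32, 10          # all dimensions are made up
    enc, club = Encoder(IN_DIM, LAT), CLUBEstimator(LAT)
    dys_head = nn.Linear(LAT, 2)                    # dysarthric vs. neurotypical
    spk_head = nn.Linear(LAT, N_SPK)                # speaker ID, trained on z_s
    opt_main = torch.optim.Adam(
        list(enc.parameters()) + list(dys_head.parameters())
        + list(spk_head.parameters()), lr=1e-4)
    opt_mi = torch.optim.Adam(club.parameters(), lr=1e-4)
    ce = nn.CrossEntropyLoss()

    x = torch.randn(B, IN_DIM)                      # dummy speech features
    y_dys = torch.randint(0, 2, (B,))               # dummy pathology labels
    y_spk = torch.randint(0, N_SPK, (B,))           # dummy speaker labels

    # Step 1: fit q(z_s|z_p) on the current (detached) latent codes.
    with torch.no_grad():
        zp, zs = enc(x)
    opt_mi.zero_grad()
    (-club.log_likelihood(zp, zs)).backward()
    opt_mi.step()

    # Step 2: update encoder and heads; the MI penalty pushes speaker
    # information out of z_p without any gradient reversal.
    zp, zs = enc(x)
    loss = (ce(dys_head(zp), y_dys) + ce(spk_head(zs), y_spk)
            + 0.1 * club.mi_upper_bound(zp, zs))
    opt_main.zero_grad()
    loss.backward()
    opt_main.step()
    print(f"total loss: {loss.item():.4f}")
```

Because the variational network is fit by maximum likelihood and the encoder only penalizes the resulting bound, there is no adversarial min-max objective, which plausibly explains the stability advantage the abstract reports over gradient-reversal training.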