Towards robust heart failure detection in digital telephony environments by utilizing transformer-based codec inversion

IF 2.4 · Zone 3 (Computer Science) · JCR Q2 (Acoustics)
Saska Tirronen, Farhad Javanmardi, Hilla Pohjalainen, Sudarsana Reddy Kadiri, Kiran Reddy Mittapalle, Pyry Helkkula, Kasimir Kaitue, Mikko Minkkinen, Heli Tolppanen, Tuomo Nieminen, Paavo Alku
Speech Communication, vol. 173, Article 103279. Published 2025-07-15. DOI: 10.1016/j.specom.2025.103279
https://www.sciencedirect.com/science/article/pii/S0167639325000949

Abstract

This study introduces the Codec Transformer Network (CTN) to enhance the reliability of automatic heart failure (HF) detection from coded telephone speech by addressing codec-related challenges in digital telephony. The study specifically addresses the codec mismatch between training and inference in HF detection. CTN is designed to map the mel-spectrogram representations of encoded speech signals back to their original, non-encoded forms, thereby recovering HF-related discriminative information. The effectiveness of CTN is demonstrated in conjunction with three HF detectors based on Support Vector Machine, Random Forest, and K-Nearest Neighbors classifiers. The results show that CTN effectively retrieves the discriminative information between patients and controls, and performs comparably to or better than a baseline approach based on multi-condition training.
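The three conditions contrasted in the abstract (codec mismatch, an inversion front end, and multi-condition training) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the features are synthetic Gaussian vectors rather than mel-spectrograms, the `codec` function is a hypothetical per-band gain and offset rather than a real telephony codec, and a per-band least-squares line stands in for the transformer-based CTN. Only the k-nearest-neighbours detector corresponds to a classifier type named in the abstract.

```python
import random

random.seed(0)
DIM = 4  # toy stand-in for the number of mel bands


def sample(label, n):
    """Draw n toy 'uncoded' feature vectors; patients (1) are shifted from controls (0)."""
    mean = 2.0 if label == 1 else 0.0
    return [([random.gauss(mean, 0.5) for _ in range(DIM)], label) for _ in range(n)]


def codec(x):
    """Hypothetical codec distortion: per-band gain, offset, and a little noise."""
    return [0.5 * v + 3.0 + random.gauss(0.0, 0.05) for v in x]


def knn_predict(train, x, k=3):
    """Plain k-nearest-neighbours majority vote using squared Euclidean distance."""
    nearest = sorted(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, test):
    """Fraction of test vectors whose predicted label matches the true label."""
    return sum(knn_predict(train, x) == y for x, y in test) / len(test)


def fit_inverse(coded, clean):
    """Fit one least-squares line per band mapping coded values back to uncoded ones."""
    params = []
    for d in range(DIM):
        xs = [c[d] for c in coded]
        ys = [u[d] for u in clean]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
        params.append((slope, my - slope * mx))
    return params


def invert(params, x):
    """Apply the fitted per-band lines to one coded feature vector."""
    return [s * v + b for (s, b), v in zip(params, x)]


train_clean = sample(0, 100) + sample(1, 100)
test_clean = sample(0, 50) + sample(1, 50)
train_coded = [(codec(x), y) for x, y in train_clean]
test_coded = [(codec(x), y) for x, y in test_clean]

# 1) Codec mismatch: detector trained on uncoded features, tested on coded ones.
acc_mismatch = accuracy(train_clean, test_coded)

# 2) Inversion front end: learn coded -> uncoded on paired training data,
#    then reuse the detector that was trained on uncoded features.
params = fit_inverse([x for x, _ in train_coded], [x for x, _ in train_clean])
test_inverted = [(invert(params, x), y) for x, y in test_coded]
acc_inverted = accuracy(train_clean, test_inverted)

# 3) Multi-condition training baseline: detector trained on uncoded and coded features.
acc_multi = accuracy(train_clean + train_coded, test_coded)

print(f"mismatch={acc_mismatch:.2f} inverted={acc_inverted:.2f} multi-condition={acc_multi:.2f}")
```

The sketch only shows why a train/test codec mismatch can hurt a detector and how an inversion step or multi-condition training can each compensate. It says nothing about the relative performance reported in the paper.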
Source journal
Speech Communication (Engineering and Technology, Computer Science: Interdisciplinary Applications)
CiteScore: 6.80
Self-citation rate: 6.20%
Articles published: 94
Review time: 19.2 weeks
About the journal: Speech Communication is an interdisciplinary journal whose primary objective is to fulfil the need for the rapid dissemination and thorough discussion of basic and applied research results. The journal's primary objectives are:
• to present a forum for the advancement of human and human-machine speech communication science;
• to stimulate cross-fertilization between different fields of this domain;
• to contribute towards the rapid and wide diffusion of scientifically sound contributions in this domain.