The First Cadenza Challenges: Using Machine Learning Competitions to Improve Music for Listeners With a Hearing Loss

Impact factor: 2.9 | JCR Q2 (Engineering, Electrical & Electronic)
Gerardo Roa-Dabike;Michael A. Akeroyd;Scott Bannister;Jon P. Barker;Trevor J. Cox;Bruno Fazenda;Jennifer Firth;Simone Graetzer;Alinka Greasley;Rebecca R. Vos;William M. Whitmer
IEEE Open Journal of Signal Processing, vol. 6, pp. 722-734. Published 2025-06-10.
DOI: 10.1109/OJSP.2025.3578299
https://ieeexplore.ieee.org/document/11030066/
Citations: 0

Abstract

Listening to music can be an issue for those with a hearing impairment, and hearing aids are not a universal solution. This paper details the first use of an open challenge methodology to improve the audio quality of music for those with hearing loss through machine learning. The first challenge (CAD1) had 9 participants. The second was a 2024 ICASSP grand challenge (ICASSP24), which attracted 17 entrants. The challenge tasks concerned demixing and remixing pop/rock music to allow a personalized rebalancing of the instruments in the mix, along with amplification to correct for raised hearing thresholds. The software baselines provided for entrants to build upon used two state-of-the-art demix algorithms: Hybrid Demucs and Open-Unmix. Objective evaluation used HAAQI, the Hearing-Aid Audio Quality Index. No entries improved on the best baseline in CAD1. It is suggested that this arose because demixing algorithms are relatively mature, and recent work has shown that access to large (private) datasets is needed to further improve performance. Learning from this, for ICASSP24 the scenario was made more difficult by using loudspeaker reproduction and specifying gains to be applied before remixing. This also made the scenario more useful for listening through hearing aids. Nine entrants scored better than the best ICASSP24 baseline. Most of the entrants used a refined version of Hybrid Demucs and NAL-R amplification. The highest scoring system combined the outputs of several demixing algorithms in an ensemble approach. These challenges are now open benchmarks for future research with freely available software and data.
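
The rebalancing and ensemble steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the challenge baseline or any entrant's system: the function names `remix` and `ensemble_average`, the mono list-of-samples representation, and the choice of a plain sample-wise mean as the ensemble rule are all assumptions made for illustration. The abstract only states that gains are applied before remixing and that the top system combined several demixing outputs.

```python
from typing import Dict, List, Sequence


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def remix(stems: Dict[str, Sequence[float]],
          gains_db: Dict[str, float]) -> List[float]:
    """Scale each separated stem by its gain and sum into one mono signal.

    Hypothetical helper: stems missing from gains_db stay at 0 dB.
    """
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same number of samples")
    mix = [0.0] * lengths.pop()
    for name, samples in stems.items():
        factor = db_to_linear(gains_db.get(name, 0.0))
        for i, sample in enumerate(samples):
            mix[i] += factor * sample
    return mix


def ensemble_average(estimates: Sequence[Sequence[float]]) -> List[float]:
    """Average several estimates of the same stem, sample by sample.

    Assumed combination rule; the paper's ensemble may differ.
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    return [sum(column) / len(estimates) for column in zip(*estimates)]


if __name__ == "__main__":
    # Two hypothetical separators each produce a vocals estimate.
    vocals = ensemble_average([[0.10, 0.20], [0.12, 0.18]])
    stems = {"vocals": vocals, "drums": [0.05, 0.05]}
    # Listener-specific gains: boost vocals by 3 dB, cut drums by 6 dB.
    print(remix(stems, {"vocals": 3.0, "drums": -6.0}))
```

In a real pipeline the remixed signal would then pass through a hearing-loss-dependent amplification stage such as NAL-R and be scored with HAAQI; both are omitted here because the abstract gives no details of how they were configured.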
Source journal metrics: CiteScore 5.30; self-citation rate 0.00%; articles published 0; review time 22 weeks.