Clarity: machine learning challenges to revolutionise hearing device processing

arXiv: Audio and Speech Processing. Pub Date: 2020-06-19. DOI: 10.48465/FA.2020.0198

S. Graetzer, M. Akeroyd, J. Barker, T. Cox, J. Culling, G. Naylor, Eszter Porter, R. V. Muñoz


Citations: 5

Abstract

In the Clarity project, we will run a series of machine learning challenges to revolutionise speech processing for hearing devices. Over five years, there will be three paired challenges. Each pair will consist of a competition focussed on hearing-device processing (“enhancement”) and another focussed on speech perception modelling (“prediction”). The enhancement challenges will deliver new and improved approaches for hearing-device signal processing for speech. The parallel prediction challenges will develop and improve methods for predicting speech intelligibility and quality for hearing-impaired listeners. To facilitate the challenges, we will generate open-access datasets, models and infrastructure. These will include: (1) tools for generating realistic test/training materials for different listening scenarios; (2) baseline models of hearing impairment; (3) baseline models of hearing-device processing; (4) baseline models of speech perception; and (5) databases of speech perception in noise. The databases will include the results of listening tests that characterise how hearing-impaired listeners perceive speech in noise. We will also provide a comprehensive characterisation of each listener's hearing ability. The provision of open-access datasets, models and infrastructure will allow other researchers to develop algorithms for speech and hearing aid processing. In addition, it will lower barriers that prevent researchers from considering hearing impairment. In round one, speech will occur in the context of a living room, i.e., a moderately reverberant room with minimal (non-speech) background noise. Entries can be submitted to either the enhancement or prediction challenges, or both. We expect to open the beta version of round one in October, with a full opening in November 2020, a closing date in June 2021 and results in October 2021. This project, funded by the Engineering and Physical Sciences Research Council (EPSRC), involves researchers from the Universities of Sheffield, Salford, Nottingham and Cardiff in conjunction with the Hearing Industry Research Consortium, Action on Hearing Loss, Amazon, and Honda. To register interest in the challenges, go to www.claritychallenge.org/.

