ToyADMOS2 dataset: Another dataset of miniature-machine operating sounds for anomalous sound detection under domain shift conditions
N. Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, Shoichiro Saito
DOI: 10.5281/ZENODO.4580270 (https://doi.org/10.5281/ZENODO.4580270)
arXiv: Audio and Speech Processing, 2021-06-03

Abstract: This paper proposes ToyADMOS2, a new large-scale dataset for anomaly detection in machine operating sounds (ADMOS). As with our previous ToyADMOS dataset, we collected a large number of operating sounds of miniature machines (toys) under normal and anomalous conditions, producing the anomalies by deliberately damaging the machines; ToyADMOS2 extends the earlier dataset by providing controlled depths of damage in the anomalous samples. Since typical ADMOS application scenarios require robust performance under domain-shift conditions, the ToyADMOS2 dataset is designed for evaluating systems under such conditions. The released dataset consists of two sub-datasets for machine-condition inspection: fault diagnosis of machines with geometrically fixed tasks and fault diagnosis of machines with moving tasks. Domain shifts are represented by introducing several differences in operating conditions, such as the same machine type with different machine models and part configurations, different operating speeds, and different microphone arrangements. Each sub-dataset contains over 27k samples of normal machine-operating sounds and over 8k samples of anomalous sounds, recorded with five to eight microphones. The dataset is freely available for download at this https URL and this https URL.
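The ADMOS task the abstract describes — deciding whether a machine-operating sound deviates from its normal profile — can be illustrated with a minimal spectral-distance baseline. This is a toy sketch, not the authors' evaluation method: the feature (average log-magnitude spectrum) and the scoring function (Euclidean distance to the mean of normal samples) are illustrative assumptions, standing in for the log-mel features and learned models typically used in ADMOS baselines.

```python
import numpy as np

def avg_log_spectrum(clip, n_fft=256):
    # Average log-magnitude spectrum over non-overlapping frames:
    # a crude stand-in for the log-mel features common in ADMOS work.
    frames = [clip[i:i + n_fft] for i in range(0, len(clip) - n_fft + 1, n_fft)]
    window = np.hanning(n_fft)
    mags = [np.abs(np.fft.rfft(f * window)) for f in frames]
    return np.log(np.mean(mags, axis=0) + 1e-8)

def fit_normal_model(normal_clips):
    # Model "normal" simply as the mean feature vector of normal samples.
    feats = np.stack([avg_log_spectrum(c) for c in normal_clips])
    return feats.mean(axis=0)

def anomaly_score(clip, normal_mean):
    # Distance from the normal profile; higher means more anomalous.
    return float(np.linalg.norm(avg_log_spectrum(clip) - normal_mean))
```

For example, fitting the model on several recordings of a machine running normally and scoring a clip with an unusual spectrum (a different pitch, extra broadband noise) yields a markedly higher score than scoring another normal clip — the kind of separation an ADMOS system is evaluated on, here without any domain shift.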
Clarity: machine learning challenges to revolutionise hearing device processing
S. Graetzer, M. Akeroyd, J. Barker, T. Cox, J. Culling, G. Naylor, Eszter Porter, R. V. Muñoz
DOI: 10.48465/FA.2020.0198 (https://doi.org/10.48465/FA.2020.0198)
arXiv: Audio and Speech Processing, 2020-06-19

Abstract: In the Clarity project, we will run a series of machine learning challenges to revolutionise speech processing for hearing devices. Over five years, there will be three paired challenges. Each pair will consist of a competition focussed on hearing-device processing ("enhancement") and another focussed on speech perception modelling ("prediction"). The enhancement challenges will deliver new and improved approaches to hearing-device signal processing for speech. The parallel prediction challenges will develop and improve methods for predicting speech intelligibility and quality for hearing-impaired listeners. To facilitate the challenges, we will generate open-access datasets, models and infrastructure. These will include: (1) tools for generating realistic test/training materials for different listening scenarios; (2) baseline models of hearing impairment; (3) baseline models of hearing-device processing; (4) baseline models of speech perception; and (5) databases of speech perception in noise. The databases will include the results of listening tests that characterise how hearing-impaired listeners perceive speech in noise. We will also provide a comprehensive characterisation of each listener's hearing ability. The provision of open-access datasets, models and infrastructure will allow other researchers to develop algorithms for speech and hearing aid processing. In addition, it will lower barriers that prevent researchers from considering hearing impairment. In round one, speech will occur in the context of a living room, i.e., a moderately reverberant room with minimal (non-speech) background noise. Entries can be submitted to either the enhancement or prediction challenge, or both. We expect to open the beta version of round one in October for a full opening in November 2020, with a closing date in June 2021 and results in October 2021. This Engineering and Physical Sciences Research Council (EPSRC) funded project involves researchers from the Universities of Sheffield, Salford, Nottingham and Cardiff in conjunction with the Hearing Industry Research Consortium, Action on Hearing Loss, Amazon, and Honda. To register interest in the challenges, go to www.claritychallenge.org/.