SALT: Standardized Audio event Label Taxonomy

arXiv - EE - Audio and Speech Processing | Pub Date: 2024-09-18 | DOI: arxiv-2409.11746

Paraskevas Stamatiadis (IDS, S2A, LTCI), Michel Olvera (IDS, S2A, LTCI), Slim Essid (IDS, S2A, LTCI)



Abstract



Machine listening systems often rely on fixed taxonomies to organize and label audio data, which is key for training and evaluating deep neural networks (DNNs) and other supervised algorithms. However, such taxonomies face significant constraints: they are composed of application-dependent predefined categories, which hinders the integration of new or varied sounds, and they exhibit limited cross-dataset compatibility due to inconsistent labeling standards. To overcome these limitations, we introduce SALT: Standardized Audio event Label Taxonomy. Building upon the hierarchical structure of AudioSet's ontology, our taxonomy extends and standardizes labels across 24 publicly available environmental sound datasets, allowing the mapping of class labels from diverse datasets to a unified system. Our proposal comes with a new Python package designed for navigating and utilizing this taxonomy, easing cross-dataset label searching and hierarchical exploration. Notably, our package allows effortless data aggregation from diverse sources, hence easy experimentation with combined datasets.
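The idea of mapping dataset-specific class labels onto one hierarchical taxonomy and then aggregating data across datasets can be illustrated with a minimal sketch. This is not the API of the SALT package, which the abstract does not describe. The toy hierarchy, the dataset names (`dataset_a`, `dataset_b`), the raw labels, and the helper functions `ancestors`, `standardize`, and `aggregate` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy hierarchy: child label -> parent label (None marks a root).
# It only mimics the shape of a hierarchical ontology; it is not SALT's taxonomy.
PARENT = {
    "Animal": None,
    "Dog": "Animal",
    "Bark": "Dog",
    "Vehicle": None,
    "Car": "Vehicle",
    "Car horn": "Car",
}

# Hypothetical mapping from (dataset, raw label) to a standardized label.
LABEL_MAP = {
    ("dataset_a", "dog_bark"): "Bark",
    ("dataset_b", "dog"): "Dog",
    ("dataset_a", "car_horn"): "Car horn",
    ("dataset_b", "car"): "Car",
}


def ancestors(label):
    """Return the path from a standardized label up to its root, inclusive."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path


def standardize(dataset, raw_label):
    """Map a dataset-specific label to its standardized counterpart."""
    return LABEL_MAP[(dataset, raw_label)]


def aggregate(clips, target_label):
    """Collect clips whose standardized label is target_label or a descendant."""
    selected = []
    for dataset, raw_label, clip_id in clips:
        if target_label in ancestors(standardize(dataset, raw_label)):
            selected.append((dataset, clip_id))
    return selected


# Hypothetical clip records: (dataset, raw label, clip identifier).
clips = [
    ("dataset_a", "dog_bark", "a_001"),
    ("dataset_b", "dog", "b_017"),
    ("dataset_a", "car_horn", "a_002"),
    ("dataset_b", "car", "b_020"),
]

dog_clips = aggregate(clips, "Dog")
vehicle_clips = aggregate(clips, "Vehicle")
```

Querying at a coarser node such as "Vehicle" gathers clips from both toy datasets even though their raw labels differ, which is the kind of cross-dataset aggregation the abstract refers to.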
