SALT: Standardized Audio event Label Taxonomy
Authors: Paraskevas Stamatiadis, Michel Olvera, Slim Essid (IDS, S2A, LTCI)
arXiv - EE - Audio and Speech Processing, 2024-09-18
DOI: arxiv-2409.11746 (https://doi.org/arxiv-2409.11746)
Citations: 0
Abstract
Machine listening systems often rely on fixed taxonomies to organize and
label audio data, which is key for training and evaluating deep neural networks
(DNNs) and other supervised algorithms. However, such taxonomies face significant
constraints: they are composed of application-dependent predefined categories,
which hinders the integration of new or varied sounds, and they exhibit limited
cross-dataset compatibility due to inconsistent labeling standards. To overcome
these limitations, we introduce SALT: Standardized Audio event Label Taxonomy.
Building upon the hierarchical structure of AudioSet's ontology, our taxonomy
extends and standardizes labels across 24 publicly available environmental
sound datasets, allowing the mapping of class labels from diverse datasets to a
unified system. Our proposal comes with a new Python package designed for
navigating and utilizing this taxonomy, easing cross-dataset label searching
and hierarchical exploration. Notably, our package simplifies data
aggregation from diverse sources, which makes it easy to experiment with
combined datasets.
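
To make the idea of mapping dataset-specific labels onto one hierarchical taxonomy concrete, here is a minimal sketch in plain Python. It does not use the SALT package or its API. The hierarchy, the dataset names (`dataset_a`, `dataset_b`), the label strings, and the helper functions are all hypothetical and chosen only for illustration.

```python
# Toy hierarchy stored as child -> parent (None marks a root).
# These labels are illustrative assumptions, not SALT's actual taxonomy.
PARENT = {
    "animal": None,
    "dog": "animal",
    "dog_bark": "dog",
    "vehicle": None,
    "car": "vehicle",
    "car_horn": "car",
}

# Hypothetical mapping from (dataset, original label) to a unified label.
DATASET_TO_UNIFIED = {
    ("dataset_a", "dog_bark"): "dog_bark",
    ("dataset_b", "Dog"): "dog",
    ("dataset_a", "car_horn"): "car_horn",
    ("dataset_b", "Honk"): "car_horn",
}


def ancestors(label):
    """Return the path from a unified label up to its root, inclusive."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path


def to_unified(dataset, label):
    """Translate a dataset-specific label into the unified taxonomy."""
    return DATASET_TO_UNIFIED[(dataset, label)]


def aggregate(clips, node):
    """Keep clips, from any dataset, whose unified label is `node` or a descendant of it."""
    return [
        clip
        for clip in clips
        if node in ancestors(to_unified(clip["dataset"], clip["label"]))
    ]


clips = [
    {"id": 1, "dataset": "dataset_a", "label": "dog_bark"},
    {"id": 2, "dataset": "dataset_b", "label": "Dog"},
    {"id": 3, "dataset": "dataset_b", "label": "Honk"},
]

# Clips 1 and 2 come from different datasets with different label strings,
# yet both fall under the unified "dog" node.
dog_clips = aggregate(clips, "dog")
```

Storing the hierarchy as a child-to-parent map keeps ancestor lookups simple, so a query at a coarse node such as `"animal"` also collects clips labeled at finer levels in their source datasets.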