Sascha Grollmisch, Estefanía Cano, Hanna Lukashevich, Jakob Abeßer
Science Talks, Volume 10, Article 100364. Published 2024-06-01. DOI: 10.1016/j.sctalk.2024.100364. https://www.sciencedirect.com/science/article/pii/S2772569324000720
A novel extension of FixMatch using uncertainty for semi-supervised audio classification
Semi-supervised learning (SSL) is commonly used when annotated data is scarce but unlabeled data is readily available. In recent years, SSL has advanced considerably in computer vision, and methods such as FixMatch have been successfully adapted to audio classification tasks. However, a gap remains between SSL methods and fully supervised baselines trained with all labels available. In this work, we first investigate the quality of the pseudo-labels, i.e., the labels generated for unlabeled data, for musical instrument family classification and acoustic scene classification. Based on these insights, we propose and evaluate a novel extension of FixMatch that quantifies and takes into account the uncertainty of the pseudo-labels. Additionally, we highlight the problematic tradeoff between pseudo-label quality and quantity. Our results show that Monte-Carlo Dropout combined with temperature scaling improved the pseudo-label accuracy from 78.4% to 86.7% for instrument family classification and from 87.9% to 89.9% for acoustic scene classification. Although the accuracy on the test sets improved from 71.0% to 72.1% and from 69.2% to 70.8%, respectively, a gap to the fully supervised baseline remains, leaving room for future work.
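The abstract does not spell out how the uncertainty estimate enters the pseudo-labeling step, so the following is only a minimal sketch of one plausible reading: class probabilities from several stochastic forward passes (dropout left active) are temperature-scaled, averaged, and the resulting pseudo-label is kept only if its mean confidence clears a threshold. The function names, the temperature of 2.0, and the threshold of 0.9 are illustrative assumptions, not values from the paper, and the logits are passed in directly instead of coming from a real model.

```python
import math
from typing import List, Optional, Tuple


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; a temperature above 1 flattens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mc_dropout_pseudo_label(
    passes: List[List[float]],
    temperature: float = 2.0,  # assumed value, for illustration only
    threshold: float = 0.9,    # assumed value, for illustration only
) -> Tuple[Optional[int], float]:
    """Return (pseudo_label or None, mean confidence).

    `passes` holds the logits of one unlabeled example from several
    stochastic forward passes, i.e. with dropout kept active.
    """
    probs = [softmax(logits, temperature) for logits in passes]
    n_classes = len(probs[0])
    # Average the class probabilities over all stochastic passes.
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    confidence = max(mean)
    label = mean.index(confidence)
    # Keep the pseudo-label only if the averaged prediction is confident enough.
    return (label if confidence >= threshold else None), confidence


# Passes that agree yield a pseudo-label; passes that disagree are rejected.
agreeing = [[8.0, 0.0, 0.0]] * 3
disagreeing = [[8.0, 0.0, 0.0], [0.0, 8.0, 0.0], [8.0, 0.0, 0.0]]
print(mc_dropout_pseudo_label(agreeing))
print(mc_dropout_pseudo_label(disagreeing))
```

In this sketch, raising the threshold or the temperature rejects more unlabeled examples, which is one way to see the quality versus quantity tradeoff the abstract mentions: fewer pseudo-labels survive, but the surviving ones are more likely to be correct.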