Jincheng Zhang, Baojun Wang, W. Shi, Jucai Lin, Jun Yin
2021 International Symposium on Electrical, Electronics and Information Engineering. Published 2021-02-19. DOI: 10.1145/3459104.3459174 (https://doi.org/10.1145/3459104.3459174)
Hazardous Sound Detection Based on Audio Augmentation
The aim of surveillance is to detect dangerous events. With the widespread use of deep learning, video surveillance has improved dramatically. For audio event detection in surveillance, deep learning methods are likewise applied to the hazardous sound classification task. However, because dangerous sounds occur rarely and are costly to collect, no corresponding large-scale dataset exists, and a large-scale dataset is essential for deep learning methods to achieve good results. How to obtain richer audio events has therefore become an urgent problem. Researchers have used a variety of data augmentation methods in computer vision, with clear performance improvements. These approaches are gradually being adopted in sound pattern recognition and automatic speech recognition (ASR), but there is little research on classifying hazardous sounds with small datasets. In this paper, various data augmentation methods are adopted for hazardous sound classification. Our results show that data augmentation brings a large improvement on all four class datasets. Classification accuracy increases by 0.5% on average. As the scale of data augmentation increases, the gain in classification accuracy rises to about 1.5%.
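The abstract does not name the augmentation methods that were used. Purely as an illustration, the sketch below shows three waveform-level augmentations that are common in audio work (circular time shift, gain scaling, and additive white noise) and how several augmented copies of one clip might be generated. The function names, parameter ranges, and the synthetic test tone are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise at roughly the requested signal-to-noise ratio (dB)."""
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def time_shift(samples, shift):
    """Circularly shift the waveform by `shift` samples (positive moves audio later)."""
    shift %= len(samples)
    return samples[-shift:] + samples[:-shift] if shift else list(samples)


def scale_gain(samples, gain_db):
    """Scale the amplitude by a gain expressed in decibels."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


def augment(samples, copies, seed=0):
    """Return `copies` randomly augmented variants of one clip (hypothetical ranges)."""
    rng = random.Random(seed)
    out = []
    for _ in range(copies):
        x = time_shift(samples, rng.randrange(len(samples)))
        x = scale_gain(x, rng.uniform(-6.0, 6.0))      # assumed range: +/- 6 dB
        x = add_noise(x, rng.uniform(10.0, 30.0), rng)  # assumed range: 10-30 dB SNR
        out.append(x)
    return out


# Synthetic stand-in for a recorded clip: a 440 Hz tone, 0.1 s at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]
augmented = augment(tone, copies=4)
print(len(augmented), len(augmented[0]))  # prints: 4 800
```

Raising `copies` corresponds to increasing the scale of augmentation. Each variant keeps the original clip's label, so a small set of hazardous-sound recordings can be expanded without new collection.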