Environmental Sound Classification Algorithm Based on Adaptive Data Padding
Wei Qin, Bo Yin
2022 International Seminar on Computer Science and Engineering Technology (SCSET), 2022. DOI: 10.1109/scset55041.2022.00028
Environmental sound classification (ESC) is of practical importance in applications such as security monitoring and audio retrieval. However, the field still faces many problems, so performance in real-world scenarios often falls short of the ideal. To address the non-stationary nature of environmental sound and the strong interference of environmental noise, this paper proposes an ESC algorithm based on adaptive data padding. In this method, short raw audio clips are first filled using a random padding method; the raw audio is then converted into a log-mel spectrogram, which is fed into a neural network for training. The network structure is reorganized using incremental convolution kernels, and a Batch Normalization (BN) layer is applied after each convolutional layer for data normalization. Finally, the model is evaluated on the UrbanSound8K dataset, and the experimental results demonstrate the validity of the proposed model.
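The abstract does not spell out how the random padding step works, so the following is only a minimal sketch under one plausible reading: a clip shorter than a fixed target length is placed at a random offset inside a zero-filled buffer. The function name `random_pad`, the zero fill, and the truncation of long clips are all assumptions made for illustration and are not taken from the paper.

```python
import random
from typing import List, Optional, Sequence


def random_pad(samples: Sequence[float], target_len: int,
               rng: Optional[random.Random] = None) -> List[float]:
    """Place a short clip at a random offset inside a zero-filled buffer.

    Assumption: "random padding" is read here as random placement with
    zero fill; the abstract does not specify the exact scheme.
    """
    rng = rng or random.Random()
    n = len(samples)
    if n >= target_len:
        # Clips that are already long enough are truncated in this sketch.
        return list(samples[:target_len])
    # randint is inclusive on both ends, so the clip can sit flush
    # against either edge of the buffer.
    offset = rng.randint(0, target_len - n)
    tail = target_len - n - offset
    return [0.0] * offset + list(samples) + [0.0] * tail


# Hypothetical toy input: three samples padded to a length of eight.
clip = [0.1, -0.2, 0.3]
padded = random_pad(clip, target_len=8, rng=random.Random(0))
print(len(padded), padded)
```

Under this reading, every padded clip has the same length, so the spectrograms computed from them share one shape and can be batched for training. Because the offset is drawn at random, the same short clip can also appear at different positions across epochs.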