{"title":"机器听力系统声学特征提取的元分析","authors":"Ricardo A. Catanghal, T. Palaoag, C. Dayagdag","doi":"10.1145/3316615.3316664","DOIUrl":null,"url":null,"abstract":"Generally, the concentration of the study and research in the understanding of sounds revolves around the speech and music area, on the contrary, there are few in environmental and non-speech recognition. This paper carries out a meta-analysis of the acoustic transformation and feature set extraction of the environmental sound raw signal form into a parametric type representation in handling analysis, perception, and labeling for audio analysis of sound identification systems. We evaluated and analyzed the various contemporary methods and feature algorithms surveyed for the acoustic identification and perception of surrounding sounds, the Gammatone spectral coefficients (GSTC) and Mel Filterbank (FBEs) then the acoustic signal classification the Convolutional Neural Network (ConvNet) was applied. The outcome demonstrates that GSTC accomplished better as a feature in contrast to FBEs, but FBEs tend to improve performance when merge or incorporated with other feature. The analysis demonstrates that merging or incorporating with other features set is encouraging in achieving a much better accuracy in contrast to a single feature in classifying environmental sounds that is useful in the advancement of the intelligent machine listening frameworks.","PeriodicalId":268392,"journal":{"name":"Proceedings of the 2019 8th International Conference on Software and Computer Applications","volume":"24 12","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Meta-Analysis of Acoustic Feature Extraction for Machine Listening Systems\",\"authors\":\"Ricardo A. Catanghal, T. Palaoag, C. Dayagdag\",\"doi\":\"10.1145/3316615.3316664\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Generally, the concentration of the study and research in the understanding of sounds revolves around the speech and music area, on the contrary, there are few in environmental and non-speech recognition. This paper carries out a meta-analysis of the acoustic transformation and feature set extraction of the environmental sound raw signal form into a parametric type representation in handling analysis, perception, and labeling for audio analysis of sound identification systems. We evaluated and analyzed the various contemporary methods and feature algorithms surveyed for the acoustic identification and perception of surrounding sounds, the Gammatone spectral coefficients (GSTC) and Mel Filterbank (FBEs) then the acoustic signal classification the Convolutional Neural Network (ConvNet) was applied. The outcome demonstrates that GSTC accomplished better as a feature in contrast to FBEs, but FBEs tend to improve performance when merge or incorporated with other feature. 
The analysis demonstrates that merging or incorporating with other features set is encouraging in achieving a much better accuracy in contrast to a single feature in classifying environmental sounds that is useful in the advancement of the intelligent machine listening frameworks.\",\"PeriodicalId\":268392,\"journal\":{\"name\":\"Proceedings of the 2019 8th International Conference on Software and Computer Applications\",\"volume\":\"24 12\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-02-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2019 8th International Conference on Software and Computer Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3316615.3316664\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2019 8th International Conference on Software and Computer Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3316615.3316664","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Meta-Analysis of Acoustic Feature Extraction for Machine Listening Systems
Research on sound understanding has generally concentrated on speech and music; by contrast, comparatively little work addresses environmental and non-speech sound recognition. This paper presents a meta-analysis of the acoustic transformation and feature extraction methods that convert raw environmental sound signals into parametric representations for the analysis, perception, and labeling stages of sound identification systems. We evaluated and analyzed contemporary methods and feature algorithms for the acoustic identification and perception of surrounding sounds, namely Gammatone spectral coefficients (GSTC) and Mel filterbank energies (FBEs), and applied a Convolutional Neural Network (ConvNet) for acoustic signal classification. The results show that GSTC performs better as a standalone feature than FBEs, but FBEs tend to improve performance when merged or combined with other features. The analysis indicates that combining feature sets is a promising way to achieve higher accuracy than any single feature when classifying environmental sounds, which is useful for the advancement of intelligent machine listening frameworks.
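The abstract describes a pipeline of filterbank-style feature extraction followed by ConvNet classification, with feature fusion improving accuracy. The sketch below is an illustration only, not the authors' implementation: it shows how log-mel filterbank energies (FBEs) could be computed with librosa and passed to a small 2-D ConvNet in PyTorch. All names and hyperparameters (n_mels, kernel sizes, number of classes) are assumptions, and a gammatone-based representation of matching shape could be stacked as a second input channel to approximate the fusion variant discussed in the paper.

```python
# Hypothetical sketch: log-mel filterbank energies (FBEs) fed to a small ConvNet.
# Assumes numpy, librosa, and torch are installed; hyperparameters are illustrative,
# not the configuration used in the paper.
import numpy as np
import librosa
import torch
import torch.nn as nn

def log_mel_fbe(y, sr, n_mels=64, n_fft=1024, hop_length=512):
    """Log-scaled mel filterbank energies of a mono waveform."""
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    return librosa.power_to_db(mel, ref=np.max)  # shape: (n_mels, n_frames)

class SmallConvNet(nn.Module):
    """Minimal 2-D ConvNet over time-frequency feature maps."""
    def __init__(self, n_classes, in_channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling handles variable-length clips
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):  # x: (batch, channels, n_mels, n_frames)
        return self.classifier(self.features(x).flatten(1))

if __name__ == "__main__":
    sr = 22050
    y = np.random.randn(sr * 4).astype(np.float32)  # stand-in for a 4 s environmental clip
    fbe = log_mel_fbe(y, sr)
    x = torch.from_numpy(fbe.astype(np.float32)).unsqueeze(0).unsqueeze(0)  # (1, 1, n_mels, n_frames)
    # Feature fusion could stack a second representation (e.g. a gammatone spectrogram
    # resampled to the same shape) as an extra channel, i.e. in_channels=2.
    model = SmallConvNet(n_classes=10, in_channels=1)
    print(model(x).shape)  # torch.Size([1, 10])
```

Stacking features as input channels is one common fusion strategy for spectrogram-like inputs; concatenating learned embeddings before the classifier is another. The paper's findings suggest only that some form of feature combination tends to outperform a single feature, not a specific fusion mechanism.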