Recognition of sounds using square Cauchy mixture distribution
A. Ito
2016 IEEE International Conference on Signal and Image Processing (ICSIP), 2016-08-01
DOI: 10.1109/SIPROCESS.2016.7888359
This paper proposes a new probability density, the “square Cauchy mixture distribution”, for sound recognition. The proposed density is based on the Cauchy distribution, modified so that it has a finite mean and variance. Because it can be evaluated using only simple arithmetic operations, it can be computed faster than the Gaussian mixture model (GMM). In addition to the definition of the distribution, a parameter estimation method based on gradient descent is described. Two experiments were conducted: recognition of environmental sounds and identification of the singer from a singing voice. The results showed that the proposed method was 10% to 15% faster than the GMM with the addlog operation, with comparable recognition performance.
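The abstract does not give the density itself, so the following is only a minimal sketch under a stated assumption: that a one-dimensional "square Cauchy" component is the Cauchy kernel squared and renormalized, f(x) = (2 / (πγ)) · 1 / (1 + ((x − μ)/γ)²)². Under that assumption the mean is μ and the variance is γ², and evaluation needs no `exp` or `log`. The function names `sq_cauchy_pdf`, `sq_cauchy_mixture_pdf`, and `integrate` are hypothetical and are not taken from the paper, whose exact definition may differ.

```python
import math


def sq_cauchy_pdf(x, mu, gamma):
    """Assumed 1-D square Cauchy density: squared Cauchy kernel, renormalized.

    This form is an assumption for illustration, not the paper's definition.
    """
    z = (x - mu) / gamma
    k = 1.0 / (1.0 + z * z)          # Cauchy kernel, arithmetic only
    return (2.0 / (math.pi * gamma)) * k * k


def sq_cauchy_mixture_pdf(x, weights, mus, gammas):
    """Weighted sum of assumed square Cauchy components (weights sum to 1)."""
    return sum(w * sq_cauchy_pdf(x, m, g)
               for w, m, g in zip(weights, mus, gammas))


def integrate(f, lo, hi, n=200_000):
    """Midpoint-rule numerical integration, used only to sanity-check the pdf."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    # Check normalization and variance of a single component numerically.
    total = integrate(lambda x: sq_cauchy_pdf(x, 0.0, 1.0), -1000.0, 1000.0)
    var = integrate(lambda x: x * x * sq_cauchy_pdf(x, 0.0, 1.0), -1000.0, 1000.0)
    print("integral of pdf:", total)
    print("variance estimate:", var)
```

In this sketch each component costs a few multiplications, one division, and one addition, and the mixture is summed directly in the linear domain. A GMM scored in the log domain typically needs an `exp`/`log` step (the addlog operation the abstract mentions) to combine components, which is the cost the abstract's speed comparison refers to.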