Anshul Thakur, V. Abrol, Pulkit Sharma, Padmanabhan Rajan
Published in: 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1-6, 2017-09-01. DOI: 10.1109/MLSP.2017.8168118 (https://doi.org/10.1109/MLSP.2017.8168118)
Rényi entropy based mutual information for semi-supervised bird vocalization segmentation
In this paper we describe a semi-supervised algorithm that segments bird vocalizations using matrix factorization and Rényi-entropy-based mutual information. Singular value decomposition (SVD) is applied to pooled time-frequency representations of bird vocalizations to learn basis vectors. A compact feature representation of the input test data is obtained using only a few of these bases. Rényi-entropy-based mutual information is then calculated between the feature representations of consecutive frames. After some simple post-processing, a threshold is used to reliably distinguish bird vocalizations from other sounds. The algorithm is evaluated on field recordings of different bird species under different SNR conditions. The results highlight the effectiveness of the proposed method in all SNR conditions, its improvement over other methods, and its generality.
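The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the histogram-based estimator, the additive form H(X) + H(Y) - H(X, Y) of Rényi mutual information, the order alpha = 2, the bin count, and the choice to mark frames above the threshold as vocalization are all hypothetical choices made here. The abstract does not specify any of them, and the post-processing step is omitted because it is not described.

```python
import numpy as np

def learn_bases(pooled_spec, k):
    """Learn k basis vectors from a pooled spectrogram of shape (n_freq, n_frames)."""
    U, _, _ = np.linalg.svd(pooled_spec, full_matrices=False)
    return U[:, :k]  # keep only the first k left singular vectors

def project(spec, bases):
    """Compact per-frame features of shape (k, n_frames); projection is an assumption."""
    return bases.T @ spec

def renyi_entropy(p, alpha=2.0):
    """Rényi entropy of order alpha (alpha != 1) of a discrete distribution p."""
    p = p[p > 0]
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def renyi_mi(x, y, bins=4, alpha=2.0):
    """Assumed estimator: joint histogram over feature components,
    then H(X) + H(Y) - H(X, Y) with Rényi entropies."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    joint = joint / joint.sum()
    px = joint.sum(axis=1)  # marginal of x
    py = joint.sum(axis=0)  # marginal of y
    return (renyi_entropy(px, alpha) + renyi_entropy(py, alpha)
            - renyi_entropy(joint.ravel(), alpha))

def segment(spec, bases, threshold, bins=4, alpha=2.0):
    """Mutual information between consecutive frames, followed by a threshold.
    Treating values above the threshold as vocalization is an assumption."""
    feats = project(spec, bases)
    mi = np.array([
        renyi_mi(feats[:, t], feats[:, t + 1], bins, alpha)
        for t in range(feats.shape[1] - 1)
    ])
    return mi, mi > threshold
```

In use, `learn_bases` would be called once on the pooled training spectrograms, and `segment` on each test spectrogram. The threshold and the number of bases `k` would need tuning on held-out data.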