Speaker Identification in Medical Simulation Data Using Fisher Vector Representation
Authors: Shuangshuang Jiang, H. Frigui, A. Calhoun
Published in: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), 2015-12-01
DOI: 10.1109/ICMLA.2015.187 (https://doi.org/10.1109/ICMLA.2015.187)
Citation count: 2
Abstract
We present a robust speaker identification algorithm built on Fisher Vector (FV) feature representations. First, low-level spectral features are extracted from the training data. Next, we model the data in the spectral feature space with a mixture of Gaussian components. Then, we construct FV descriptors from the deviations of the features from the Gaussian components. We analyze the FV representations on speech data with two common classifiers: K-nearest neighbors (KNN) and support vector machines (SVM). The proposed approach is evaluated on audio data sets recorded in simulated medical crises. We show that the proposed FV representation achieves a significant improvement over state-of-the-art methods.
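The encoding step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the widely used Fisher Vector formulation (gradients with respect to the means and variances of a diagonal-covariance Gaussian mixture), assumes the mixture has already been fitted, and uses hypothetical names (`fisher_vector`, `toy_weights`, `toy_means`, `toy_variances`, `toy_frames`) and made-up toy values in place of real spectral features.

```python
import math

def fisher_vector(frames, weights, means, variances):
    """Encode a list of D-dim frames as a Fisher Vector of length 2*K*D.

    Assumption: a diagonal-covariance GMM with K components was fitted
    beforehand; fitting is not shown here.
    """
    T, K, D = len(frames), len(weights), len(means[0])
    g_mu = [[0.0] * D for _ in range(K)]
    g_var = [[0.0] * D for _ in range(K)]
    for x in frames:
        # log of w_k * N(x | mu_k, diag(var_k)) for each component
        log_p = []
        for k in range(K):
            ll = math.log(weights[k])
            for d in range(D):
                ll -= 0.5 * (math.log(2 * math.pi * variances[k][d])
                             + (x[d] - means[k][d]) ** 2 / variances[k][d])
            log_p.append(ll)
        # soft assignment (posterior) via a numerically stable softmax
        m = max(log_p)
        exp_p = [math.exp(v - m) for v in log_p]
        total = sum(exp_p)
        for k in range(K):
            gamma = exp_p[k] / total
            for d in range(D):
                # deviation of the frame from component k, in std units
                z = (x[d] - means[k][d]) / math.sqrt(variances[k][d])
                g_mu[k][d] += gamma * z
                g_var[k][d] += gamma * (z * z - 1.0)
    fv = []
    for k in range(K):  # gradients with respect to the means
        fv += [v / (T * math.sqrt(weights[k])) for v in g_mu[k]]
    for k in range(K):  # gradients with respect to the variances
        fv += [v / (T * math.sqrt(2 * weights[k])) for v in g_var[k]]
    return fv

# Toy, made-up mixture (K=2, D=2) and frames, for illustration only.
toy_weights = [0.5, 0.5]
toy_means = [[0.0, 0.0], [3.0, 3.0]]
toy_variances = [[1.0, 1.0], [1.0, 1.0]]
toy_frames = [[0.1, -0.2], [2.9, 3.3], [0.4, 0.0]]

fv = fisher_vector(toy_frames, toy_weights, toy_means, toy_variances)
print(len(fv))  # 8, i.e. 2 * K * D
```

Each utterance, whatever its number of frames, maps to one fixed-length vector, which is what allows classifiers such as KNN or SVM to be applied to it directly. Normalization steps that are common in practice (power and L2 normalization) are left out of this sketch.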