Age interval and gender prediction using PARAFAC2 applied to speech utterances
Evangelia Pantraki, Constantine Kotropoulos, A. Lanitis
2016 4th International Conference on Biometrics and Forensics (IWBF). DOI: 10.1109/IWBF.2016.7449694
Important problems in speech soft biometrics include the prediction of a speaker's age or gender. Here, these problems are addressed in the context of utterances collected over a long time period. A unified framework for age and gender prediction is proposed based on Parallel Factor Analysis 2 (PARAFAC2). PARAFAC2 is applied to a collection of three matrices: the speech utterance-feature matrix, whose columns are auditory cortical representations; the speaker age matrix, whose columns are indicator vectors of suitable dimension; and the speaker gender matrix, whose columns are indicator vectors associated with the speaker's gender. PARAFAC2 reduces the dimensionality of the auditory cortical representations by projecting them onto a semantic space dominated by the age and gender concepts, yielding a sketch (i.e., a feature vector of reduced dimension). To predict the speaker's age interval associated with a test utterance, the speech utterance sketch is pre-multiplied by the left singular vectors of the speaker age matrix. To predict the gender of the speaker who uttered a test utterance, the sketch is pre-multiplied by the left singular vectors of the speaker gender matrix. In both cases, a ranking vector is obtained and exploited for decision making. Promising results are demonstrated when the framework is applied to the Trinity College Dublin Speaker Ageing Database.
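The following is a minimal NumPy sketch of the prediction rule summarized above, under stated assumptions: random matrices stand in for the auditory cortical representations, and an ordinary least-squares projection stands in for the PARAFAC2-derived sketch, since the decomposition itself is not detailed in the abstract. All dimensions, data, and the fit_task/predict helpers are illustrative, not the authors' implementation.

```python
# Minimal sketch of the ranking-based prediction rule described in the abstract.
# Assumptions (not from the paper): synthetic stand-in features and a
# least-squares projection in place of the PARAFAC2-derived sketch.
import numpy as np

rng = np.random.default_rng(0)

n_train, feat_dim = 200, 64          # training utterances, feature dimension
n_age, n_gender = 6, 2               # age intervals and gender classes

# Stand-in training data: feature matrix and one-hot label matrices,
# all sharing the utterance (column) mode, as in the paper's setup.
X = rng.standard_normal((feat_dim, n_train))     # utterance-feature matrix
age_labels = rng.integers(0, n_age, n_train)
gen_labels = rng.integers(0, n_gender, n_train)
Y_age = np.eye(n_age)[age_labels].T              # (n_age, n_train) indicator matrix
Y_gen = np.eye(n_gender)[gen_labels].T           # (n_gender, n_train) indicator matrix


def fit_task(X, Y):
    """Return (U, W): left singular vectors of the label matrix Y and a
    feature-to-sketch projection W (least-squares stand-in for PARAFAC2)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)   # Y = U diag(s) Vt
    coords = np.diag(s) @ Vt                           # per-utterance coordinates
    W = coords @ np.linalg.pinv(X)                     # map features to the sketch space
    return U, W


def predict(x, U, W):
    """Sketch a test utterance, pre-multiply by the left singular vectors,
    and return the index of the top-ranked class plus the ranking vector."""
    sketch = W @ x                      # reduced-dimension representation
    ranking = U @ sketch                # ranking vector over the classes
    return int(np.argmax(ranking)), ranking


U_age, W_age = fit_task(X, Y_age)
U_gen, W_gen = fit_task(X, Y_gen)

x_test = X[:, 0]                        # reuse a training column as a toy test utterance
age_pred, age_rank = predict(x_test, U_age, W_age)
gen_pred, gen_rank = predict(x_test, U_gen, W_gen)
print("predicted age interval:", age_pred, "ranking:", np.round(age_rank, 2))
print("predicted gender:", gen_pred, "ranking:", np.round(gen_rank, 2))
```

In this toy setup the ranking vector for a training column approximately reproduces the corresponding indicator column, so taking the argmax plays the role of the decision-making step the abstract describes; the paper's actual sketches come from the PARAFAC2 factors rather than the least-squares stand-in used here.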