Robert F. Dickerson, Enamul Hoque, Philip Asare, S. Nirjon, J. Stankovic
IPSN-14 Proceedings of the 13th International Symposium on Information Processing in Sensor Networks, 2014-04-15. DOI: 10.1109/IPSN.2014.6846745 (https://doi.org/10.1109/IPSN.2014.6846745)
RESONATE: Reverberation environment simulation for improved classification of speech models
Home monitoring systems currently gather information about people's activities of daily living and about emergencies, but they lack the ability to track speech. Practical speech analysis solutions are needed to help monitor ongoing conditions such as depression, because the amount of social interaction and vocal affect are important for assessing mood and well-being. Existing solutions can classify the identity and mood of a speaker, but they perform poorly when the acoustic signals are captured in reverberant environments. In this paper, we present a practical reverberation compensation method called RESONATE, which uses simulated room impulse responses to adapt a training corpus for use in multiple real reverberant rooms. We demonstrate that the system creates robust classifiers that perform within 5-10% of the baseline accuracy achieved in non-reverberant environments. We evaluate the performance of this matched-condition strategy using a public dataset, in controlled experiments with six rooms, and in two long-term, uncontrolled real deployments. We offer a practical implementation that performs collection, feature extraction, and classification on-node, with training and simulation of training sets performed on a base station or cloud service.
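The matched-condition idea in the abstract (adapting a clean training corpus with simulated room impulse responses) can be illustrated with a minimal sketch. The abstract does not say which room simulator RESONATE uses, so the code below substitutes a common textbook approximation: Gaussian noise with an exponential decay set by the reverberation time RT60. The function names `synthetic_rir`, `reverberate`, and `adapt_corpus`, the 16 kHz sample rate, and the RT60 values are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def synthetic_rir(rt60_s, sample_rate=16000, seed=0):
    """Hypothetical stand-in for a room simulator.

    Builds an impulse response as Gaussian noise whose envelope
    falls by 60 dB after rt60_s seconds.
    """
    rng = np.random.default_rng(seed)
    n = int(round(rt60_s * sample_rate))
    t = np.arange(n) / sample_rate
    # A 60 dB drop is an amplitude factor of 1000 at t = rt60_s.
    envelope = np.exp(-np.log(1000.0) * t / rt60_s)
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # direct-path component
    return rir / np.max(np.abs(rir))


def reverberate(clean, rir):
    """Convolve a clean signal with an impulse response, keep the
    original length, and normalize the peak to 1."""
    wet = np.convolve(clean, rir)[: len(clean)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet


def adapt_corpus(corpus, rt60_values, sample_rate=16000):
    """Return one reverberated copy of the corpus per simulated room."""
    adapted = {}
    for rt60 in rt60_values:
        rir = synthetic_rir(rt60, sample_rate)
        adapted[rt60] = [reverberate(x, rir) for x in corpus]
    return adapted


if __name__ == "__main__":
    # A 0.1 s, 440 Hz tone stands in for a clean training utterance.
    tone = np.sin(2 * np.pi * 440 * np.arange(1600) / 16000)
    rooms = adapt_corpus([tone], rt60_values=[0.3, 0.6])
    for rt60, signals in rooms.items():
        print(rt60, len(signals), signals[0].shape)
```

Under this sketch, a classifier for a given room would be trained on the adapted copy whose simulated conditions best match that room, rather than on the clean corpus. A geometric room model could replace `synthetic_rir` without changing the rest of the pipeline.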