{"title":"Twitter:用于会话语音情感识别的自动标记数据的新在线来源","authors":"Christopher Hines, V. Sethu, J. Epps","doi":"10.1145/2813524.2813529","DOIUrl":null,"url":null,"abstract":"In the space of affect detection in multimedia, there is a strong demand for more tagged data in order to better understand human emotions, the way they are expressed, and approaches for detecting them automatically. Unfortunately, emotion datasets are typically small due to the manual process of annotating them with emotional labels. In response, we present for the first time the application of automatically tagged Twitter data to the problem of speech emotion recognition (SER). SER has been shown to benefit from the combination of acoustic and linguistic features, albeit when the linguistic training data is from the same database as the test data. Using the presence of emoticons for automatic tagging, we compile a corpus of over 800,000 tweets that is totally independent from our evaluation database. By supplementing an acoustic classifier with linguistic information, we classify the spontaneous content within the USC-IEMOCAP corpus on valence and activation descriptors. With comparison to prior literature, we demonstrate performance improvements for valence of 2% and 6% over an acoustic-only system, using linguistic training data from Twitter and IEMOCAP respectively.","PeriodicalId":197562,"journal":{"name":"Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Twitter: A New Online Source of Automatically Tagged Data for Conversational Speech Emotion Recognition\",\"authors\":\"Christopher Hines, V. Sethu, J. Epps\",\"doi\":\"10.1145/2813524.2813529\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the space of affect detection in multimedia, there is a strong demand for more tagged data in order to better understand human emotions, the way they are expressed, and approaches for detecting them automatically. Unfortunately, emotion datasets are typically small due to the manual process of annotating them with emotional labels. In response, we present for the first time the application of automatically tagged Twitter data to the problem of speech emotion recognition (SER). SER has been shown to benefit from the combination of acoustic and linguistic features, albeit when the linguistic training data is from the same database as the test data. Using the presence of emoticons for automatic tagging, we compile a corpus of over 800,000 tweets that is totally independent from our evaluation database. By supplementing an acoustic classifier with linguistic information, we classify the spontaneous content within the USC-IEMOCAP corpus on valence and activation descriptors. 
With comparison to prior literature, we demonstrate performance improvements for valence of 2% and 6% over an acoustic-only system, using linguistic training data from Twitter and IEMOCAP respectively.\",\"PeriodicalId\":197562,\"journal\":{\"name\":\"Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia\",\"volume\":\"34 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-10-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2813524.2813529\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2813524.2813529","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: In the space of affect detection in multimedia, there is strong demand for more tagged data in order to better understand human emotions, how they are expressed, and how to detect them automatically. Unfortunately, emotion datasets are typically small because annotating them with emotional labels is a manual process. In response, we present the first application of automatically tagged Twitter data to the problem of speech emotion recognition (SER). SER has been shown to benefit from combining acoustic and linguistic features, albeit only when the linguistic training data come from the same database as the test data. Using the presence of emoticons for automatic tagging, we compile a corpus of over 800,000 tweets that is entirely independent of our evaluation database. By supplementing an acoustic classifier with linguistic information, we classify the spontaneous content within the USC-IEMOCAP corpus on valence and activation descriptors. Compared with prior literature, we demonstrate valence performance improvements of 2% and 6% over an acoustic-only system, using linguistic training data from Twitter and from IEMOCAP respectively.
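The labelling step the abstract describes, using the presence of emoticons as noisy valence tags, can be illustrated with a minimal sketch. Everything below (the emoticon sets, the tokenisation, the function names) is an illustrative assumption, not the authors' actual pipeline:

```python
# Minimal sketch (assumptions, not the authors' code) of emoticon-based
# automatic valence tagging of tweets, as described in the abstract.

# Assumed emoticon-to-valence mapping; the paper's actual sets may differ.
POSITIVE = {":)", ":-)", ":D", ";)"}
NEGATIVE = {":(", ":-(", ":'("}

def label_tweet(text: str):
    """Return 'positive', 'negative', or None for ambiguous/untagged tweets."""
    tokens = set(text.split())
    has_pos = bool(tokens & POSITIVE)
    has_neg = bool(tokens & NEGATIVE)
    if has_pos and not has_neg:
        return "positive"
    if has_neg and not has_pos:
        return "negative"
    return None  # no emoticon, or conflicting cues: discard the tweet

def strip_emoticons(text: str) -> str:
    """Remove the tagging emoticons so a classifier cannot learn them directly."""
    return " ".join(t for t in text.split() if t not in POSITIVE | NEGATIVE)

tweets = [
    "had a great day at the beach :)",   # -> positive
    "my flight got cancelled again :(",  # -> negative
    "meeting ran long today",            # -> discarded (no emoticon)
]

corpus = [(strip_emoticons(t), label) for t in tweets
          if (label := label_tweet(t)) is not None]
print(corpus)
```

Stripping the emoticons from the retained tweets matters: otherwise a linguistic classifier trained on such a corpus could simply memorise the tagging cue itself rather than learning valence-bearing vocabulary.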