M. Ptaszynski, Rafal Rzepka, S. Oyama, M. Kurihara, K. Araki
Journal of Information Processing, 8(1), 429-445. Published 2014-01-01. DOI: https://doi.org/10.11185/IMT.9.429
A Survey on Large Scale Corpora and Emotion Corpora
In this paper we present a survey of natural language corpora, with a particular focus on large-scale corpora and corpora applicable to sentiment analysis. Natural language corpora are crucial for training various Software Engineering applications, from part-of-speech taggers and dependency parsers to dialog systems and sentiment analysis software. We compare several natural language corpora created for different languages and analyze their distinctive features and the amount of additional annotation provided by their developers.