A comprehensive study of classification techniques for sarcasm detection on textual data
Anandkumar D. Dave, Nikita Paritosh Desai
2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), 2016-03-03
DOI: 10.1109/ICEEOT.2016.7755036
Over the last decade, much research has addressed sentiment analysis of textual data available on the web. Sentiment analysis has its challenges, and one of them is sarcasm. Classifying sarcastic sentences is difficult because the same sarcastic intent can be expressed in many textual forms, and misclassification can affect many applications based on natural language processing. Sarcasm is a form of expression that conveys a sentiment different from the one literally presented. In this study we identify the supervised classification techniques mainly used for sarcasm detection, together with their features. We also analyze the results of these techniques on textual data in various languages from review sites, social media sites, and micro-blogging sites. Furthermore, for each method studied, we present an analysis of the data set generation and feature selection processes used. We also carried out a preliminary experiment to detect sarcastic sentences in Hindi. We trained an SVM classifier with 10-fold cross-validation, using a simple bag-of-words as features and TF-IDF as the frequency measure of each feature. This simple bag-of-words model correctly classified 50% of the sarcastic sentences. The preliminary experiment thus indicates that a simple bag-of-words representation is not sufficient for sarcasm detection.
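The feature side of the preliminary experiment can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which TF-IDF variant, tokenizer, or fold assignment was used, so the code below assumes plain term frequency times log(N/df), whitespace tokenization, and contiguous folds. The function names `tokenize`, `tfidf_vectors`, and `k_fold_indices` are hypothetical, and the SVM training step itself is left out.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: whitespace tokenization. A real system for Hindi text
    # would likely need a language-aware tokenizer.
    return sentence.lower().split()


def tfidf_vectors(sentences):
    """Return (vocabulary, vectors): one TF-IDF weighted bag-of-words
    vector per sentence, with columns ordered by the sorted vocabulary."""
    docs = [tokenize(s) for s in sentences]
    vocabulary = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of sentences that contain each token.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = []
        for tok in vocabulary:
            tf = counts[tok] / len(doc) if doc else 0.0
            idf = math.log(n_docs / doc_freq[tok])  # assumed variant: log(N/df)
            vec.append(tf * idf)
        vectors.append(vec)
    return vocabulary, vectors


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples)
                 if i < start or i >= start + size]
        yield train, test
        start += size


# Toy usage with made-up English sentences (not data from the paper).
vocab, vecs = tfidf_vectors(["great great movie", "bad movie"])
folds = list(k_fold_indices(20, k=10))
```

In this sketch a token that occurs in every sentence gets weight zero, which shows one limit of the representation: the vectors record which words appear and how often, not the contrast between literal wording and intended sentiment that sarcasm relies on.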