Is sentiment analysis an art or a science? Impact of lexical richness in training corpus on machine learning
Authors: Sanchit Garg, Aashish Saini, Nitika Khanna
Venue: 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI)
Published: 2016-09-01
DOI: 10.1109/ICACCI.2016.7732474
Social media is exploding with data that, if processed and analyzed in a timely fashion, can help you derive an optimal marketing strategy for the internet world, engage with your audience on the fly, and protect your reputation from smear campaigns. Digital marketing analysts and data scientists rely on social media analytics tools to deduce customer sentiment from countless opinions and reviews. While numerous attempts have been made in the past to improve the accuracy of these tools, we know surprisingly little about how accurate their results are. We present an unbiased study of users' tweets and the methods that leverage the available tools and technologies for opinion mining. Our prime focus is on improving the consistency of text classifiers used for linguistic analysis. We also measure the impact of lexical richness in the sample data on the trained algorithm. This paper attempts to improve the reliability of the sentiment classification process by creating a custom vote classifier using natural language processing techniques and various machine learning algorithms.
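The abstract does not say how lexical richness is measured or how the vote classifier combines its members, so the following is only a minimal sketch under stated assumptions: lexical richness is approximated by the type-token ratio (distinct tokens divided by total tokens), and the vote classifier is a plain majority vote that also reports how strongly its members agree. The function names and the three toy rule-based classifiers are hypothetical stand-ins for trained models, not the authors' implementation.

```python
from collections import Counter


def type_token_ratio(texts):
    """Assumed richness measure: distinct tokens / total tokens.

    Uses lowercase whitespace tokenization; returns 0.0 for an empty corpus.
    """
    tokens = [tok.lower() for text in texts for tok in text.split()]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def majority_vote(classifiers, text):
    """Return (winning label, share of classifiers that voted for it)."""
    votes = [clf(text) for clf in classifiers]
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


# Hypothetical toy classifiers standing in for trained models.
def keyword_clf(text):
    return "pos" if "good" in text.lower() else "neg"


def exclaim_clf(text):
    return "pos" if "!" in text else "neg"


def length_clf(text):
    return "pos" if len(text.split()) <= 4 else "neg"


# An odd number of members avoids ties when there are two labels.
CLASSIFIERS = [keyword_clf, exclaim_clf, length_clf]

if __name__ == "__main__":
    corpus = ["Good phone!", "good but the battery dies fast"]
    print("type-token ratio:", type_token_ratio(corpus))
    for tweet in corpus:
        print(tweet, "->", majority_vote(CLASSIFIERS, tweet))
```

The agreement share is one simple way to expose the consistency concern the abstract raises: a label backed by every member is more trustworthy than one that wins by a single vote.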