Distributed sentiment analysis of an agglutinative language via Spark by applying machine learning methods
Azamat Serek, A. Issabek, A. Bogdanchikov
2019 15th International Conference on Electronics, Computer and Computation (ICECCO), 2019-12-01
DOI: https://doi.org/10.1109/ICECCO48375.2019.9043264
Citations: 3
Abstract
A large number of studies on sentiment analysis are currently being conducted worldwide. Because the amount of data keeps increasing drastically, all natural language processing systems require a special approach oriented toward it. Implementing and evaluating such a mechanism was a strong incentive to conduct this research. In this paper, sentiment analysis of the Kazakh language was implemented via Spark, using a data set taken from Kazakh books. The data was initially unbalanced. A training F1 measure of over 90% was achieved.