Distributed sentiment analysis of an agglutinative language via Spark by applying machine learning methods
Azamat Serek, A. Issabek, A. Bogdanchikov
2019 15th International Conference on Electronics, Computer and Computation (ICECCO), 2019-12-01
DOI: 10.1109/ICECCO48375.2019.9043264
A large number of studies on sentiment analysis are currently being conducted worldwide. Because the amount of data keeps increasing drastically, all natural language processing systems require a special approach oriented toward it. Implementing and evaluating such a mechanism was a strong incentive to conduct this research. In this paper, sentiment analysis was implemented for the Kazakh language via Spark, using a data set taken from Kazakh books. The data was initially unbalanced. A training F1 measure of over 90% was achieved.
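
The abstract reports an F1 measure on data that was initially unbalanced. As a minimal sketch of why F1 is a common choice in that setting, the following plain-Python example computes F1 as the harmonic mean of precision and recall and compares it with accuracy. The labels, the `f1_score` helper, and the always-majority classifier are hypothetical illustrations, not the authors' data, code, or Spark pipeline.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical unbalanced labels: 9 negative (0) and 1 positive (1).
y_true = [0] * 9 + [1]
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
minority_f1 = f1_score(y_true, y_pred, positive=1)

print(f"accuracy = {accuracy:.2f}, minority-class F1 = {minority_f1:.2f}")
```

In this toy case the accuracy looks high while the F1 for the minority class is zero, which is the kind of gap that makes F1 more informative than accuracy when classes are unbalanced.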