Distributed sentiment analysis of an agglutinative language via Spark by applying machine learning methods

2019 15th International Conference on Electronics, Computer and Computation (ICECCO). Pub Date: 2019-12-01. DOI: 10.1109/ICECCO48375.2019.9043264

Azamat Serek, A. Issabek, A. Bogdanchikov

Citations: 3

Abstract

A large number of studies on sentiment analysis are currently being conducted worldwide. Because the amount of data keeps growing drastically, natural language processing systems require approaches designed for large-scale data. Implementing and evaluating such a mechanism was a strong incentive for this research. In this paper, sentiment analysis of the Kazakh language was implemented via Spark on a data set taken from Kazakh books. The data was initially unbalanced. A training F1 measure of over 90% was achieved.
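The abstract reports an F1 measure on data that was initially unbalanced. As a minimal illustration of why F1 is more informative than accuracy in that setting, the sketch below computes both in plain Python. It is not the authors' Spark pipeline; the function names and the toy labels are hypothetical and chosen only for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical unbalanced sample: 9 negative (0) labels, 1 positive (1).
y_true = [0] * 9 + [1]

# A classifier that always predicts the majority class.
always_negative = [0] * 10

# A classifier that finds the positive example at the cost of one false alarm.
finds_positive = [0] * 8 + [1, 1]

if __name__ == "__main__":
    for name, y_pred in [("always_negative", always_negative),
                         ("finds_positive", finds_positive)]:
        print(name, accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

On this toy sample the majority-class predictor has high accuracy but an F1 of zero for the minority class, which is why F1 is the more telling figure when classes are unbalanced.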
