Developing Turkish sentiment lexicon for sentiment analysis using online news media

Fatih Saglam, H. Sever, Burkay Genç
Published in: 2016 IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA)
Publication date: 2016-11-01
DOI: https://doi.org/10.1109/AICCSA.2016.7945670
Citations: 15

Abstract

The Internet is a rich source of documents that can be analysed to extract their sentiment. Sentiment analysis, a subfield of natural language processing, addresses this task. A sentiment lexicon in the target language is an important resource for researchers in the field. Because most sentiment analysis studies have been conducted on English text, methods and resources developed for English may not produce the desired results in other languages. Turkish has no rich sentiment lexicon comparable to SentiWordNet for English. In this study, we aimed to develop a Turkish sentiment lexicon, and we enhanced an existing lexicon of 27K Turkish words to 37K words. To quantify the performance of the enhanced lexicon, we tested both lexicons on domain-independent news texts. The accuracy of determining the polarity of Turkish news texts increased from 60.6% to 72.2%.
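The abstract does not say how a lexicon is applied to a news text or how accuracy is computed, so the following is only a minimal sketch of one common approach: sum the lexicon scores of the words in a text, take the sign as the polarity, and compare against gold labels. The word scores, sample texts, and the names `BASE_LEXICON`, `EXTENDED_LEXICON`, `polarity`, and `accuracy` are all hypothetical illustrations, not the authors' data or method.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicons: word -> polarity score in [-1, 1].
# These entries are illustrative only and are not taken from the paper.
BASE_LEXICON: Dict[str, float] = {
    "iyi": 0.8,    # "good"
    "kötü": -0.8,  # "bad"
}
EXTENDED_LEXICON: Dict[str, float] = {
    **BASE_LEXICON,
    "güzel": 0.7,  # "beautiful"
    "kriz": -0.6,  # "crisis"
}


def polarity(text: str, lexicon: Dict[str, float]) -> str:
    """Label a text by the sign of the summed scores of its known words."""
    # Whitespace tokenisation is an assumption made for brevity.
    score = sum(lexicon.get(token, 0.0) for token in text.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(samples: List[Tuple[str, str]], lexicon: Dict[str, float]) -> float:
    """Fraction of samples whose predicted polarity matches the gold label."""
    correct = sum(1 for text, gold in samples if polarity(text, lexicon) == gold)
    return correct / len(samples)


# Hypothetical labelled samples (already lowercased).
SAMPLES: List[Tuple[str, str]] = [
    ("ekonomi iyi", "positive"),
    ("büyük kriz", "negative"),
    ("hava güzel", "positive"),
    ("durum kötü", "negative"),
]

if __name__ == "__main__":
    print("base lexicon accuracy:", accuracy(SAMPLES, BASE_LEXICON))
    print("extended lexicon accuracy:", accuracy(SAMPLES, EXTENDED_LEXICON))
```

The sketch only shows why wider lexicon coverage can raise accuracy: words missing from the smaller lexicon contribute nothing to the score. Turkish is agglutinative, so a practical system would likely need stemming or morphological analysis before lookup, which is omitted here.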