Developing Turkish sentiment lexicon for sentiment analysis using online news media
Fatih Saglam, H. Sever, Burkay Genç
2016 IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA), 2016-11-01
DOI: 10.1109/AICCSA.2016.7945670 (https://doi.org/10.1109/AICCSA.2016.7945670)
Citations: 15
Abstract
The Internet is a very rich source of documents that need to be analysed to extract their sentiment values. Sentiment analysis, a subfield of natural language processing, focuses on this problem. Sentiment lexicons in the target language are a very important resource for researchers working in sentiment analysis. Because most sentiment analysis studies have been conducted on English text, methods and resources developed for English may not produce the desired results in other languages. Turkish has no rich sentiment lexicon comparable to SentiWordNet for English. In this study, we aimed to develop a Turkish sentiment lexicon, and we expanded an existing lexicon of 27K Turkish words to 37K words. To quantify the performance of the enhanced lexicon, we tested both lexicons on domain-independent news texts. The accuracy of determining the polarity of Turkish news texts increased from 60.6% to 72.2%.
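The evaluation described in the abstract, scoring news texts with a lexicon and comparing accuracy before and after the lexicon is enlarged, can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract does not say how words are scored or how texts are classified, so the summed-score rule, the function names `polarity` and `accuracy`, the word scores, and the four toy documents are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def polarity(text: str, lexicon: Dict[str, float]) -> str:
    """Classify a text by summing the scores of the tokens found in the lexicon.

    Assumption: whitespace tokenization and a plain sum. Real Turkish text
    would need locale-aware casing and morphological analysis, because
    Turkish is agglutinative and surface forms rarely match lexicon entries.
    """
    score = sum(lexicon.get(token, 0.0) for token in text.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(docs: List[Tuple[str, str]], lexicon: Dict[str, float]) -> float:
    """Return the fraction of (text, gold_label) pairs classified correctly."""
    correct = sum(1 for text, label in docs if polarity(text, lexicon) == label)
    return correct / len(docs)


# Hypothetical toy lexicons; the scores are invented for this example.
base_lexicon = {"iyi": 1.0, "kötü": -1.0}
enhanced_lexicon = {**base_lexicon, "başarı": 1.0, "kriz": -1.0}

# Hypothetical labelled snippets standing in for news texts.
docs = [
    ("ekonomi iyi", "positive"),
    ("büyük başarı", "positive"),
    ("ekonomik kriz", "negative"),
    ("hava kötü", "negative"),
]

if __name__ == "__main__":
    print("base lexicon accuracy:", accuracy(docs, base_lexicon))
    print("enhanced lexicon accuracy:", accuracy(docs, enhanced_lexicon))
```

On this toy data the larger lexicon scores higher only because it covers words the smaller one misses. That coverage effect is one plausible reason an enlarged lexicon could help, but the sketch says nothing about the reported 60.6% and 72.2% figures.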