TurkSentGraphExp: an inherent graph aware explainability framework from pre-trained LLM for Turkish sentiment analysis.

IF 3.5 | Zone 4 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

PeerJ Computer Science Pub Date : 2025-03-21 eCollection Date: 2025-01-01 DOI:10.7717/peerj-cs.2729

Yasir Kilic, Cagatay Neftali Tulu

PeerJ Computer Science, volume 11, e2729. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11935768/pdf/

Citations: 0

Abstract

Sentiment classification is a widely studied problem in natural language processing (NLP) that focuses on identifying the sentiment expressed in text and categorizing it into predefined classes, such as positive, negative, or neutral. As sentiment classification solutions are increasingly integrated into real-world applications, such as analyzing customer feedback in business reviews (e.g., hotel reviews) or monitoring public sentiment on social media, the importance of both their accuracy and explainability has become widely acknowledged. In Turkish, the problem is more challenging because of the language's complex agglutinative structure. Many solutions have been proposed in the literature, but they are generally based on black-box models. The explainability of such artificial intelligence (AI) models has therefore become as important as their accuracy, which has increased the importance of studies on explaining model decisions. Most existing studies explain a model's decision in terms of the importance of a single feature or token, but this does not provide full explainability given the complex lexical and semantic relations in text. To fill these gaps in the Turkish NLP literature, we propose TurkSentGraphExp, a graph-aware explainability solution for Turkish sentiment analysis. The solution provides both classification and explanation for Turkish texts by considering the semantic structure of suffixes, accommodating the agglutinative nature of Turkish, and capturing complex relationships through graph representations. Unlike traditional black-box learning models, the framework leverages an inherent graph representation learning (GRL) model to introduce rational phrase-level explainability.
We conduct several experiments to quantify the effectiveness of this framework. The experimental results indicate that the proposed model achieves a 10 to 40% improvement in explainability compared to state-of-the-art methods across varying sparsity levels, further highlighting its effectiveness and robustness. Moreover, the experimental results, supported by a case study, reveal that the semantic relationships arising from affixes in Turkish texts can be identified as part of the model's decision-making process, demonstrating the proposed solution's ability to effectively capture the agglutinative structure of Turkish.
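
The idea of a suffix-aware graph with phrase-level explanations at a chosen sparsity level can be illustrated with a small sketch. This is not the authors' implementation: the hand-written morpheme segmentation, the edge rules, the importance scores, and the names `build_morpheme_graph` and `explain` are all assumptions made for illustration, and sparsity is taken here to mean the fraction of nodes dropped from the explanation.

```python
# Hand-segmented morphemes for "Oteli beğenmedim" ("I did not like the hotel").
# Assumption: the segmentation is written by hand; a real pipeline would get it
# from a morphological analyzer or a subword tokenizer.
WORDS = [["otel", "-i"], ["beğen", "-me", "-di", "-m"]]


def build_morpheme_graph(words):
    """Return (nodes, edges): morpheme nodes and undirected index pairs."""
    nodes, edges, stems = [], set(), []
    for morphemes in words:
        start = len(nodes)
        nodes.extend(morphemes)
        stems.append(start)
        # Chain each suffix to the morpheme before it inside the same word.
        for i in range(start + 1, len(nodes)):
            edges.add((i - 1, i))
    # Link the stems of neighbouring words (an illustrative edge rule).
    for a, b in zip(stems, stems[1:]):
        edges.add((a, b))
    return nodes, sorted(edges)


def explain(nodes, edges, scores, sparsity):
    """Drop roughly a `sparsity` fraction of nodes, keeping the highest scores.

    Returns the kept morphemes and the edges that connect kept nodes.
    """
    keep_n = max(1, round(len(nodes) * (1 - sparsity)))
    ranked = sorted(range(len(nodes)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep_n])
    kept_set = set(kept)
    kept_edges = [(a, b) for a, b in edges if a in kept_set and b in kept_set]
    return [nodes[i] for i in kept], kept_edges


nodes, edges = build_morpheme_graph(WORDS)
# Hypothetical importance scores, one per node; a trained model would supply these.
scores = [0.30, 0.05, 0.80, 0.90, 0.10, 0.05]
phrase, phrase_edges = explain(nodes, edges, scores, sparsity=0.5)
print(phrase, phrase_edges)
```

With these made-up scores, the retained subgraph connects the stem "beğen" to the negation suffix "-me", which is the kind of affix-driven relationship that a single-token importance score would present only in isolation.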


Source journal

PeerJ Computer Science (Computer Science: General Computer Science)

CiteScore: 6.10
Self-citation rate: 5.30%
Articles published: 332
Review time: 10 weeks

About the journal: PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.