Sarcasm detection using optimized bi-directional long short-term memory

IF 3.1, Zone 4 (Computer Science), JCR Q3 (Computer Science, Artificial Intelligence)

Knowledge and Information Systems, Pub Date: 2024-09-06, DOI: 10.1007/s10115-024-02210-7

Vidyullatha Sukhavasi, Venkatrama Phani kumar Sistla, Venkatesulu Dondeti

Citations: 0

Abstract

The number of social network users continues to grow day by day, driven by the widespread use of interactive social networking sites such as Twitter, Facebook, and Instagram. On these sites, users generate posts, and the attitudes of followers towards factors such as situation, sound, and feeling can be analysed. Analysing feelings accurately is hard for most people, and it is one of the most difficult problems in natural language processing. Some people express their opinions with a meaning that differs from the literal one, and this sophisticated form of expressing sentiment through irony or mockery is termed sarcasm. Sarcastic comments, tweets, or feedback can mislead data mining activities and may result in inaccurate predictions. Several existing models are used for sarcasm detection, but they suffer from low accuracy, high time consumption, limited training ability, and severe overfitting. To overcome these limitations, this research introduces an effective model for detecting sarcasm. First, data are collected from the publicly available sarcasmania and Generic sarcasm-Not sarcasm (Gen-Sarc-Notsarc) datasets. The collected data are pre-processed using stemming and stop word removal. Features are extracted using the inverse filtering (IF) model through hash index creation, keyword matching, and ranking. The optimal features are selected using the adaptive search and rescue (ASAR) optimization algorithm. To enhance the accuracy of sarcasm detection, an optimized Bi-LSTM-based deep learning model is proposed by integrating bi-directional long short-term memory (Bi-LSTM) with group teaching optimization (GTO). An LSTM + GTO model is also proposed so that its performance can be compared with that of the Bi-LSTM + GTO model. The proposed models are compared with existing classifier approaches, using Python, to demonstrate their superiority. Accuracies of 98.24% and 98.36% are attained on the sarcasmania and Gen-Sarc-Notsarc datasets, respectively.
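
The pre-processing step named in the abstract (stemming and stop word removal) can be illustrated with a minimal Python sketch. The abstract does not say which stemmer or stop word list the authors used, so `STOP_WORDS`, `SUFFIXES`, `tokenize`, `naive_stem`, and `preprocess` below are hypothetical names, and the suffix stripper is a simplified stand-in, not the Porter stemmer and not necessarily the authors' method.

```python
import re

# Hypothetical, deliberately tiny stop word list (an assumption for illustration;
# the abstract does not state which list was used).
STOP_WORDS = frozenset({
    "a", "an", "the", "is", "are", "was", "i", "to", "of",
    "and", "in", "on", "it", "this", "that", "so", "just",
})

# Suffixes removed by the naive stemmer, checked in this order.
# This is a toy rule set, not the Porter algorithm.
SUFFIXES = ("ing", "ed", "ly", "s")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token: str) -> str:
    """Strip the first matching suffix, keeping a stem of at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]
```

For example, `preprocess("Oh great, another meeting. I just LOVE waiting in queues!")` returns `['oh', 'great', 'another', 'meet', 'love', 'wait', 'queue']`. In practice a library stemmer and a fuller stop word list would likely replace these toy versions before feature extraction.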

Source journal

Knowledge and Information Systems (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 5.70

Self-citation rate: 7.40%

Articles published: 152

Review time: 7.2 months

About the journal: Knowledge and Information Systems (KAIS) provides an international forum for researchers and professionals to share their knowledge and report new advances on all topics related to knowledge systems and advanced information systems. This monthly peer-reviewed archival journal publishes state-of-the-art research reports on emerging topics in KAIS, reviews of important techniques in related areas, and application papers of interest to a general readership.