Semantics Preserving Emoji Recommendation with Large Language Models

Zhongyi Qiu, Kangyi Qiu, Hanjia Lyu, Wei Xiong, Jiebo Luo

arXiv:2409.10760 · arXiv - CS - Social and Information Networks · Published 2024-09-16
Abstract
Emojis have become an integral part of digital communication, enriching text by conveying emotions, tone, and intent. Existing emoji recommendation methods are primarily evaluated on their ability to match the exact emoji a user chose in the original text. However, this criterion overlooks the nature of emoji use on social media, where a single text can correspond to multiple reasonable emojis. To better assess a model's alignment with such real-world emoji usage, we propose a new semantics preserving evaluation framework for emoji recommendation, which measures a model's ability to recommend emojis that maintain semantic consistency with the user's text. To evaluate how well a model preserves semantics, we assess whether the predicted affective state, demographic profile, and attitudinal stance of the user remain unchanged. If these attributes are preserved, we consider the recommended emojis to have maintained the original semantics. The advanced abilities of Large Language Models (LLMs) in understanding and generating nuanced, contextually relevant output make them well suited to the complexities of semantics preserving emoji recommendation. To this end, we construct a comprehensive benchmark to systematically assess the performance of six proprietary and open-source LLMs on our task using different prompting techniques. Our experiments show that GPT-4o outperforms the other LLMs, achieving a semantics preservation score of 79.23%. Additionally, we conduct case studies to analyze model biases in downstream classification tasks and to evaluate the diversity of the recommended emojis.
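
As a rough illustration of the evaluation idea described above, the Python sketch below (not the authors' released code) counts a recommendation as semantics-preserving when downstream classifiers for affective state, demographic profile, and attitudinal stance return the same labels for the text with and without the recommended emoji. The classifier callables and the append-the-emoji construction of the augmented text are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List

# A classifier here is any function mapping a text to a predicted label,
# e.g. an affect, demographic, or stance classifier (hypothetical stand-ins).
Classifier = Callable[[str], str]


def preserves_semantics(
    original_text: str,
    recommended_emoji: str,
    classifiers: Dict[str, Classifier],
) -> bool:
    """True if every attribute prediction is unchanged after adding the emoji."""
    # Assumption for illustration: the recommended emoji is appended to the text.
    augmented_text = f"{original_text} {recommended_emoji}"
    return all(
        clf(original_text) == clf(augmented_text)
        for clf in classifiers.values()
    )


def semantics_preservation_score(
    samples: List[Dict[str, str]],
    classifiers: Dict[str, Classifier],
) -> float:
    """Fraction of recommendations whose attribute predictions are preserved."""
    if not samples:
        return 0.0
    kept = sum(
        preserves_semantics(s["text"], s["emoji"], classifiers) for s in samples
    )
    return kept / len(samples)
```

With this framing, a score such as the reported 79.23% would correspond to the share of test samples for which all attribute predictions remain unchanged; the exact classifiers and text-construction details used in the paper may differ from this sketch.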