Title: BNVGLENET: Hypercomplex Bangla handwriting character recognition with hierarchical class expansion using Convolutional Neural Networks
Authors: Jabed Omor Bappi, Mohammad Abu Tareq Rony, Mohammad Shariful Islam
Natural Language Processing Journal, Volume 7, Article 100068. Published 2024-04-15. DOI: 10.1016/j.nlp.2024.100068

Abstract: Object recognition technology has made significant strides, yet recognizing handwritten Bangla characters, including symbols and compound forms, remains a challenging problem due to the prevalence of cursive writing and many ambiguous characters. The complexity and variability of the Bangla script and individuals' unique handwriting styles make it difficult to achieve satisfactory performance in practical applications, and the best existing recognizers are far less effective than those developed for English alphanumeric characters. Compared with other major languages, there are limited options for recognizing handwritten Bangla characters. This study has the potential to improve the accuracy and effectiveness of handwriting recognition systems for Bengali, a language spoken by over 200 million people worldwide. The paper investigates the application of Convolutional Neural Networks (CNNs) for recognizing Bangla handwritten characters, with a particular focus on enlarging the set of recognized character classes. To this end, a novel and challenging handwriting recognition dataset is introduced, collected from the handwriting of numerous students at two institutions. A novel CNN-based approach called BNVGLENET is proposed, which recognizes Bangla handwritten characters by modifying LeNet-5 and combining it with the VGG architecture, and which proves particularly effective at identifying characters in Bengali handwriting. The models are evaluated systematically not only on the custom dataset but also on the publicly available Bangla handwritten character dataset known as the Grapheme dataset. The approach achieves a state-of-the-art recognition accuracy of 98.2% on the custom vowel-consonant test classes and 97.5% on the custom individual-character classes. These improvements narrow a notable gap between practical needs and the actual performance of Bangla handwritten character recognition systems.

{"title":"Advancing NLP models with strategic text augmentation: A comprehensive study of augmentation methods and curriculum strategies","authors":"Himmet Toprak Kesgin, Mehmet Fatih Amasyali","doi":"10.1016/j.nlp.2024.100071","DOIUrl":"https://doi.org/10.1016/j.nlp.2024.100071","url":null,"abstract":"<div><p>This study conducts a thorough evaluation of text augmentation techniques across a variety of datasets and natural language processing (NLP) tasks to address the lack of reliable, generalized evidence for these methods. It examines the effectiveness of these techniques in augmenting training sets to improve performance in tasks such as topic classification, sentiment analysis, and offensive language detection. The research emphasizes not only the augmentation methods, but also the strategic order in which real and augmented instances are introduced during training. A major contribution is the development and evaluation of Modified Cyclical Curriculum Learning (MCCL) for augmented datasets, which represents a novel approach in the field. Results show that specific augmentation methods, especially when integrated with MCCL, significantly outperform traditional training approaches in NLP model performance. These results underscore the need for careful selection of augmentation techniques and sequencing strategies to optimize the balance between speed and quality improvement in various NLP tasks. The study concludes that the use of augmentation methods, especially in conjunction with MCCL, leads to improved results in various classification tasks, providing a foundation for future advances in text augmentation strategies in NLP.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"7 ","pages":"Article 100071"},"PeriodicalIF":0.0,"publicationDate":"2024-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000190/pdfft?md5=841354620e15317d1fd328df74581e7d&pid=1-s2.0-S2949719124000190-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140551700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey of text summarization: Techniques, evaluation and challenges","authors":"Supriyono , Aji Prasetya Wibawa , Suyono , Fachrul Kurniawan","doi":"10.1016/j.nlp.2024.100070","DOIUrl":"https://doi.org/10.1016/j.nlp.2024.100070","url":null,"abstract":"<div><p>This paper explores the complex field of text summarization in Natural Language Processing (NLP), with particular attention to the development and importance of semantic understanding. Text summarization is a crucial component of natural language processing (NLP), which helps to translate large amounts of textual data into clear and understandable representations. As the story progresses, it demonstrates the dynamic transition from simple syntactic structures to sophisticated models with semantic comprehension. In order to effectively summarize, syntactic, semantic, and pragmatic concerns become crucial, highlighting the necessity of capturing not only grammar but also the context and underlying meaning. It examines the wide range of summarization models, from conventional extractive techniques to state-of-the-art tools like pre-trained models. Applications are found in many different fields, demonstrating how versatile summarizing techniques are. Semantic drift and domain-specific knowledge remain obstacles, despite progress. In the future, the study predicts developments like artificial intelligence integration and transfer learning, which motivates academics to investigate these prospects for advancement. The approach, which is based on the PRISMA framework, emphasizes a methodical and open literature review. The work attempts to further natural language processing (NLP) and text summarization by combining various research findings and suggesting future research directions in this dynamic subject.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"7 ","pages":"Article 100070"},"PeriodicalIF":0.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000189/pdfft?md5=59f885a43c999d64a8b2382f368be608&pid=1-s2.0-S2949719124000189-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140542600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Utilization of generative AI for the characterization and identification of visual unknowns
Authors: Kara Combs, Trevor J. Bihl, Subhashini Ganapathy
Natural Language Processing Journal, Volume 7, Article 100064. Published 2024-03-25. DOI: 10.1016/j.nlp.2024.100064

Abstract: Current state-of-the-art artificial intelligence (AI) struggles to interpret out-of-library objects accurately. One proposed remedy is analogical reasoning (AR), which uses abductive reasoning to draw inferences about an unfamiliar scenario from knowledge of a similar, familiar scenario. Current applications of visual AR gravitate toward analogy-formatted image problems rather than real-world computer vision datasets. This paper proposes the Image Recognition Through Analogical Reasoning Algorithm (IRTARA) and its "generative AI" version, GIRTARA, which describe and predict out-of-library visual objects. IRTARA characterizes an out-of-library object through a list of words called the "term frequency list"; GIRTARA uses the term frequency list to predict what the out-of-library object is. To evaluate the quality of IRTARA's results, both quantitative and qualitative assessments are used, including a baseline that compares the automated methods with human-generated results. The accuracy of GIRTARA's predictions is calculated through a cosine similarity analysis. The study found that IRTARA produced consistent term frequency lists across the three evaluation methods for its high-quality results, and GIRTARA achieved up to a 65% match in cosine similarity against the out-of-library objects' true labels.

{"title":"Claim detection for automated fact-checking: A survey on monolingual, multilingual and cross-lingual research","authors":"Rrubaa Panchendrarajan, Arkaitz Zubiaga","doi":"10.1016/j.nlp.2024.100066","DOIUrl":"https://doi.org/10.1016/j.nlp.2024.100066","url":null,"abstract":"<div><p>Automated fact-checking has drawn considerable attention over the past few decades due to the increase in the diffusion of misinformation on online platforms. This is often carried out as a sequence of tasks comprising (i) the detection of sentences circulating in online platforms which constitute claims needing verification, followed by (ii) the verification process of those claims. This survey focuses on the former, by discussing existing efforts towards detecting claims needing fact-checking, with a particular focus on multilingual data and methods. This is a challenging and fertile direction where existing methods are yet far from matching human performance due to the profoundly challenging nature of the issue. Especially, the dissemination of information across multiple social platforms, articulated in multiple languages and modalities demands more generalized solutions for combating misinformation. Focusing on multilingual misinformation, we present a comprehensive survey of existing multilingual claim detection research. We present state-of-the-art multilingual claim detection research categorized into three key factors of the problem, verifiability, priority, and similarity. Further, we present a detailed overview of the existing multilingual datasets along with the challenges and suggest possible future advancements.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"7 ","pages":"Article 100066"},"PeriodicalIF":0.0,"publicationDate":"2024-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000141/pdfft?md5=43cfb5b770cda4c03e5933e454d8f5bd&pid=1-s2.0-S2949719124000141-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140290305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Ensemble learning with soft-prompted pretrained language models for fact checking
Authors: Shaoqin Huang, Yue Wang, Eugene Y.C. Wong, Lei Yu
Natural Language Processing Journal, Volume 7, Article 100067. Published 2024-03-21. DOI: 10.1016/j.nlp.2024.100067

Abstract: Infectious disease outbreaks, such as the COVID-19 pandemic, have led to a surge of information on the internet, including misinformation, necessitating fact-checking tools. However, fact-checking claims related to infectious diseases poses challenges due to the gap between informal claims and formal evidence and the presence of multiple aspects within a single claim. To address these issues, we propose a soft prompt-based ensemble learning framework for COVID-19 fact checking. To understand complex assertions in informal social media texts, we explore various soft prompt structures that take advantage of the T5 language model and ensemble these prompt structures together. Soft prompts offer flexibility and better generalization compared to hard prompts. The ensemble model captures linguistic cues and contextual information in COVID-19-related data and thus enhances generalization to new claims. Experimental results demonstrate that prompt-based ensemble learning improves fact-checking accuracy and provides a promising approach to combating misinformation during the pandemic. In addition, the method shows strong zero-shot learning capability and can thus be applied to a variety of fact-checking problems.

{"title":"LeanContext: Cost-efficient domain-specific question answering using LLMs","authors":"Md Adnan Arefeen , Biplob Debnath , Srimat Chakradhar","doi":"10.1016/j.nlp.2024.100065","DOIUrl":"https://doi.org/10.1016/j.nlp.2024.100065","url":null,"abstract":"<div><p>Question-answering (QA) is a significant application of Large Language Models (LLMs), shaping chatbot capabilities across healthcare, education, and customer service. However, widespread LLM integration presents a challenge for small businesses due to the high expenses of LLM API usage. Costs rise rapidly when domain-specific data (context) is used alongside queries for accurate domain-specific LLM responses. Extracting context from domain-specific data is implemented by a Retrieval Augmented Generation (RAG) approach. One option is to summarize the RAG context by using LLMs and reduce the context. However, this can also filter out useful information that is necessary to answer some domain-specific queries. In this paper, we shift from human-oriented summarizers to AI model-friendly summaries. Our approach, LeanContext, efficiently extracts <em>k</em> key sentences from the context that are closely aligned with the query. The choice of <em>k</em> is neither static nor random; we introduce a reinforcement learning technique that dynamically determines <em>k</em> based on the query and context. The rest of the less important sentences are either reduced using a free open-source text reduction method or eliminated. We evaluate LeanContext against several recent query-aware and query-unaware context reduction approaches on prominent datasets (arxiv papers and BBC news articles, NarrativeQA). Despite cost reductions of 37.29% to 67.81%, LeanContext’s ROUGE-1 score decreases only by 1.41% to 2.65% compared to a baseline that retains the entire context (no summarization). LeanContext stands out for its ability to provide precise responses, outperforming competitors by leveraging open-source summarization techniques. Human evaluations of the responses further confirm and validate this superiority. Additionally, if open-source pre-trained LLM-based summarizers are used to reduce context (into human consumable summaries), LeanContext can further modify the reduced context to enhance the accuracy (ROUGE-1 score) by 13.22% to 24.61%.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"7 ","pages":"Article 100065"},"PeriodicalIF":0.0,"publicationDate":"2024-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S294971912400013X/pdfft?md5=635c034287e104fec6128cc735fdc367&pid=1-s2.0-S294971912400013X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140180775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Understanding latent affective bias in large pre-trained neural language models
Authors: Anoop Kadan, Deepak P., Sahely Bhadra, Manjary P. Gangan, Lajish V.L.
Natural Language Processing Journal, Volume 7, Article 100062. Published 2024-03-05. DOI: 10.1016/j.nlp.2024.100062

Abstract: The development of transformer-based large Pre-trained Language Models (PLMs) has brought groundbreaking advances and substantial performance improvements to deep-learning-based Natural Language Processing. The wide availability of unlabeled data within the human-generated data deluge, together with self-supervised learning strategies, has accelerated the success of large PLMs in language generation, language understanding, and related tasks. At the same time, latent historical bias and unfairness toward particular genders, races, and other groups, encoded intentionally or unintentionally into these corpora, harms and calls into question the utility and efficacy of large PLMs in many real-world applications, particularly for protected groups. In this paper, we present an extensive investigation into the existence of "affective bias" in large PLMs, to unveil any biased association of emotions such as anger, fear, or joy with a particular gender, race, or religion in the downstream task of textual emotion detection. We begin our exploration with corpus-level affective bias analysis, searching for imbalanced distributions of affective words within a domain in the large-scale corpora used to pre-train and fine-tune PLMs. Then, to quantify affective bias in model predictions, we perform an extensive set of class-based and intensity-based evaluations using various bias evaluation corpora. Our results show statistically significant affective bias in PLM-based emotion detection systems, indicating biased associations of certain emotions with a particular gender, race, and religion.

Title: Recent advancements and challenges of NLP-based sentiment analysis: A state-of-the-art review
Authors: Jamin Rahman Jim, Md Apon Riaz Talukder, Partha Malakar, Md Mohsin Kabir, Kamruddin Nur, M.F. Mridha
Natural Language Processing Journal, Volume 6, Article 100059. Published 2024-03-01. DOI: 10.1016/j.nlp.2024.100059

Abstract: Sentiment analysis is a method within natural language processing that evaluates and identifies the emotional tone or mood conveyed in textual data. By scrutinizing words and phrases, it categorizes text as positive, negative, or neutral. The significance of sentiment analysis lies in its capacity to derive valuable insights from extensive textual data, empowering businesses to grasp customer sentiments, make informed choices, and enhance their offerings. Further advancement of sentiment analysis requires a deep understanding of its algorithms, applications, current performance, and challenges. Therefore, in this extensive survey, we begin by exploring the vast array of application domains for sentiment analysis, scrutinizing them within the context of existing research. We then delve into prevalent pre-processing techniques, datasets, and evaluation metrics to enhance comprehension. We also explore Machine Learning, Deep Learning, Large Language Models, and pre-trained models in sentiment analysis, providing insights into their advantages and drawbacks. Subsequently, we review the experimental results and limitations of recent state-of-the-art articles. Finally, we discuss the diverse challenges encountered in sentiment analysis and propose future research directions to mitigate these concerns. This extensive review provides a complete understanding of sentiment analysis, covering its models, application domains, results analysis, challenges, and research directions.

{"title":"Transformer-based text similarity and second language proficiency: A case of written production by learners of Korean","authors":"Gyu-Ho Shin , Boo Kyung Jung , Seongmin Mun","doi":"10.1016/j.nlp.2024.100060","DOIUrl":"https://doi.org/10.1016/j.nlp.2024.100060","url":null,"abstract":"<div><p>The present study applies two transformer models (BERT; GPT-2) to analyse argumentative essays produced by two first-language groups (Czech; English) of second-language learners of Korean and investigates how informative similarity scores of learner writing obtained by these models explain general language proficiency in Korean. Results show three major aspects on model performance. First, the relationships between the similarity scores and the proficiency scores differ from the tendencies between the human rating scores and the proficiency scores. Second, the degree to which the similarity scores obtained by each model explain the proficiency scores is asymmetric and idiosyncratic. Third, the performance of the two models is affected by learners’ native language and essay topic. These findings invite the need for researchers and educators to pay attention to how computational algorithms operate, together with learner language characteristics and language-specific properties of the target language, in utilising Natural Language Processing methods and techniques for their research or instructional purposes.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"6 ","pages":"Article 100060"},"PeriodicalIF":0.0,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000086/pdfft?md5=c5357abe0301e49c473990485a85a9a2&pid=1-s2.0-S2949719124000086-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139998913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}