Natural Language Processing Journal: Latest Articles

Summarizing long scientific documents through hierarchical structure extraction
Natural Language Processing Journal, Pub Date: 2024-05-29, DOI: 10.1016/j.nlp.2024.100080
Grishma Sharma, Deepak Sharma, M. Sasikumar
Abstract: In the realm of academia, staying updated with the latest advancements has become increasingly difficult due to the rapid rise in scientific publications. Text summarization emerges as a solution to this challenge by distilling essential contributions into concise summaries. Despite the structured nature of scientific documents, current summarization techniques often overlook this valuable structural information. Our proposed method addresses this gap through an unsupervised, extractive, user preference-based, and hierarchical iterative graph-based ranking algorithm for summarizing long scientific documents. Unlike existing approaches, our method operates by leveraging the inherent structural information within scientific texts to generate diverse summaries tailored to user preferences. To assess the efficiency of our approach, we conducted evaluations on two distinct long document datasets: ScisummNet and a custom dataset comprising papers from esteemed journals and conferences with human-extracted sentences as gold summaries. The results, obtained using ROUGE scores as the automatic evaluation metric as well as human evaluation, demonstrate that our method performs better than other well-known unsupervised algorithms. This emphasizes the need for structural information in text summarization, enabling more effective and customizable solutions.
Volume 8, Article 100080. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000281/pdfft?md5=7e249fba3a7dd6613770889389366f05&pid=1-s2.0-S2949719124000281-main.pdf
Citations: 0
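The abstract above describes an unsupervised, extractive, graph-based ranking approach. As a rough illustration of that general family only, the following is a minimal TextRank-style sketch. It is not the authors' hierarchical, preference-aware algorithm: the Jaccard similarity, the damping factor of 0.85, and the function names `rank_sentences` and `summarize` are all assumptions chosen for illustration.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Jaccard overlap between two token sets (an assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration over a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    # weights[i][j] is the edge weight between sentences i and j (no self-loops).
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

A hierarchical variant would presumably apply a ranking of this kind within and across document sections, but the abstract does not give enough detail to reproduce that step.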
Decoding depression: Analyzing social network insights for depression severity assessment with transformers and explainable AI
Natural Language Processing Journal, Pub Date: 2024-05-13, DOI: 10.1016/j.nlp.2024.100079
Tasnim Ahmed, Shahriar Ivan, Ahnaf Munir, Sabbir Ahmed
Abstract: Depression is a mental state characterized by recurrent feelings of melancholy, hopelessness, and disinterest in activities, having a significant negative influence on everyday functioning and general well-being. Millions of users express their thoughts and emotions on social media platforms, which can be used as a rich source of data for early detection of depression. In this connection, this work leverages an ensemble of transformer-based architectures for quantifying the severity of depression from social media posts into four categories: non-depressed, mild, moderate, and severe. At first, a diverse range of preprocessing techniques is employed to enhance the quality and relevance of the input. Then, the preprocessed samples are passed through three variants of transformer-based models, namely vanilla BERT, BERTweet, and ALBERT, for generating predictions, which are combined using a weighted soft-voting approach. We conduct a comprehensive explainability analysis to gain deeper insights into the decision-making process, examining both local and global perspectives. Furthermore, to the best of our knowledge, we are the first to explore the extent to which a Large Language Model (LLM) like ‘ChatGPT’ can perform this task. Evaluation of the model on the publicly available ‘DEPTWEET’ dataset produces state-of-the-art performance with a 13.5% improvement in AUC–ROC score.
Volume 7, Article 100079. PDF: https://www.sciencedirect.com/science/article/pii/S294971912400027X/pdfft?md5=5d658d840266d01d808f9f0280aa58df&pid=1-s2.0-S294971912400027X-main.pdf
Citations: 0
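The weighted soft-voting step mentioned in the abstract above can be sketched as a weighted average of each model's class probabilities. This is a minimal sketch under stated assumptions: the function names and any probability or weight values passed in are hypothetical, and the abstract does not say how the authors chose their weights.

```python
# The four severity labels come from the abstract; everything else is illustrative.
LABELS = ["non-depressed", "mild", "moderate", "severe"]


def weighted_soft_vote(prob_lists, weights):
    """Return the weighted average of per-model class probability vectors."""
    if len(prob_lists) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for probs, w in zip(prob_lists, weights)) / total
        for c in range(n_classes)
    ]


def predict(prob_lists, weights):
    """Pick the label with the highest combined probability."""
    combined = weighted_soft_vote(prob_lists, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return LABELS[best], combined
```

Unlike hard voting, which counts each model's top label, soft voting lets a confident model outweigh two uncertain ones.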
SIDU-TXT: An XAI algorithm for NLP with a holistic assessment approach
Natural Language Processing Journal, Pub Date: 2024-05-09, DOI: 10.1016/j.nlp.2024.100078
Mohammad N.S. Jahromi, Satya M. Muddamsetty, Asta Sofie Stage Jarlner, Anna Murphy Høgenhaug, Thomas Gammeltoft-Hansen, Thomas B. Moeslund
Abstract: Explainable AI (XAI) is pivotal for understanding complex ‘black-box’ models, particularly in text analysis, where transparency is essential yet challenging. This paper introduces SIDU-TXT, an adaptation of the ‘Similarity Difference and Uniqueness’ (SIDU) method, originally applied in image classification, to textual data. SIDU-TXT generates word-level heatmaps using feature activation maps, highlighting contextually important textual elements for model predictions. Given the absence of a unified standard for assessing XAI methods, to evaluate SIDU-TXT, we implement a comprehensive three-tiered evaluation framework (Functionally-Grounded, Human-Grounded, and Application-Grounded) across varied experimental setups. Our findings show SIDU-TXT’s effectiveness in sentiment analysis, outperforming benchmarks like Grad-CAM and LIME in both Functionally and Human-Grounded assessments. In a legal domain application involving complex asylum decision-making, SIDU-TXT displays competitive but not conclusive results, underscoring the nuanced expectations of domain experts. This work advances the field by offering a methodical holistic approach to XAI evaluation in NLP, urging further research to bridge the existing gap in expert expectations and refine interpretability methods for intricate applications. The study underscores the critical role of extensive evaluations in fostering AI technologies that are not only technically faithful to the model but also comprehensible and trustworthy for end-users.
Volume 7, Article 100078. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000268/pdfft?md5=dbdccfd078388f5068c55b70fac52f1d&pid=1-s2.0-S2949719124000268-main.pdf
Citations: 0
Harnessing large language models over transformer models for detecting Bengali depressive social media text: A comprehensive study
Natural Language Processing Journal, Pub Date: 2024-05-04, DOI: 10.1016/j.nlp.2024.100075
Ahmadul Karim Chowdhury, Saidur Rahman Sujon, Md. Shirajus Salekin Shafi, Tasin Ahmmad, Sifat Ahmed, Khan Md Hasib, Faisal Muhammad Shah
Abstract: In an era where the silent struggle of underdiagnosed depression pervades globally, our research delves into the crucial link between mental health and social media. This work focuses on early detection of depression, particularly in extroverted social media users, using LLMs such as GPT 3.5, GPT 4 and our proposed GPT 3.5 fine-tuned model DepGPT, as well as advanced deep learning models (LSTM, Bi-LSTM, GRU, BiGRU) and Transformer models (BERT, BanglaBERT, SahajBERT, BanglaBERT-Base). The study categorized Reddit and X datasets into “Depressive” and “Non-Depressive” segments, translated into Bengali by native speakers with expertise in mental health, resulting in the creation of the Bengali Social Media Depressive Dataset (BSMDD). Our work provides full architecture details for each model and a methodical way to assess their performance in Bengali depressive text categorization using zero-shot and few-shot learning techniques. Our work demonstrates the superiority of SahajBERT and Bi-LSTM with FastText embeddings in their respective domains, tackles explainability issues with transformer models, and emphasizes the effectiveness of LLMs, especially DepGPT (GPT 3.5 fine-tuned), demonstrating flexibility and competence in a range of learning contexts. According to the experiment results, the proposed model, DepGPT, outperformed not only Alpaca Lora 7B in zero-shot and few-shot scenarios but also every other model, achieving a near-perfect accuracy of 0.9796 and an F1-score of 0.9804, high recall, and exceptional precision. Although competitive, GPT-3.5 Turbo and Alpaca Lora 7B show relatively poorer effectiveness in zero-shot and few-shot situations. The work emphasizes the effectiveness and flexibility of LLMs in a variety of linguistic circumstances, providing insightful information about the complex field of depression detection models.
Volume 7, Article 100075. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000232/pdfft?md5=6264329603560d04e05467aa89f65a60&pid=1-s2.0-S2949719124000232-main.pdf
Citations: 0
Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Position Paper
Natural Language Processing Journal, Pub Date: 2024-04-30, DOI: 10.1016/j.nlp.2024.100076
Lovre Torbarina, Tin Ferkovic, Lukasz Roguski, Velimir Mihelcic, Bruno Sarlija, Zeljko Kraljevic
Abstract: The increasing adoption of natural language processing (NLP) models across industries has led to practitioners’ need for machine learning (ML) systems to handle these models efficiently, from training to serving them in production. However, training, deploying, and updating multiple models can be complex, costly, and time-consuming, mainly when using transformer-based pre-trained language models. Multi-Task Learning (MTL) has emerged as a promising approach to improve efficiency and performance through joint training, rather than training separate models. Motivated by this, we present an overview of MTL approaches in NLP, followed by an in-depth discussion of our position on opportunities they introduce to a set of challenges across various ML lifecycle phases including data engineering, model development, deployment, and monitoring. Our position emphasizes the role of transformer-based MTL approaches in streamlining these lifecycle phases, and we assert that our systematic analysis demonstrates how transformer-based MTL in NLP effectively integrates into ML lifecycle phases. Furthermore, we hypothesize that developing a model that combines MTL for periodic re-training, and continual learning for continual updates and new capabilities integration could be practical, although its viability and effectiveness still demand a substantial empirical investigation.
Volume 7, Article 100076. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000244/pdfft?md5=9be47fda7d1ff816f43310f77a7417c3&pid=1-s2.0-S2949719124000244-main.pdf
Citations: 0
MISTRA: Misogyny Detection through Text–Image Fusion and Representation Analysis
Natural Language Processing Journal, Pub Date: 2024-04-30, DOI: 10.1016/j.nlp.2024.100073
Nitesh Jindal, Prasanna Kumar Kumaresan, Rahul Ponnusamy, Sajeetha Thavareesan, Saranya Rajiakodi, Bharathi Raja Chakravarthi
Abstract: Detecting misogynous memes poses a significant challenge due to the presence of multiple modalities (image + text). The inherent complexity arises from the lack of direct correspondence between the textual and visual elements, where an image and overlaid text often convey disparate meanings. Additionally, memes conveying messages of hatred or taunting, particularly targeted towards women, present additional comprehension difficulties. This article introduces the MISTRA framework, which leverages variational autoencoders for dimensionality reduction of the large-sized image features before fusing multimodal features. The framework also harnesses the capabilities of large language models through transfer learning to develop fusion embeddings by extracting and concatenating features from different modalities (image, text, and image-generated caption text) for the misogynous classification task. The components of the framework include state-of-the-art models such as the Vision Transformer model (ViT), textual model (DistilBERT), CLIP (Contrastive Language–Image Pre-training), and BLIP (Bootstrapping Language–Image Pre-training for Unified Vision-Language Understanding and Generation) models. Our experiments are conducted on the SemEval-2022 Task 5 MAMI dataset. To establish a baseline model, we perform separate experiments using the Naive Bayes machine learning classifier on meme texts and ViT on meme images. We evaluate the performance on six different bootstrap samples and report evaluation metrics such as precision, recall, and Macro-F1 score for each bootstrap sample. Additionally, we compute the confidence interval on our evaluation scores and conduct paired t-tests to understand whether our best-performing model differs significantly from the other experiments. The experimental results demonstrate that the dimensionality reduction approach on multimodal features with a multilayer perceptron classifier achieved the highest performance with a Macro-F1 score of 71.5 percent, outperforming the baseline approaches in individual modalities.
Volume 7, Article 100073. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000219/pdfft?md5=b1becf6173b99dae8a0f29ea4d466646&pid=1-s2.0-S2949719124000219-main.pdf
Citations: 0
Enhancing aspect-based sentiment analysis with BERT-driven context generation and quality filtering
Natural Language Processing Journal, Pub Date: 2024-04-30, DOI: 10.1016/j.nlp.2024.100077
Chuanjun Zhao, Rong Feng, Xuzhuang Sun, Lihua Shen, Jing Gao, Yanjie Wang
Abstract: Fine-grained sentiment analysis, commonly referred to as aspect-based sentiment analysis (ABSA), has garnered substantial attention in both academic and industrial circles. ABSA focuses on unveiling the sentiment orientation associated with specific entities or attributes within textual data, resulting in a more precise depiction of intricate emotional nuances. However, due to the extensive range of applications for ABSA, certain domains face challenges such as constrained dataset sizes and the absence of exhaustive, high-quality corpora, leading to issues like few-shot learning and resource scarcity scenarios. To address the issue of limited training dataset sizes, one viable approach involves the utilization of text-based context generation to expand the dataset. In this study, we combine BERT-based text generation with text filtering algorithms to formulate our model. Our model fully leverages contextual information using the BERT model, with a particular emphasis on the interrelationships between sentences. This approach effectively integrates the relationships between sentences and labels, resulting in the creation of an initial data augmentation corpus. Subsequently, filtering algorithms have been devised to enhance the quality of the initial augmentation corpus by eliminating low-quality generated data, ultimately yielding the final text-enhanced dataset. Experimental findings on the SemEval-2014 Laptop and Restaurant datasets demonstrate that the augmented dataset improves text quality and markedly boosts the performance of models for aspect-level sentiment classification.
Volume 7, Article 100077. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000256/pdfft?md5=fb2f6fcf5ed35029fd2b0d07eb4519d0&pid=1-s2.0-S2949719124000256-main.pdf
Citations: 0
An innovative GPT-based open-source intelligence using historical cyber incident reports
Natural Language Processing Journal, Pub Date: 2024-04-24, DOI: 10.1016/j.nlp.2024.100074
Fahim Sufi
Abstract: In contemporary discourse, the pervasive influences of Generative Pre-trained Transformers (GPT) and Large Language Models (LLM) are evident, showcasing diverse applications. GPT-based technologies, transcending mere summarization, exhibit adeptness in discerning critical information from extensive textual corpora. Through prudent extraction of semantically meaningful content from textual representations, GPT technologies engender automated feature extraction, a departure from the fallible manual extraction methodologies. This study posits an innovative paradigm for extracting multidimensional cyber threat-related features from textual depictions of cyber events, leveraging the prowess of GPT. These extracted features serve as inputs for artificial intelligence (AI) and deep learning algorithms, including Convolutional Neural Network (CNN), Decomposition analysis, and Natural Language Processing (NLP)-based modalities tailored for non-technical cyber strategists. The proposed framework empowers cyber strategists or analysts to articulate inquiries regarding historical cyber incidents in plain English, with the NLP-based interaction facet of the system proffering cogent AI-driven insights in natural language. Furthermore, salient insights, often elusive in dynamic visualizations, are succinctly presented in plain language. Empirical validation of the entire system ensued through autonomous acquisition of semantically enriched contextual information concerning 214 major cyber incidents spanning from 2016 to 2023. GPT-based responses on Actor Type, Target, Attack Source (i.e., Country Originating Attack), Attack Destination (i.e., Targeted Country), Attack Level, Attack Type, and Attack Timeline underwent critical AI-driven analysis. This comprehensive 7-dimensional information gleaned from the corpus of 214 incidents yielded 1498 informative outputs, attaining a precision of 96%, a recall of 98%, and an F1-Score of 97%.
Volume 7, Article 100074. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000220/pdfft?md5=51fa56bc0f6ecc9df3ea7e02efce3208&pid=1-s2.0-S2949719124000220-main.pdf
Citations: 0
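The figures reported in the abstract above are internally consistent, which a short calculation can confirm. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the variable names are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# 214 incidents, each described along 7 dimensions (Actor Type, Target,
# Attack Source, Attack Destination, Attack Level, Attack Type, Attack Timeline).
incidents = 214
dimensions = 7
outputs = incidents * dimensions

# Reported precision and recall, expressed as fractions.
f1 = f1_score(0.96, 0.98)

print(outputs)        # 1498
print(round(f1, 2))   # 0.97
```

The product 214 × 7 gives the 1498 outputs, and 2 × 0.96 × 0.98 / (0.96 + 0.98) ≈ 0.9699 rounds to the reported 97%.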
Enabling action crossmodality for a pretrained large language model
Natural Language Processing Journal, Pub Date: 2024-04-20, DOI: 10.1016/j.nlp.2024.100072
Anton Caesar, Ozan Özdemir, Cornelius Weber, Stefan Wermter
Abstract: Natural language processing and vision tasks have recently seen large improvements through the rise of Transformer architectures. The high-performing large language models (LLMs) benefit from large textual datasets that are abundantly available online. However, action and bidirectional action-language tasks are less developed, as these require more specific and labeled data. Therefore, we aim at enabling these robotic action capabilities for a pretrained LLM, while maintaining high efficiency with regard to the required training time and data size. To achieve this, we split up a Transformer-based LLM and insert a multimodal architecture into it. Specifically, we split a pretrained T5 LLM between its encoder and decoder parts, to insert a crossmodal Transformer component of a Paired Transformed Autoencoders (PTAE) bidirectional action-language model. The experiments are conducted on a new dataset, consisting of unimodal language translation and crossmodal bidirectional action-language translation. The natural language capabilities of the original T5 are re-established efficiently by training the crossmodal Transformer, which requires only one 5.7 millionth of the T5 model’s original training data. Furthermore, the new model, called CrossT5, achieves high accuracy for the vision- and language-guided robotic action tasks. By design, the CrossT5 agent acts robustly when tested with language commands not included in the dataset. The results demonstrate that this novel approach is successful in combining the advanced linguistic capabilities of LLMs with the low-level robotic control skills of vision-action models. The code is available at this URL: https://github.com/samsoneko/CrossT5.
Volume 7, Article 100072. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000207/pdfft?md5=cc42b6eb8402b00afc108e973be38c4c&pid=1-s2.0-S2949719124000207-main.pdf
Citations: 0
Sentiment analysis of Bangla language using a new comprehensive dataset BangDSA and the novel feature metric skipBangla-BERT
Natural Language Processing Journal, Pub Date: 2024-04-16, DOI: 10.1016/j.nlp.2024.100069
Md. Shymon Islam, Kazi Masudul Alam
Abstract: In this modern technologically advanced world, Sentiment Analysis (SA) is a very important topic in every language due to its various trendy applications. However, SA work for the Bangla language remains scarce. This work focuses on examining different hybrid feature extraction techniques and learning algorithms for Bangla Document-level Sentiment Analysis using a new comprehensive dataset (BangDSA) of 203,493 comments collected from various microblogging sites. The proposed BangDSA dataset approximately follows Zipf’s law, covering 32.84% function words with a vocabulary growth rate of 0.053, and is tagged with both 15 and 3 categories. In this study, we have implemented 21 different hybrid feature extraction methods including Bag of Words (BOW), N-gram, TF-IDF, TF-IDF-ICF, Word2Vec, FastText, GloVe, Bangla-BERT, etc., with CBOW and Skipgram mechanisms. The proposed novel method (Bangla-BERT+Skipgram), skipBangla-BERT, outperforms all other feature extraction techniques in machine learning (ML), ensemble learning (EL) and deep learning (DL) approaches. Among the models built from the ML, EL and DL domains, the hybrid method CNN-BiLSTM surpasses the others. The best accuracy obtained for the CNN-BiLSTM model is 90.24% in 15 categories and 95.71% in 3 categories. A Friedman test has been performed on the obtained results to assess statistical significance. For both the 15-category and 3-category settings, the results of the statistical test are significant.
Volume 7, Article 100069. PDF: https://www.sciencedirect.com/science/article/pii/S2949719124000177/pdfft?md5=2a4b5d5dc62f48201e142e0cf3b9cb09&pid=1-s2.0-S2949719124000177-main.pdf
Citations: 0