{"title":"The fine art of fine-tuning: A structured review of advanced LLM fine-tuning techniques","authors":"Samar Pratap , Alston Richard Aranha , Divyanshu Kumar , Gautam Malhotra , Anantharaman Palacode Narayana Iyer , Shylaja S.S.","doi":"10.1016/j.nlp.2025.100144","DOIUrl":"10.1016/j.nlp.2025.100144","url":null,"abstract":"<div><div>Transformer-based models have consistently demonstrated superior accuracy compared to various traditional models across a range of downstream tasks. However, due to their large size, training or fine-tuning them for specific tasks imposes heavy computational and memory demands. This makes the creation of specialized transformer-based models nearly impossible in commonly encountered resource-constrained scenarios. To tackle this issue and to make these large models more accessible, a plethora of techniques has been developed. In this study, we review the types of techniques developed and their impact and benefits in terms of performance and resource usage, along with the latest developments in the domain. We have broadly categorized these techniques into six key areas: Changes in Training Method, Changes in Adapter, Quantization, Parameter Selection, Mixture of Experts, and Application-based methods. 
We collated the results of various techniques on common benchmarks and also evaluated their performance on different datasets and base models.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"11 ","pages":"Article 100144"},"PeriodicalIF":0.0,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143738563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
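One adapter-style technique commonly covered in such surveys, LoRA, can be illustrated with a minimal sketch. This is plain Python with hypothetical toy dimensions, not the survey authors' code: a frozen weight matrix W is augmented with a trainable low-rank update scaled by alpha/r, and the two can be merged for inference.

```python
# LoRA-style low-rank update sketch (toy sizes, pure Python).
# W is the frozen pretrained weight; only A (r x k) and B (d x r) are trained.

def matmul(X, Y):
    """Multiply two matrices given as lists of lists."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_merge(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A, the merged inference-time weight."""
    BA = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Toy example: d = 2, k = 2, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]          # r x k
B = [[2.0], [0.0]]        # d x r
merged = lora_merge(W, A, B, alpha=1.0, r=1)
# merged == [[2.0, 1.0], [0.0, 1.0]]
```

The memory saving comes from training only the d*r + r*k adapter parameters instead of the full d*k weight, which is the common thread behind the adapter-based category above.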
{"title":"LLMs for product classification in e-commerce: A zero-shot comparative study of GPT and Claude models","authors":"Konstantinos I. Roumeliotis , Nikolaos D. Tselikas , Dimitrios K. Nasiopoulos","doi":"10.1016/j.nlp.2025.100142","DOIUrl":"10.1016/j.nlp.2025.100142","url":null,"abstract":"<div><div>In the rapidly evolving e-commerce landscape, efficient and accurate product classification is essential for enhancing customer experience and streamlining operations. Traditional product classification methods, which depend heavily on labeled data and manual effort, struggle with scalability and adaptability to diverse product categories. This study explores the transformative potential of large language models (LLMs) for zero-shot product classification in e-commerce, addressing the challenge of automating product categorization without prior labeled training data. We evaluate the performance of four state-of-the-art LLMs — GPT-4o, GPT-4o mini, Claude 3.5 Sonnet, and Claude 3.5 Haiku — on a diverse dataset of 248 product categories, each containing 20 samples, structured into 8 subsets. Each model performs zero-shot classification, assigning products to predefined categories without prior exposure. Our findings reveal significant variations in classification accuracy across models, with certain LLMs demonstrating superior scalability and adaptability for real-world e-commerce applications. Based on these insights, we developed API software to integrate the top-performing models into e-commerce systems, enhancing automation and efficiency. 
This study underscores the transformative role of LLMs in revolutionizing e-commerce workflows and recommends their adoption for scalable, intelligent product classification.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"11 ","pages":"Article 100142"},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143724168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
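The zero-shot setup described in this abstract can be sketched as follows. This is a hedged illustration, not the authors' software: the category list is a hypothetical subset, and the stubbed `llm_call` stands in for any text-in/text-out GPT or Claude client.

```python
# Zero-shot product classification sketch: build a prompt that constrains the
# model to a fixed category list, then parse its free-text reply back to a label.

CATEGORIES = ["Electronics", "Home & Kitchen", "Toys & Games"]  # hypothetical subset

def build_prompt(title, categories):
    opts = "\n".join(f"- {c}" for c in categories)
    return (f"Classify the product into exactly one category.\n"
            f"Product: {title}\nCategories:\n{opts}\n"
            f"Answer with the category name only.")

def parse_label(reply, categories):
    """Match the model's reply to a known category (case-insensitive)."""
    reply = reply.strip().lower()
    for c in categories:
        if c.lower() in reply:
            return c
    return None  # unparseable reply -> counts as a misclassification

def classify(title, llm_call):
    """llm_call is any text-in/text-out function (e.g., a GPT or Claude client)."""
    return parse_label(llm_call(build_prompt(title, CATEGORIES)), CATEGORIES)

# Stubbed model standing in for a real API call:
label = classify("USB-C charging cable", lambda p: "Electronics.")
```

Constraining the answer format and mapping replies back to the closed label set is what makes zero-shot accuracy measurable across models without any labeled training data.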
{"title":"OdNER: NER resource creation and system development for low-resource Odia language","authors":"Tusarkanta Dalai , Anupam Das , Tapas Kumar Mishra , Pankaj Kumar Sa","doi":"10.1016/j.nlp.2025.100139","DOIUrl":"10.1016/j.nlp.2025.100139","url":null,"abstract":"<div><div>This work aims to enhance the usability of natural language processing (NLP) based systems for the low-resource Odia language by developing an effective named entity recognition (NER) system. NLP applications rely heavily on NER to extract relevant information from massive amounts of unstructured text. NER refers to the task of identifying the named entities in a given text and classifying them into a set of predetermined categories. NER has already achieved strong results in English as well as in a number of other European languages. However, because of a lack of supporting tools and resources, it has not yet been thoroughly investigated in Indian languages, particularly Odia. Recently, approaches based on machine learning (ML) and deep learning (DL) have demonstrated exceptional performance on NLP tasks. Moreover, transformer models, particularly masked-language models (MLM), have demonstrated remarkable efficacy in the NER task; nevertheless, these methods generally call for massive volumes of annotated corpora. Unfortunately, we could not find any open-source NER corpus for the Odia language. The purpose of this research is to compile OdNER, a NER dataset with quality baselines for the low-resource Odia language. The OdNER corpus contains 48,000 sentences with 671,354 tokens and 98,116 named entities annotated with 12 tags. To establish the quality of our corpus, we use conditional random field (CRF) and BiLSTM models as baselines. 
To demonstrate the efficacy of our dataset, we conduct a comparative evaluation of various transformer-based multilingual language models (IndicBERT, MuRIL, XLM-R) and utilize them to carry out the sequence labeling task for NER. With the pre-trained XLM-R multilingual model, our dataset achieves a maximum F1 score of 90.48%. To the best of our knowledge, no other Odia NER work matches the quality and quantity of ours. We anticipate that this work will make substantial progress toward the development of NLP tasks for the Odia language.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"11 ","pages":"Article 100139"},"PeriodicalIF":0.0,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143644136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
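The sequence-labeling formulation used for NER above can be illustrated with a minimal BIO-decoding sketch. This is generic pure Python, not the OdNER pipeline: the tokens and tag names are hypothetical, but the decoding convention is the standard one used when scoring CRF, BiLSTM, or XLM-R token-classification outputs at the entity level.

```python
# BIO decoding sketch: recover (entity_text, entity_type) spans from
# token-level tags, the representation used to train and evaluate
# sequence labelers for NER.

def decode_bio(tokens, tags):
    """Return a list of (entity_text, entity_type) from BIO-tagged tokens."""
    entities, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == etype:
            current.append(tok)
        else:  # "O" or an inconsistent I- tag closes the open span
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        entities.append((" ".join(current), etype))
    return entities

spans = decode_bio(["Bhubaneswar", "is", "in", "Odisha"],
                   ["B-LOC", "O", "O", "B-LOC"])
# spans == [("Bhubaneswar", "LOC"), ("Odisha", "LOC")]
```

Entity-level F1 scores such as the 90.48% reported above are computed over spans recovered this way, not over individual token tags.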
{"title":"Comparative analysis of Mixture-of-Agents models for natural language inference with ANLI data","authors":"Swathi Sowmya Bavirthi, Dama Pranati Sreya, Tanguturi Poojitha","doi":"10.1016/j.nlp.2025.100140","DOIUrl":"10.1016/j.nlp.2025.100140","url":null,"abstract":"<div><div>The Mixture-of-Agents (MoA) framework represents a significant contribution to artificial intelligence (AI) by enhancing the capabilities of large language models (LLMs) through the integration of multiple specialized agents. This approach addresses the limitations of traditional single-agent models, enabling more robust reasoning, improved accuracy in natural language inference (NLI), and better adaptability to diverse linguistic contexts. The key contribution to AI lies in MoA’s ability to dynamically orchestrate these agents, each focusing on different aspects of a task, leading to a more comprehensive and effective problem-solving approach. In the domain of engineering, MoA finds its application in real-time decision-making systems, particularly in autonomous systems and intelligent control environments. By deploying MoA within these systems, we demonstrate its effectiveness in enhancing precision and reliability in language-based decision-making processes. 
This integration significantly improves the system’s ability to adapt to dynamic scenarios, making MoA a valuable tool for bridging the gap between advanced AI methodologies and practical engineering solutions.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"11 ","pages":"Article 100140"},"PeriodicalIF":0.0,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143686231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BeliN: A novel corpus for Bengali religious news headline generation using contextual feature fusion","authors":"Md Osama , Ashim Dey , Kawsar Ahmed , Muhammad Ashad Kabir","doi":"10.1016/j.nlp.2025.100138","DOIUrl":"10.1016/j.nlp.2025.100138","url":null,"abstract":"<div><div>Automatic text summarization, particularly headline generation, remains a critical yet under-explored area for Bengali religious news. Existing approaches to headline generation typically rely solely on the article content, overlooking crucial contextual features such as sentiment, category, and aspect. This limitation significantly hinders their effectiveness and overall performance. This study addresses this limitation by introducing a novel corpus, BeliN (Bengali Religious News) – comprising religious news articles from prominent Bangladeshi online newspapers, and <em>MultiGen</em> – a contextual multi-input feature fusion headline generation approach. Leveraging transformer-based pre-trained language models such as BanglaT5, mBART, mT5, and mT0, <em>MultiGen</em> integrates additional contextual features – including category, aspect, and sentiment – with the news content. This fusion enables the model to capture critical contextual information often overlooked by traditional methods. Experimental results demonstrate the superiority of <em>MultiGen</em> over the baseline approach that uses only news content, achieving a BLEU score of 18.61 and ROUGE-L score of 24.19, compared to baseline approach scores of 16.08 and 23.08, respectively. These findings underscore the importance of incorporating contextual features in headline generation for low-resource languages. By bridging linguistic and cultural gaps, this research advances natural language processing for Bengali and other under-represented languages. 
To promote reproducibility and further exploration, the dataset and implementation code are publicly accessible at <span><span>https://github.com/akabircs/BeliN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"11 ","pages":"Article 100138"},"PeriodicalIF":0.0,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143629386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fine-tuning text-to-SQL models with reinforcement-learning training objectives","authors":"Xuan-Bang Nguyen , Xuan-Hieu Phan , Massimo Piccardi","doi":"10.1016/j.nlp.2025.100135","DOIUrl":"10.1016/j.nlp.2025.100135","url":null,"abstract":"<div><div>Text-to-SQL is an important natural language processing task that helps users automatically convert natural language queries into formal SQL code. While transformer-based models have pushed text-to-SQL to unprecedented accuracy levels in recent years, such performance is confined to models of very large size that can only be run in specialised clouds. For this reason, in this paper we explore the use of reinforcement learning to improve the performance of models of more conservative size, which can fit within standard user hardware. As reinforcement learning reward, we propose a novel function which better aligns with the text-to-SQL evaluation metrics, applied in conjunction with two strong policy gradient algorithms, REINFORCE and RELAX. Our experimental results over the popular Spider benchmark show that the proposed approach has been able to outperform a conventionally-trained T5 Small baseline by 6.6 pp (percentage points) of exact-set-match accuracy and 4.6 pp of execution accuracy, and a T5 Base baseline by 2.0 pp and 1.9 pp, respectively. 
The proposed model has also achieved a remarkable comparative performance against ChatGPT instances.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"10 ","pages":"Article 100135"},"PeriodicalIF":0.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143510043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
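The REINFORCE objective mentioned above can be illustrated on a toy categorical "decoder". This is a hedged pure-Python sketch: the reward function below (1 for one correct action, 0 otherwise) is a stand-in for the paper's metric-aligned reward, and the three-action policy is a hypothetical stand-in for SQL generation.

```python
import math
import random

# REINFORCE sketch: the gradient of E[R] w.r.t. the logits of a categorical
# policy is E[(R - baseline) * (one_hot(a) - softmax(logits))], estimated
# by sampling actions from the current policy.

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reinforce_step(logits, reward_fn, lr=0.5, samples=200, baseline=0.0):
    rng = random.Random(0)  # fixed seed for a deterministic demo
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(samples):
        a = rng.choices(range(len(probs)), weights=probs)[0]
        r = reward_fn(a)
        for i in range(len(logits)):
            ind = 1.0 if i == a else 0.0
            grad[i] += (r - baseline) * (ind - probs[i]) / samples
    return [l + lr * g for l, g in zip(logits, grad)]

# Toy reward: action 2 plays the role of the "correct SQL" (reward 1).
reward = lambda a: 1.0 if a == 2 else 0.0
logits = [0.0, 0.0, 0.0]
for _ in range(50):
    logits = reinforce_step(logits, reward)
best = max(range(3), key=lambda i: softmax(logits)[i])
# best == 2: the policy learns to prefer the rewarded action
```

The appeal for text-to-SQL is that the reward can score whole generated queries (e.g., by execution), so training is no longer tied to token-level cross-entropy.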
{"title":"AI Linguistics","authors":"Guosheng Zhang","doi":"10.1016/j.nlp.2025.100137","DOIUrl":"10.1016/j.nlp.2025.100137","url":null,"abstract":"<div><div>This research investigates the development of a linguistics for artificial intelligence (AI) to demystify the “black box” of AI. At its core, the language of AI is Embedding—a novel high-dimensional, intelligent language. Embedding exhibits dual characteristics: it operates both as a semantic domain and as a mathematical point. This duality enables Embedding to maintain the discrete, symbolic nature of human languages while facilitating continuous operations in high-dimensional spaces, unlocking significant potential for advanced intelligence. A series of specialized experiments were designed to explore Embedding’s intrinsic properties, including its behavior as a semantic cloud in high-dimensional space, its degrees of freedom, and spatial transformations. Key findings include the discovery of substantial redundant dimensions in embeddings, confirmation that embeddings lack critical dimensions, and the measurement of engineering dimensions in natural language. This research also establishes the linguistic foundations and application limits of techniques such as dropout strategies, AI model distillation, and scaling laws, among others. Building on these insights, we propose innovative solutions across several fields, including AI architecture design, AI reasoning, domain-based embedding search, and the construction of a multi-intelligence spectrum for embeddings. Ultimately, we introduce a foundational methodology for embedding everything from the real world into the AI world, providing a comprehensive reference framework for the evolution of artificial general intelligence (AGI) and artificial superintelligence (ASI). 
Additionally, this research explores linguistic approaches to the co-evolution of human intelligence and artificial intelligence.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"10 ","pages":"Article 100137"},"PeriodicalIF":0.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143520847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aspect-based sentiment classification with BERT and AI feedback","authors":"Lingling Xu, Weiming Wang","doi":"10.1016/j.nlp.2025.100136","DOIUrl":"10.1016/j.nlp.2025.100136","url":null,"abstract":"<div><div>Data augmentation has been widely employed in low-resource aspect-based sentiment classification (ABSC) tasks to alleviate the issue of data sparsity and enhance the performance of the model. Unlike previous data augmentation approaches that rely on back translation, synonym replacement, or generative language models such as T5, the generative power of large language models has rarely been explored. Large language models like GPT-3.5-turbo are trained on extensive datasets and corpora to capture semantic and contextual relationships between words and sentences. To this end, we propose Masked Aspect Term Prediction (MATP), a novel data augmentation method that utilizes the world knowledge and powerful generative capacity of large language models to generate new aspect terms via word masking. By incorporating AI feedback from large language models, MATP increases the diversity and richness of aspect terms. 
Experimental results on the ABSC datasets with BERT as the backbone model show that the introduction of new augmented datasets leads to significant improvements over baseline models, validating the effectiveness of the proposed data augmentation strategy that combines AI feedback.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"10 ","pages":"Article 100136"},"PeriodicalIF":0.0,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
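The masking step behind MATP can be sketched as follows. This is a hedged illustration under stated assumptions: the example sentence and mask token are hypothetical, and the `fill_fn` stub stands in for a call to a generative model such as GPT-3.5-turbo.

```python
# Masked Aspect Term Prediction (MATP) sketch: replace the aspect term with a
# mask token, ask a generative model to fill it, and keep the filled sentence
# as an augmented training example carrying the original sentiment label.

MASK = "[MASK]"

def mask_aspect(sentence, aspect):
    """Mask the first occurrence of the aspect term in the sentence."""
    assert aspect in sentence, "aspect term must occur in the sentence"
    return sentence.replace(aspect, MASK, 1)

def augment(sentence, aspect, label, fill_fn):
    """fill_fn maps a masked sentence to a replacement aspect term."""
    masked = mask_aspect(sentence, aspect)
    new_aspect = fill_fn(masked)
    return masked.replace(MASK, new_aspect, 1), new_aspect, label

# Stubbed generator proposing a different aspect term:
sent, asp, lab = augment("The battery life is great.", "battery life",
                         "positive", lambda m: "screen")
# sent == "The screen is great."
```

Each augmented sentence reuses the source label, so the classifier sees more aspect-term variety without any extra human annotation.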
{"title":"A transformer based multi task learning approach to multimodal hate speech detection","authors":"Prashant Kapil , Asif Ekbal","doi":"10.1016/j.nlp.2025.100133","DOIUrl":"10.1016/j.nlp.2025.100133","url":null,"abstract":"<div><div>Online hate speech has become a major social issue in recent years, affecting both individuals and society as a whole. Memes are a multimodal kind of internet hate speech that is growing more common. Online memes are often entertaining and harmless. The seemingly innocent meme, on the other hand, transforms into a multimodal form of hate speech—a hateful meme—when specific types of text, graphics, or combinations of both are used. The spread of these harmful or undesirable memes has the potential to disrupt societal peace. Therefore, it is vital to limit inappropriate memes on social media. Multimodal hate speech identification is an inherently difficult and open question. It necessitates combined language understanding, visual perception, and multimodal reasoning. This work advances this line of research by building a multi-task learning-based multimodal system for detecting hateful memes, training four hateful meme datasets concurrently. This MTL framework, which consists of Contrastive Language Image Pretraining (CLIP), UNiversal Image-TExt Representation Learning (UNITER), and BERT, was trained collaboratively to transfer common knowledge while simultaneously training four meme datasets. The results show that the recommended strategy outperforms unimodal and multimodal approaches on four multilingual benchmark datasets, with considerable gains in AUC-ROC, accuracy, and F1-score. Ablation studies are undertaken to emphasise the impact of each sub-component of the MTL model. 
Confusion matrices are presented as quantitative analysis.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"11 ","pages":"Article 100133"},"PeriodicalIF":0.0,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143760774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CapsF: Capsule Fusion for Extracting psychiatric stressors for suicide from Twitter","authors":"Mohammad Ali Dadgostarnia , Ramin Mousa , Saba Hesaraki , Mahdi Hemmasian","doi":"10.1016/j.nlp.2025.100134","DOIUrl":"10.1016/j.nlp.2025.100134","url":null,"abstract":"<div><div>Along with factors such as cancer, high blood pressure, road accidents and stroke, suicide has been one of Iran’s main causes of death. Psychological stressors are among the main contributors to suicide. Identifying psychological stressors in an at-risk population can help in the early prevention of suicidal ideation and suicidal behaviours. In recent years, the widespread popularity of social media and its real-time flow of shared information have allowed for potential early intervention in both large-scale and small-scale populations. Although some automated approaches for extracting psychiatric stressors from Twitter have been presented, most of this research has targeted non-Persian languages. This study investigates techniques for detecting suicide-related psychiatric stress in Persian tweets using learning-based methods. The proposed capsule-based approach achieved a binary classification accuracy of 0.83.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"10 ","pages":"Article 100134"},"PeriodicalIF":0.0,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143454121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}