{"title":"The performance of the LSTM-based code generated by Large Language Models (LLMs) in forecasting time series data","authors":"Saroj Gopali , Sima Siami-Namini , Faranak Abri , Akbar Siami Namin","doi":"10.1016/j.nlp.2024.100120","DOIUrl":"10.1016/j.nlp.2024.100120","url":null,"abstract":"<div><div>Generative AI, and in particular Large Language Models (LLMs), have gained substantial momentum due to their wide applications in various disciplines. While the use of these game changing technologies in generating textual information has already been demonstrated in several application domains, their abilities in generating complex models and executable codes need to be explored. As an intriguing case is the goodness of the machine and deep learning models generated by these LLMs in conducting automated scientific data analysis, where a data analyst may not have enough expertise in manually coding and optimizing complex deep learning models and codes and thus may opt to leverage LLMs to generate the required models. This paper investigates and compares the performance of the mainstream LLMs, such as ChatGPT, PaLM, LLama, and Falcon, in generating deep learning models for analyzing time series data, an important and popular data type with its prevalent applications in many application domains including financial and stock market. This research conducts a set of controlled experiments where the prompts for generating deep learning-based models are controlled with respect to sensitivity levels of four criteria including (1) Clarify and Specificity, (2) Objective and Intent, (3) Contextual Information, and (4) Format and Style. While the results are relatively mix, we observe some distinct patterns. We notice that using LLMs, we are able to generate deep learning-based models with executable codes for each dataset separately whose performance are comparable with the manually crafted and optimized LSTM models for predicting the whole time series dataset. We also noticed that ChatGPT outperforms the other LLMs in generating more accurate models. Furthermore, we observed that the goodness of the generated models vary with respect to the “temperature” parameter used in configuring LLMS. The results can be beneficial for data analysts and practitioners who would like to leverage generative AIs to produce good prediction models with acceptable goodness.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100120"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143139094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lexical, sentiment and correlation analysis of sacred writings. A tale of cultural influxes and different ways to interpret reality","authors":"Alonso Felipe-Ruiz","doi":"10.1016/j.nlp.2024.100121","DOIUrl":"10.1016/j.nlp.2024.100121","url":null,"abstract":"<div><div>Natural Language Processing (NLP) has transformative potential for decoding sacred writings, bridging linguistic and temporal relationships between cultures. These texts, laden with cultural and religious significance. The study analyzes texts from 14 belief systems using lexical, sentiment and correlation assessment. The analysis revealed that sacred texts are complex due to archaic language, but tend to show similar themes, historical contexts, and emotional tones. The study highlights common terms found throughout the texts, but also revealing specific terms that are influenced by the cultural context of the belief system. It also explores the various depictions of fauna and flora, uncovering the impact of spatio-temporal contexts on the composition of sacred writings. Sentiment analysis reveals polarity variations between cultures and suggest conflicts of style during translation of the texts. Comparative analysis uncovers text clusters with high similarity and cultural influences between religions that coexisted. This work showcases NLP’s potential to enhance comprehension of sacred texts and promote cross-cultural understanding.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100121"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143139095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel application of deep learning to evaluate conversations from a mental health text support service","authors":"Daniel Cahn , Sarah Yeoh , Lakshya Soni , Ariele Noble , Mark A. Ungless , Emma Lawrance , Ovidiu Şerban","doi":"10.1016/j.nlp.2024.100119","DOIUrl":"10.1016/j.nlp.2024.100119","url":null,"abstract":"<div><div>The Shout text support service supports individuals experiencing mental health distress through anonymous text conversations. As one of the first research projects on the Shout dataset and one of the first significant attempts to apply advanced deep learning to a text messaging service, this project is a proof-of-concept demonstrating the potential of using deep learning to text messages. Several areas of interest to Shout are identifying texter characteristics, emphasising high suicide-risk participants, and understanding what can make conversations helpful to texters. Therefore, from a mental health perspective, we look at (1) characterising texter demographics strictly based on the vocabulary used throughout the conversation, (2) predicting an individual’s risk of suicide or self-harm, and (3) assessing conversation success by developing robust outcome metrics. To fulfil these aims, a series of Machine Learning models were trained using data from post-conversation surveys to predict the different levels of suicide risk, whether a conversation was helpful, and texter characteristics, such as demographic information. The results show that language models based on Deep Learning significantly improve understanding of this highly subjective dataset. We compare traditional methods and basic meta-features with the latest developments in Transformer-based architectures and showcase the advantages of mental health research.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100119"},"PeriodicalIF":0.0,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142702137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conceptual commonsense-aware attentive modeling with pre-trained masked language models for humor recognition","authors":"Yuta Sasaki , Jianwei Zhang , Yuhki Shiraishi","doi":"10.1016/j.nlp.2024.100117","DOIUrl":"10.1016/j.nlp.2024.100117","url":null,"abstract":"<div><div>Humor is an important component of daily communication and usually causes laughter that promotes mental and physical health. Understanding humor is sometimes difficult for humans and may be more difficult for AIs since it usually requires deep commonsense. In this paper, we focus on automatic humor recognition by extrapolating conceptual commonsense-aware modules to Pre-trained Masked Language Models (PMLMs) to provide external knowledge. Specifically, keywords are extracted from an input text and conceptual commonsense embeddings associated with the keywords are obtained by using a COMET decoder. By using multi-head attention the representations of the input text and the commonsense are integrated. In this way we attempt to enable the proposed model to access commonsense knowledge and thus recognize humor that is not detectable only by PMLM. Through the experiments on two datasets we explore different sizes of PMLMs and different amounts of commonsense and find some sweet spots of PMLMs’ scales for integrating commonsense to perform humor recognition well. Our proposed models improve the F1 score by up to 1.7% and 4.1% on the haHackathon and humicroedit datasets respectively. The detailed analyses show our models also improve the sensitivity to humor while retaining the predictive tendency of the corresponding PMLMs.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100117"},"PeriodicalIF":0.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142702136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised hypernymy directionality prediction using context terms","authors":"Thushara Manjari Naduvilakandy, Hyeju Jang, Mohammad Al Hasan","doi":"10.1016/j.nlp.2024.100118","DOIUrl":"10.1016/j.nlp.2024.100118","url":null,"abstract":"<div><div>Hypernymy directionality prediction is an important task in Natural Language Processing (NLP) due to its significant usages in natural language understanding and generation. Many supervised and unsupervised methods have been proposed for this task. Supervised methods require labeled examples, which are not readily available for many domains; besides, supervised models for this task that are trained on data from one domain performs poorly on data in a different domain. Therefore, unsupervised methods that are universally applicable for all domains are preferred. Existing unsupervised methods for hypernymy directionality prediction are outdated and suffer from poor performance. Specifically, they do not leverage distributional pre-trained vectors from neural language models, which have shown to be very effective in diverse NLP tasks. In this paper, we present DECIDE, a simple yet effective unsupervised method for hypernymy directionality prediction that exploits neural pre-trained vectors of words in context. By utilizing the distributional informativeness hypothesis over the context vectors, DECIDE predicts the hypernym directionality between a pair of words with a high accuracy. Extensive experiments on seven datasets demonstrate that DECIDE outperforms or achieves comparable performance to existing unsupervised and supervised methods.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100118"},"PeriodicalIF":0.0,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142660100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep learning-based natural language processing in human–agent interaction: Applications, advancements and challenges","authors":"Nafiz Ahmed , Anik Kumar Saha , Md. Abdullah Al Noman , Jamin Rahman Jim , M.F. Mridha , Md Mohsin Kabir","doi":"10.1016/j.nlp.2024.100112","DOIUrl":"10.1016/j.nlp.2024.100112","url":null,"abstract":"<div><div>Human–Agent Interaction is at the forefront of rapid development, with integrating deep learning techniques into natural language processing representing significant potential. This research addresses the complicated dynamics of Human–Agent Interaction and highlights the central role of Deep Learning in shaping the communication between humans and agents. In contrast to a narrow focus on sentiment analysis, this study encompasses various Human–Agent Interaction facets, including dialogue systems, language understanding and contextual communication. This study systematically examines applications, algorithms and models that define the current landscape of deep learning-based natural language processing in Human–Agent Interaction. It also presents common pre-processing techniques, datasets and customized evaluation metrics. Insights into the benefits and challenges of machine learning and Deep Learning algorithms in Human–Agent Interaction are provided, complemented by a comprehensive overview of the current state-of-the-art. The manuscript concludes with a comprehensive discussion of specific Human–Agent Interaction challenges and suggests thoughtful research directions. This study aims to provide a balanced understanding of models, applications, challenges and research directions in deep learning-based natural language processing in Human–Agent Interaction, focusing on recent contributions to the field.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100112"},"PeriodicalIF":0.0,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142554496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Persian readability classification using DeepWalk and tree-based ensemble methods","authors":"Mohammad Mahmoodi Varnamkhasti","doi":"10.1016/j.nlp.2024.100116","DOIUrl":"10.1016/j.nlp.2024.100116","url":null,"abstract":"<div><div>The Readability Classification (Difficulty classification) problem is the task of assessing the readability of text by categorizing it into different levels or classes based on its difficulty to understand. Applications ranging from language learning tools to website content optimization depend on readability classification. While numerous techniques have been proposed for readability classification in various languages, the topic has received little attention in the Persian (Farsi) language. Persian readability analysis poses unique challenges due to its complex morphology and flexible syntax, which necessitate a customized approach for accurate classification. In this research, we have proposed a method based on the nodes graph embedding and tree-based classification methods for sentence-level readability classification in the Persian language. The results indicate an F1-score of up to 0.961 in predicting the readability of Persian sentences.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100116"},"PeriodicalIF":0.0,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142554497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hawk: An industrial-strength multi-label document classifier","authors":"Arshad Javeed","doi":"10.1016/j.nlp.2024.100115","DOIUrl":"10.1016/j.nlp.2024.100115","url":null,"abstract":"<div><div>There are a plethora of methods for solving the classical multi-label document classification problem. However, when it comes to deployment and usage in an industry setting, most if not all the contemporary approaches fail to address some of the vital aspects or requirements of an ideal solution: i) ability to operate on variable-length texts or rambling documents, ii) catastrophic forgetting problem, and iii) ability to visualize the model’s predictions. The paper describes the significance of these problems in detail and adopts the hydranet architecture to address these problems. The proposed architecture views documents as a sequence of sentences and leverages sentence-level embeddings for input representation, turning the problem into a sequence classification task. Furthermore, two specific architectures are explored as the architectures for the heads, Bi-LSTM and transformer heads. The proposed architecture is benchmarked on some of the popular benchmarking datasets such as Web of Science - 5763, Web of Science - 11967, BBC Sports, and BBC News datasets. The experimental results reveal that the proposed model performs at least as best as previous SOTA architectures and even outperforms prior SOTA in a few cases, along with the added advantages of the practicality issues discussed. The ablation study includes comparisons of the impact of the attention mechanism and the application of weighted loss functions to train the task-specific heads in the hydranet. The claims regarding catastrophic forgetfulness are further corroborated by empirical evaluations under incremental learning scenarios. The results reveal the robustness of the proposed architecture compared to other benchmarks.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100115"},"PeriodicalIF":0.0,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Validating pretrained language models for content quality classification with semantic-preserving metamorphic relations","authors":"Pak Yuen Patrick Chan, Jacky Keung","doi":"10.1016/j.nlp.2024.100114","DOIUrl":"10.1016/j.nlp.2024.100114","url":null,"abstract":"<div><h3>Context:</h3><div>Utilizing pretrained language models (PLMs) has become common practice in maintaining the content quality of question-answering (Q&A) websites. However, evaluating the effectiveness of PLMs poses a challenge as they tend to provide local optima rather than global optima.</div></div><div><h3>Objective:</h3><div>In this study, we propose using semantic-preserving Metamorphic Relations (MRs) derived from Metamorphic Testing (MT) to address this challenge and validate PLMs.</div></div><div><h3>Methods:</h3><div>To validate four selected PLMs, we conducted an empirical experiment using a publicly available dataset comprising 60000 data points. We defined three groups of Metamorphic Relations (MRGs), consisting of thirteen semantic-preserving MRs, which were then employed to generate “Follow-up” testing datasets based on the original “Source” testing datasets. The PLMs were trained using a separate training dataset. A comparison was made between the predictions of the four trained PLMs for “Source” and “Follow-up” testing datasets in order to identify instances of violations, which corresponded to inconsistent predictions between the two datasets. If no violation was found, it indicated that the PLM was insensitive to the associate MR; thereby, the MR can be used for validation. In cases where no violation occurred across the entire MRG, non-violation regions were identified and supported simulation metamorphic testing.</div></div><div><h3>Results:</h3><div>The results of this study demonstrated that the proposed MRs could effectively serve as a validation tool for content quality classification on Stack Overflow Q&A using PLMs. One PLM did not violate the “Uppercase conversion” MRG and the “Duplication” MRG. Furthermore, the absence of violations in the MRGs allowed for the identification of non-violation regions, confirming the ability of the proposed MRs to support simulation metamorphic testing.</div></div><div><h3>Conclusion:</h3><div>The experimental findings indicate that the proposed MRs can validate PLMs effectively and support simulation metamorphic testing for PLMs. However, further investigations are required to enhance the semantic comprehension and common sense knowledge of PLMs and explore highly informative statistical patterns of PLMs, in order to improve their overall performance.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100114"},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142526279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unveiling personality traits through Bangla speech using Morlet wavelet transformation and BiG","authors":"Md. Sajeebul Islam Sk., Md. Golam Rabiul Alam","doi":"10.1016/j.nlp.2024.100113","DOIUrl":"10.1016/j.nlp.2024.100113","url":null,"abstract":"<div><div>Speech serves as a potent medium for expressing a wide array of psychologically significant attributes. While earlier research on deducing personality traits from user-generated speech predominantly focused on other languages, there is a noticeable absence of prior studies and datasets for automatically assessing user personalities from Bangla speech. In this paper, our objective is to bridge the research gap by generating speech samples, each imbued with distinct personality profiles. These personality impressions are subsequently linked to OCEAN (Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism) personality traits. To gauge accuracy, human evaluators, unaware of the speaker’s identity, assess these five personality factors. The dataset is predominantly composed of around 90% content sourced from online Bangla newspapers, with the remaining 10% originating from renowned Bangla novels. We perform feature level fusion by combining MFCCs with LPC features to set MELP and MEWLP features. We introduce MoMF feature extraction method by transforming Morlet wavelet and fusing MFCCs feature. We develop two soft voting ensemble models, DistilRo (based on DistilBERT and RoBERTa) and BiG (based on Bi-LSTM and GRU), for personality classification in speech-to-text and speech modalities, respectively. The DistilRo model has gained F-1 score 89% in speech-to-text and the BiG model has gained F-1 score 90% in speech modality.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100113"},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142526278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}