{"title":"Unveiling personality traits through Bangla speech using Morlet wavelet transformation and BiG","authors":"Md. Sajeebul Islam Sk., Md. Golam Rabiul Alam","doi":"10.1016/j.nlp.2024.100113","DOIUrl":"10.1016/j.nlp.2024.100113","url":null,"abstract":"<div><div>Speech serves as a potent medium for expressing a wide array of psychologically significant attributes. While earlier research on deducing personality traits from user-generated speech predominantly focused on other languages, there is a noticeable absence of prior studies and datasets for automatically assessing user personalities from Bangla speech. In this paper, our objective is to bridge the research gap by generating speech samples, each imbued with distinct personality profiles. These personality impressions are subsequently linked to OCEAN (Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism) personality traits. To gauge accuracy, human evaluators, unaware of the speaker’s identity, assess these five personality factors. The dataset is predominantly composed of around 90% content sourced from online Bangla newspapers, with the remaining 10% originating from renowned Bangla novels. We perform feature level fusion by combining MFCCs with LPC features to set MELP and MEWLP features. We introduce MoMF feature extraction method by transforming Morlet wavelet and fusing MFCCs feature. We develop two soft voting ensemble models, DistilRo (based on DistilBERT and RoBERTa) and BiG (based on Bi-LSTM and GRU), for personality classification in speech-to-text and speech modalities, respectively. The DistilRo model has gained F-1 score 89% in speech-to-text and the BiG model has gained F-1 score 90% in speech modality.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100113"},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142526278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"V-LTCS: Backbone exploration for Multimodal Misogynous Meme detection","authors":"Sneha Chinivar , Roopa M.S. , Arunalatha J.S. , Venugopal K.R.","doi":"10.1016/j.nlp.2024.100109","DOIUrl":"10.1016/j.nlp.2024.100109","url":null,"abstract":"<div><div>Memes have become a fundamental part of online communication and humour, reflecting and shaping the culture of today’s digital age. The amplified Meme culture is inadvertently endorsing and propagating casual Misogyny. This study proposes V-LTCS (Vision- Language Transformer Combination Search), a framework that encompasses all possible combinations of the most fitting Text (<em>i.e.</em> BERT, ALBERT, and XLM-R) and Vision (<em>i.e.</em> Swin, ConvNeXt, and ViT) Transformer Models to determine the backbone architecture for identifying Memes that contains misogynistic contents. All feasible Vision-Language Transformer Model combinations obtained from the recognized optimal Text and Vision Transformer Models are evaluated on two (smaller and larger) datasets using varied standard metrics (<em>viz.</em> Accuracy, Precision, Recall, and F1-Score). The BERT-ViT combinational Transformer Model demonstrated its efficiency on both datasets, validating its ability to serve as a backbone architecture for all subsequent efforts to recognize Multimodal Misogynous Memes.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100109"},"PeriodicalIF":0.0,"publicationDate":"2024-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142420060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recent advancements in automatic disordered speech recognition: A survey paper","authors":"Nada Gohider, Otman A. Basir","doi":"10.1016/j.nlp.2024.100110","DOIUrl":"10.1016/j.nlp.2024.100110","url":null,"abstract":"<div><div>Automatic Speech Recognition (ASR) technology has recently witnessed a paradigm shift with respect to performance accuracy. Nevertheless, impaired speech remains a significant challenge, evidenced by the inadequate accuracy of existing ASR solutions. This lacking is reported in various research reports. While this lacking has motivated new directions in <em>Automatic Disordered Speech Recognition</em> (ADSR), the gap between ASR performance accuracy and that of ADSR remains significant. In this paper, we report a consolidated account of research work conducted to date to address this gap, highlighting the root causes of such performance discrepancy and discussing prominent research directions in this area. The paper raises some fundamental issues and challenges that ADSR research faces today. Firstly, we discuss the adequacy of impaired speech representation in existing datasets, in terms of the diversity of speech impairments, speech continuity, speech style, vocabulary, age group, and the environments of the data collection process. We argue that disordered speech is poorly represented in the existing datasets; thus, it is expected that several fundamental components needed for training ADSR models are absent. Most of the open-access databases of impaired speech focus on adult dysarthric speakers, ignoring a wide spectrum of speech disorders and age groups. Furthermore, the paper reviews prominent research directions adopted by the ADSR research community in its effort to advance speech recognition technology for impaired speakers. We categorize this research effort into directions such as personalized models, model adaptation, data augmentation, and multi-modal learning. Although these research directions have advanced the performance of ADSR models, we believe there is still potential for further advancement since current efforts, in essence, make the false assumption that there is a limited distribution shift between the source and target data. Finally, we stress the need to investigate performance measures other than Word Error Rate (WER)- measures that can reliably encode the contribution of erroneous output tokens in the final uttered message.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100110"},"PeriodicalIF":0.0,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142419914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recurrent neural network based multiclass cyber bullying classification","authors":"Silvia Sifath , Tania Islam , Md Erfan , Samrat Kumar Dey , MD. Minhaj Ul Islam , Md Samsuddoha , Tazizur Rahman","doi":"10.1016/j.nlp.2024.100111","DOIUrl":"10.1016/j.nlp.2024.100111","url":null,"abstract":"<div><div>Cyberbullying is one of the crimes that arise rapidly through the daily use of technology by different types of people and, most notably, by sharing one’s opinions or feelings on social media in a harmful manner. It has several negative effects on society such as depression, anxiety, suicide, and so on. At the same time, it reduces productivity, causes psychological damage that can last a lifetime and increases violence among people. To prevent cyberbullying or take necessary steps against the harasser, the first step is to detect cyberbullying. Several works exist to detect and classify cyberbullying but a few works have been carried out to classify cyberbullying in the Bengali Language. As the number of people is increased day by day who communicate on social media using the Bengali language, it is crucial to address this situation and improve both accuracy and robustness to detect and classify cyberbullying. For this purpose, we propose an NLP-based model using machine learning and deep learning algorithms to detect and classify Bengali comments on social media. This research specifies cyberbullying comments using a multiclass classification strategy. Kaggle and Melany are used to collect the dataset to train and evaluate our model. The dataset contains 56308 Bengali comments, consisting of four distinct categories. The categories are not bully, trolls, sexual, and threats. We use different machine learning algorithms such as Support Vector Machine, Logistic Regression, Random Forest, XGBOOST, Multinomial Naïve Bayes, Deep learning algorithm, Recurrent Neural Network (RNN), and two fusion models. Along with that effective preprocessing steps are implemented to get a suitable dataset. In this study, the Recurrent Neural Network gives the best accuracy, which is 86%. The accuracy of our model is good enough to help social media users and encourage them to practice morality.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100111"},"PeriodicalIF":0.0,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142420061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of deep neural network architectures for authorship obfuscation of Portuguese texts","authors":"Antônio Marcos Rodrigues Franco, Ítalo Cunha, Leonardo B. Oliveira","doi":"10.1016/j.nlp.2024.100107","DOIUrl":"10.1016/j.nlp.2024.100107","url":null,"abstract":"<div><div>Preserving authorship anonymity is paramount to protect activists, freedom of expression, and critical journalism. Although there are several mechanisms to provide anonymity on the Internet, one can still identify anonymous authors through their writing style. With the advances in neural network and natural language processing research, the success of a classifier when identifying the author of a text is growing. On the other hand, new approaches that use recurrent neural networks for automatic generation of obfuscated texts have also arisen to fight anonymity adversaries. In this work, we evaluate two approaches that use neural networks to generate obfuscated texts. The first approach uses Generative Adversarial Networks to train an encoder–decoder to transform sentences from an input style into a target style. The second one trains an auto encoder with Gradient Reversal Layer to learn invariant representations. In our experiments, we compared the efficiency of both techniques when removing the stylistic attributes of a text and preserving its original semantics. Our evaluation on real texts clarifies each technique’s trade-offs for Portuguese texts and provides guidance on practical deployment.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100107"},"PeriodicalIF":0.0,"publicationDate":"2024-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142323946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cyberbullying detection of resource constrained language from social media using transformer-based approach","authors":"Syed Sihab-Us-Sakib , Md. Rashadur Rahman , Md. Shafiul Alam Forhad , Md. Atiq Aziz","doi":"10.1016/j.nlp.2024.100104","DOIUrl":"10.1016/j.nlp.2024.100104","url":null,"abstract":"<div><div>The rise of the internet and social media has facilitated diverse interactions among individuals, but it has also led to an increase in cyberbullying—a phenomenon with detrimental effects on mental health, including the potential to induce suicidal thoughts. To combat this issue, we have developed the Cyberbullying Bengali Dataset (CBD), a novel resource containing 2751 manually labeled texts categorized into five classes, including various forms of cyberbullying and non-bullying instances. In our study on cyberbullying detection, we conducted an extensive evaluation of various machine learning and deep learning models. Specifically, we examined Support Vector Machine (SVM), Multinomial Naive Bayes (MNB), and Random Forest (RF) among the traditional machine learning models. For deep learning models, we explored Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM). We have also experimented with state-of-the-art transformer architectures, including m-BERT, BanglaBERT, and XLM-RoBERTa. After rigorous experimentation, XLM-RoBERTa emerged as the most effective model, achieving a significant F1-score of 0.83 and an accuracy of 82.61%, outperforming all other models. Our work provides insights into effective cyberbullying detection on platforms like Facebook, YouTube, and Instagram.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100104"},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142323945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Initial exploration into sarcasm and irony through machine translation","authors":"Zheng Lin Chia , Michal Ptaszynski , Marzena Karpinska , Juuso Eronen , Fumito Masui","doi":"10.1016/j.nlp.2024.100106","DOIUrl":"10.1016/j.nlp.2024.100106","url":null,"abstract":"<div><p>In this paper, we investigate sarcasm and irony as seen through a novel perspective of machine translation. We employ various techniques for translation, comparing both manually and automatically translated datasets of irony and sarcasm. We first clarify the definitions of irony and sarcasm and present an exhaustive field review of studies on irony both from purely linguistic as well as computational linguistic perspectives. We also propose a novel evaluation metric for the purpose of evaluating translations of figurative language, with a focus on machine-translated irony and sarcasm. The constructed English and Chinese parallel dataset includes polarized content from tweets as well as forum posts, categorized by irony types. The preferred translation model, mBART-50, is identified through a thorough experimental process. Optimal translation settings and the best-finetuned model for irony are explored, with the most effective model being finetuned on both ironic and non-ironic data. We also experimented which types of irony are best suitable for training in this specific task — short microblogging messages or longer forum posts. Moreover, we compare the capabilities of a well fine-tuned mBART to a prompt-based method using the recently popular ChatGPT model, with the conclusion that the former still outperforms the latter, although ChatGPT without any training can be considered as a “good enough” ad hoc solution in the case of a lack of data for training. Finally, we verify if the translated data – either manually, or with an MT model – can be used as training data in a task of irony detection. We believe that the presented research can be expanded into languages other than the presented here Chinese and English, which together with the ability to detect various categories of irony, could contribute to deepening the understanding of figurative language, especially irony and sarcasm.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100106"},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000542/pdfft?md5=1fda68d5c29cfb5c586ec5b4c9c004ae&pid=1-s2.0-S2949719124000542-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142239626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personality and emotion—A comprehensive analysis using contextual text embeddings","authors":"Md. Ali Akber , Tahira Ferdousi , Rasel Ahmed , Risha Asfara , Raqeebir Rab , Umme Zakia","doi":"10.1016/j.nlp.2024.100105","DOIUrl":"10.1016/j.nlp.2024.100105","url":null,"abstract":"<div><p>Personality and emotions have always been closely intertwined since humans evolved, adapting to these two forms. Emotions are indicative of a person’s personality, and vice versa. This paper aims to investigate the complex relationship between these two fundamental aspects of human behavior using the concepts of machine learning and statistical analysis. The objective is to automate the process of determining the relationship between personality traits of the MBTI (Myers-Briggs Type Indicator) and Ekman’s emotions based on the context of user-written social media posts using contextual embedding. A robust mechanism is employed, involving two main phases to figure out emotions from the social media posts. The first phase involves determining the cosine similarity scores between each MBTI personality trait and predefined emotions. The second phase introduces a cross-dataset learning approach where several machine learning models are trained on a dataset labeled with emotions to learn patterns of emotions found in the text. After training, these models utilize the patterns they learned to predict emotions in a targeted dataset. With an overall accuracy of 85.23%, the Support Vector Machine (SVM) is chosen as the most effective and high-performing model for emotion prediction tasks. We employed a vetting mechanism combining two approaches to improve accuracy, reliability, and trustworthiness for the final emotion prediction. Finally, using statistical quantification, this paper finds patterns that link each MBTI personality trait with Ekman emotions. It reveals that extroverts (E), sensing (S), and feeling (F) personality types are more likely to share joyful and surprising emotional posts, while individuals with extroversion (E), intuition (N), thinking (T), and perception (P) traits tend to express negative emotions such as anger and disgust. Conversely, introverts (I), intuitive (N), thinking (T), and judging (J) personalities are more inclined to share posts reflecting fear and sadness. This comprehensive study provides valuable insights on how individuals with different personality types typically express emotions on social media.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100105"},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000530/pdfft?md5=7f4d308abb64fc3a802f27722eaef0b5&pid=1-s2.0-S2949719124000530-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142274263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Auto-DSM: Using a Large Language Model to generate a Design Structure Matrix","authors":"Edwin C.Y. Koh","doi":"10.1016/j.nlp.2024.100103","DOIUrl":"10.1016/j.nlp.2024.100103","url":null,"abstract":"<div><p>The Design Structure Matrix (DSM) is an established method used in dependency modelling, especially in the design of complex engineering systems. The generation of DSM is traditionally carried out through manual means and can involve interviewing experts to elicit critical system elements and the relationships between them. Such manual approaches can be time-consuming and costly. This paper presents a workflow that uses a Large Language Model (LLM) to support the generation of DSM and improve productivity. A prototype of the workflow was developed in this work and applied on a diesel engine DSM published previously. It was found that the prototype could reproduce 357 out of 462 DSM entries published (i.e. 77.3%), suggesting that the work can aid DSM generation. A no-code version of the prototype is made available online to support future research.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100103"},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000517/pdfft?md5=0a3d0db24ef947b4a2f3aa1b8fd3ddb1&pid=1-s2.0-S2949719124000517-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142239627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Job description parsing with explainable transformer based ensemble models to extract the technical and non-technical skills","authors":"Abbas Akkasi","doi":"10.1016/j.nlp.2024.100102","DOIUrl":"10.1016/j.nlp.2024.100102","url":null,"abstract":"<div><p>The rapid digitization of the economy is transforming the job market, creating new roles and reshaping existing ones. As skill requirements evolve, identifying essential competencies becomes increasingly critical. This paper introduces a novel ensemble model that combines traditional and transformer-based neural networks to extract both technical and non-technical skills from job descriptions. A substantial dataset of job descriptions from reputable platforms was meticulously annotated for 22 IT roles. The model demonstrated superior performance in extracting both non-technical (67% F-score) and technical skills (72% F-score) compared to conventional CRF and hybrid deep learning models. Specifically, the proposed model outperformed these baselines by an average margin of 10% and 6%, respectively, for non-technical skills, and 29% and 6.8% for technical skills. A 5 × 2cv paired t-test confirmed the statistical significance of these improvements. In addition, to enhance model interpretability, Local Interpretable Model-Agnostic Explanations (LIME) were employed in the experiments.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"9 ","pages":"Article 100102"},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000505/pdfft?md5=a597d9732dfab2f3ac80c6409cc94264&pid=1-s2.0-S2949719124000505-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142239061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}