Natural Language Processing Journal — Latest Articles

Whose morality do they speak? Unraveling cultural bias in multilingual language models
Natural Language Processing Journal, Pub Date: 2025-06-30, DOI: 10.1016/j.nlp.2025.100172
Meltem Aksoy

Abstract: Large language models (LLMs) have become integral tools in diverse domains, yet their moral reasoning capabilities across cultural and linguistic contexts remain underexplored. This study investigates whether multilingual LLMs, such as GPT-3.5-Turbo, GPT-4o-mini, Llama 3.1, and Mistral NeMo, reflect culturally specific moral values or impose dominant moral norms, particularly those rooted in English. Using the updated Moral Foundations Questionnaire (MFQ-2) in eight languages (Arabic, Farsi, English, Spanish, Japanese, Chinese, French, and Russian), the study analyzes the models' adherence to six core moral foundations: care, equality, proportionality, loyalty, authority, and purity. The results reveal significant cultural and linguistic variability, challenging the assumption of universal moral consistency in LLMs. Although some models demonstrate adaptability to diverse contexts, others exhibit biases influenced by the composition of the training data. These findings underscore the need for culturally inclusive model development to improve fairness and trust in multilingual AI systems.

Citations: 0
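The MFQ-2 scoring that the study above applies across languages reduces, at its core, to averaging Likert responses per moral foundation. A minimal sketch with invented items and responses (the real MFQ-2 item-to-foundation mapping is not reproduced here):

```python
# Sketch: aggregate MFQ-2-style Likert responses (1-5) into per-foundation
# scores, as one might do when comparing an LLM's answers across languages.
# The items and responses below are invented placeholders, not real MFQ-2 data.

def foundation_scores(responses, item_to_foundation):
    """Average the Likert responses belonging to each moral foundation."""
    totals, counts = {}, {}
    for item, score in responses.items():
        f = item_to_foundation[item]
        totals[f] = totals.get(f, 0) + score
        counts[f] = counts.get(f, 0) + 1
    return {f: totals[f] / counts[f] for f in totals}

item_to_foundation = {
    "q1": "care", "q2": "care",
    "q3": "authority", "q4": "authority",
    "q5": "purity",
}
llm_responses = {"q1": 5, "q2": 4, "q3": 2, "q4": 3, "q5": 1}

scores = foundation_scores(llm_responses, item_to_foundation)
print(scores)  # {'care': 4.5, 'authority': 2.5, 'purity': 1.0}
```

Comparing such per-foundation profiles, elicited in different languages, is one way to make the cultural variability the paper reports concrete.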
A hybrid BERT-BiRNN framework for mental health prediction using textual data
Natural Language Processing Journal, Pub Date: 2025-06-26, DOI: 10.1016/j.nlp.2025.100165
Muhammad Nouman, Sui Yang Khoo, M.A. Parvez Mahmud, Abbas Z. Kouzani

Abstract: Effective mental health prediction requires training artificial intelligence algorithms on relevant datasets obtained from individuals suffering from mental illnesses. This study employs a labelled text dataset derived from the Lyf Support app. To harness the potential of this dataset for the development of a mental health prediction tool, we propose a novel technique that utilises the bidirectional encoder representations from transformers (BERT) model to identify mental health-related text chats. This technique enables effective and accurate identification of textual content relevant to mental health, facilitating the creation of an advanced prediction model, and is capable of extracting word embeddings that retain the semantic and contextual meaning of words. Bidirectional long short-term memory (BiLSTM) and bidirectional gated recurrent unit (BiGRU) models are then employed as sequence-processing classifiers to analyse and detect signs of mental illness from text chats. Extensive experiments are conducted, and the results are compared against state-of-the-art models, showing that our method outperforms the others with 92.4% accuracy. Overall, this study establishes a foundation for future research in mental health prediction approaches. The methodologies and findings presented herein pave the way for further advancements and innovations in this field of study.

Citations: 0
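The bidirectional sequence modeling behind BiLSTM/BiGRU classifiers like the one above can be illustrated with a toy one-unit Elman RNN run in both directions over the token features; the weights and scalar "embeddings" here are invented placeholders, not the paper's trained model:

```python
import math

# Minimal sketch of bidirectional sequence encoding: run a simple tanh RNN
# over per-token features left-to-right and right-to-left, then concatenate
# the final hidden states as input to a downstream classifier. Real BiLSTM/
# BiGRU cells add gating; this toy keeps only the recurrence idea.

def rnn_pass(xs, w, u, reverse=False):
    """One-unit Elman RNN: h_t = tanh(w * x_t + u * h_{t-1})."""
    h = 0.0
    seq = reversed(xs) if reverse else xs
    states = []
    for x in seq:
        h = math.tanh(w * x + u * h)
        states.append(h)
    return states

embeddings = [0.5, -1.0, 0.25]          # stand-in for per-token BERT features
fwd = rnn_pass(embeddings, w=1.0, u=0.5)
bwd = rnn_pass(embeddings, w=1.0, u=0.5, reverse=True)
feature = (fwd[-1], bwd[-1])            # concatenated summary for a classifier
print(feature)
```

The forward state summarizes the left context and the backward state the right context, which is why bidirectional recurrent heads are a common fit on top of contextual embeddings.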
A survey and evaluation of text-to-speech systems for the Tamil language
Natural Language Processing Journal, Pub Date: 2025-06-25, DOI: 10.1016/j.nlp.2025.100171
Ahrane Mahaganapathy, Kengatharaiyer Sarveswaran

Abstract: This survey provides a comprehensive review of existing Tamil Text-to-Speech (TTS) synthesis systems, synthesis approaches, and evaluation approaches, and highlights state-of-the-art methods and the challenges of handling linguistic nuances. Voice-based interfaces are becoming part of everyday life, so it is important to have TTS systems that improve the human experience. Tamil, with its rich linguistic features and diglossic nature, presents significant challenges for speech synthesis. In addition to the survey, this work proposes a perceptual evaluation framework that adds expressiveness, low listening fatigue, and overall quality to the traditional dimensions of intelligibility and naturalness, to better evaluate the human experience. The study also uses the Comparative Mean Opinion Score (CMOS) for subjective evaluation instead of the Mean Opinion Score. A dataset for the evaluation was carefully prepared, and six widely used Tamil TTS systems were evaluated: objectively using Word Error Rate, and subjectively using the proposed framework with the support of 30 evaluators. The reliability of the subjective evaluation is assessed using Krippendorff's Alpha. The results indicate that existing systems have significant room for improvement in all perceptual dimensions. The study underscores the need for evaluation datasets and approaches that cater to the subjective perceptual dimensions of speech synthesis, and lays a foundation for future research and development in Tamil and similar TTS systems.

Citations: 0
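The objective metric used above, Word Error Rate, is the word-level edit distance between a reference transcript and a system output, normalized by reference length. A minimal sketch (English placeholder strings stand in for Tamil transcripts):

```python
# Word Error Rate: edit distance (substitutions, insertions, deletions)
# between reference and hypothesis word sequences, divided by the number
# of reference words.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i ref words into the first j hyp words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[-1][-1] / len(ref)

print(wer("the cat sat on the mat", "the cat sat mat"))  # 2 deletions / 6 words
```

In a TTS evaluation, the hypothesis would be an ASR transcript of the synthesized audio, so WER measures intelligibility rather than naturalness, which is exactly why the paper pairs it with a subjective framework.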
Improving multilabel text emotion detection with emotion interrelation anchors
Natural Language Processing Journal, Pub Date: 2025-06-25, DOI: 10.1016/j.nlp.2025.100170
Polydoros Giannouris, Vasileios Mygdalis, Ioannis Pitas

Abstract: Emotion detection studies the problem of automatically identifying emotions expressed in text. Since multiple emotions may co-occur in a single text excerpt, state-of-the-art approaches often cast this multi-label classification task as multiple independent binary classification tasks, each specialized for one emotion class. The main disadvantage of such approaches is that, by design, each binary classifier overlooks typical emotion interrelationships, such as co-occurrence (e.g., anger and fear) or mutual exclusiveness (e.g., sadness and joy). This paper proposes a simple and lightweight approach that re-introduces emotion interrelations into each binary classification task, so that each binary classifier can account for the presence of other emotions without directly inferring them. This is achieved by incorporating the proposed emotion anchors (i.e., features of representative emotional phrases) into the model of each binary classifier. More specifically, the model is trained to incorporate other emotions in its representation by learning the parameters of an attention mechanism. In experiments on multiple datasets, our approach improves emotion classification performance in both supervised and few-shot domain adaptation settings, outperforming standard binary models in accuracy and macro-averaged F1-score. The approach is generic and can be applied to other interrelated multi-label binary classification tasks.

Citations: 0
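The anchor mechanism described above can be pictured as a binary classifier attending over fixed anchor vectors for the other emotions and appending the attended summary to its own features. A toy sketch with invented 3-dimensional vectors (the paper's anchors are learned phrase features and the attention is trained; this shows only the attention idea):

```python
import math

# Sketch of the "emotion anchor" idea: score each anchor against the text
# representation, softmax the scores, and take the weighted sum of anchors
# as extra context for the binary classifier.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, anchors):
    """Softmax-weighted sum of anchor vectors, scored by dot product."""
    scores = [dot(query, a) for a in anchors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [sum(w * a[i] for w, a in zip(weights, anchors))
            for i in range(len(anchors[0]))]

anchors = [[1.0, 0.0, 0.0],   # e.g. an "anger" anchor (invented)
           [0.0, 1.0, 0.0]]   # e.g. a "fear" anchor (invented)
text_vec = [0.9, 0.1, 0.0]    # representation of the input text
context = attend(text_vec, anchors)
augmented = text_vec + context  # classifier input: own features + anchor context
print(augmented)
```

The classifier for, say, "joy" thus sees how strongly the text resembles the other emotions' anchors without ever running their classifiers, which is what keeps the approach lightweight.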
Homophobia and transphobia span identification in low-resource languages
Natural Language Processing Journal, Pub Date: 2025-06-24, DOI: 10.1016/j.nlp.2025.100169
Prasanna Kumar Kumaresan, Devendra Deepak Kayande, Ruba Priyadharshini, Paul Buitelaar, Bharathi Raja Chakravarthi

Abstract: Online platforms have become prevalent because they promote free speech and group discussion, but they also host hate speech, which can harm the psychological well-being of vulnerable people. This is especially true for members of the LGBTQ+ community, who are often the targets of homophobia and transphobia online. Our study makes three main contributions: (1) we developed a new dataset with span-level annotations for homophobia and transphobia in Tamil, English, and Marathi; (2) we employed advanced language models using BERT-based architectures, Conditional Random Field (CRF), and Bidirectional Long Short-Term Memory (BiLSTM) layers to enhance span-level detection of harmful content; and (3) we conducted benchmarking to evaluate the effectiveness of monolingual and multilingual models in detecting subtle forms of hate speech. The annotated dataset, collected from real-world social media (YouTube) content, provides diverse language contexts and enhances the representation of low-resource languages. The span-based detection approach enables models to detect subtle linguistic nuances, leading to more precise content moderation that accounts for cultural differences. The experimental results show that our models achieve effective span detection, which provides valuable information for creating inclusive moderation tools. Our research contributes to the development of AI systems that aim to reduce the burden on moderators and improve the quality of online experiences for vulnerable LGBTQ+ users.

Citations: 0
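Span-level annotation of the kind described above is commonly realized with BIO tagging. A small sketch of decoding token-level BIO predictions into (start, end, label) spans, with an invented tag sequence (the tag scheme here is assumed for illustration, not taken from the paper):

```python
# Convert token-level BIO tags into half-open (start, end, label) spans.
# "B-X" begins a span of label X, "I-X" continues it, "O" is outside.

def bio_to_spans(tags):
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:
                spans.append((start, i, label))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and label == tag[2:]:
            continue
        else:  # "O", or an "I-" tag that does not match the open span
            if start is not None:
                spans.append((start, i, label))
            start, label = None, None
    if start is not None:
        spans.append((start, len(tags), label))
    return spans

tags = ["O", "B-HOMOPHOBIA", "I-HOMOPHOBIA", "O", "B-TRANSPHOBIA"]
print(bio_to_spans(tags))  # [(1, 3, 'HOMOPHOBIA'), (4, 5, 'TRANSPHOBIA')]
```

Decoding to spans rather than whole-comment labels is what lets a moderation tool highlight the offending phrase instead of flagging an entire post.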
The evolution of language models: From N-Grams to LLMs, and beyond
Natural Language Processing Journal, Pub Date: 2025-06-24, DOI: 10.1016/j.nlp.2025.100168
Mohammad Ghaseminejad Raeini

Abstract: In the last couple of decades, language models and artificial intelligence technologies have advanced significantly. Along with computer vision and image processing models, large language models (LLMs) are expected to have a major impact on how AI technologies evolve. It is therefore important to study how language models have advanced since their inception and, more importantly, how they will grow in the future. In this article, we provide an overview of the evolution of language models, starting with early statistical and rule-based models and continuing through to today's transformer-based multimodal models (MM-LLMs). We discuss the shortcomings of current language models and the various aspects of these models that need to be improved. We also highlight the latest research trends in NLP and pinpoint important aspects of language models and AI technologies that need further attention. This overview provides valuable insights into the progression of language models and can be motivational and helpful for advancing state-of-the-art language models.

Citations: 0
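The N-gram models where this evolution began can be stated in a few lines: a bigram model estimates P(w_i | w_{i-1}) from corpus counts. A minimal sketch with a toy corpus:

```python
from collections import Counter

# Maximum-likelihood bigram language model: P(word | prev) is the count of
# the bigram (prev, word) divided by the count of prev as a bigram start.
# The corpus is a toy example.

corpus = "the cat sat on the mat the cat ran".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])  # how often each word starts a bigram

def p(word, prev):
    """Maximum-likelihood bigram probability P(word | prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

print(p("cat", "the"))  # "the" is followed by "cat" in 2 of its 3 occurrences
```

Everything that followed (smoothing, neural LMs, transformers) can be read as progressively better ways of estimating this same conditional distribution over longer contexts.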
On English-Chinese Neural Machine Translation leveraging the Transformer model
Natural Language Processing Journal, Pub Date: 2025-06-23, DOI: 10.1016/j.nlp.2025.100166
Subrota Kumar Mondal, Yijun Chen, Yuning Cheng, Hong-Ning Dai, Syed B. Alam, H.M. Dipu Kabir

Abstract: In today's era of globalization, cross-cultural communication has become increasingly frequent, and photo translation (photo, image, or scene-text translation) technology has become an important tool. With it, people can recognize and translate text in other languages without manual input or translation, which has practical value in fields such as tourism, business, education, and research. To this end, this paper aims to achieve high-accuracy English-to-Chinese photo translation, which can be divided into three stages: text detection, text recognition, and text translation (i.e., machine translation). We observe that text detection and recognition face challenges with occluded text, handwritten text, scene text, text with complex layout, distorted text, and many others; however, in this paper we limit our analysis to the translation phase. For the detection and recognition phases, we use current state-of-the-art methods: the DBNet model (Liao et al., 2020) for detection and the ABINet model (Fang et al., 2021) for recognition. For translation, we use the Transformer model with modifications aimed at improving translation accuracy, mainly in two aspects: data preprocessing and the optimizer. In data preprocessing, we use the BPE (Byte Pair Encoding) algorithm instead of basic word-centered tokenization. BPE divides words into smaller subwords, which mitigates the rare-word problem to some extent and provides better word vectors for language model training. For the optimizer, we use the Lion optimizer proposed by Google instead of the widely used Adam optimizer, which reduces the loss more quickly for small batch sizes: with batch size 256, it achieves the lowest test loss of 0.392842 (−1.072171) and the highest BLEU-4 score of 0.381281 (+0.24063). This helps reduce training resource consumption and improves the sustainability of deep learning.

Citations: 0
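The BPE preprocessing step mentioned above learns subwords by repeatedly merging the most frequent adjacent symbol pair. A toy sketch over a tiny symbol-level vocabulary (a real tokenizer learns thousands of merges from a large parallel corpus):

```python
from collections import Counter

# Byte Pair Encoding, sketched: find the most frequent adjacent symbol pair
# across the (word -> frequency) vocabulary, merge it everywhere, repeat.
# Words are stored as space-separated symbols; frequencies are toy numbers.

def most_frequent_pair(words):
    pairs = Counter()
    for word, freq in words.items():
        syms = word.split()
        for a, b in zip(syms, syms[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge(words, pair):
    a, b = pair
    return {word.replace(f"{a} {b}", f"{a}{b}"): freq
            for word, freq in words.items()}

words = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6}
for _ in range(3):
    words = merge(words, most_frequent_pair(words))
print(words)
```

Because frequent character sequences become single tokens while rare words fall back to smaller pieces, the model's vocabulary stays closed yet no word is ever out-of-vocabulary, which is the rare-word benefit the abstract refers to.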
Can "consciousness" be observed from large language model (LLM) internal states? Dissecting LLM representations obtained from Theory of Mind test with Integrated Information Theory and Span Representation analysis
Natural Language Processing Journal, Pub Date: 2025-06-19, DOI: 10.1016/j.nlp.2025.100163
Jingkai Li

Abstract: Integrated Information Theory (IIT) provides a quantitative framework for explaining the phenomenon of consciousness, positing that conscious systems comprise elements integrated through causal properties. We apply IIT 3.0 and 4.0, the latest iterations of this framework, to sequences of Large Language Model (LLM) representations, analyzing data derived from existing Theory of Mind (ToM) test results. Our study systematically investigates whether differences in ToM test performance, when present in the LLM representations, can be revealed by IIT estimates: Φ^max (IIT 3.0), Φ (IIT 4.0), Conceptual Information (IIT 3.0), and Φ-structure (IIT 4.0). Furthermore, we compare these metrics with Span Representations that are independent of any estimate of consciousness, in order to differentiate between potential "consciousness" phenomena and inherent separations within the LLM representational space. We conduct comprehensive experiments examining variations across LLM transformer layers and linguistic spans from stimuli. Our results suggest that sequences of contemporary Transformer-based LLM representations lack statistically significant indicators of observed "consciousness" phenomena, but exhibit intriguing patterns under spatio-permutational analyses.

Citations: 0
Enhancing grammatical documentation for endangered languages with graph-based meaning representation and Loopy Belief Propagation
Natural Language Processing Journal, Pub Date: 2025-06-18, DOI: 10.1016/j.nlp.2025.100164
Sebastien Christian

Abstract: DIG4EL (Digital Inferential Grammars for Endangered Languages) is a method, embodied in software, designed to assist linguists and teachers in producing grammatical descriptions of endangered languages. DIG4EL integrates linguistic knowledge from extensive databases such as WALS and Grambank with automated observations of controlled data collected using Conversational Questionnaires. Linguistic knowledge and automated observations provide priors to a Bayesian network of grammatical parameters, in which parameters are interconnected by directional conditional probability matrices derived from statistics on world languages. Inference of unknown parameter values is performed using Loopy Belief Propagation, achieving an average accuracy of 76% and a median accuracy of 85% in an experimental grammatical domain: determining the values of eight parameters related to canonical word order across 116 languages from diverse language families. DIG4EL produces outputs as structured files for computational use, Microsoft Word files, or plain-language grammatical descriptions generated by a Large Language Model; these descriptions rely solely on vetted data and observed examples, with prompts crafted explicitly to prevent external information or hallucinations. By leveraging probabilistic modeling and rich yet quickly assembled linguistic data, DIG4EL provides a powerful, accessible tool for creating grammatical descriptions and language-teaching materials with minimal intervention from linguists. It significantly reduces the time and expertise required by traditional documentation workflows, helping ensure that endangered languages are better documented and taught.

Citations: 0
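The inference DIG4EL performs at scale can be illustrated by a single sum-product message pass: the belief over one grammatical parameter is obtained by marginalizing a related parameter's prior through a conditional probability matrix. On a tree this pass is exact; Loopy Belief Propagation iterates such passes on graphs with cycles. All probabilities below are invented placeholders, not WALS or Grambank statistics:

```python
# Minimal sum-product sketch: update the belief over a child parameter
# (e.g. adposition order) from the prior over a parent parameter (e.g.
# canonical word order) via P(child | parent). Numbers are invented.

word_order_prior = {"SOV": 0.6, "SVO": 0.4}
cond = {  # P(adposition order | word order), invented placeholder values
    "SOV": {"postposition": 0.9, "preposition": 0.1},
    "SVO": {"postposition": 0.3, "preposition": 0.7},
}

def message(prior, cond):
    """Marginalize the parent out: P(child) = sum_p P(child | p) * P(p)."""
    out = {}
    for parent, p_parent in prior.items():
        for child, p_child in cond[parent].items():
            out[child] = out.get(child, 0.0) + p_child * p_parent
    return out

belief = message(word_order_prior, cond)
print(belief)  # postposition ≈ 0.66, preposition ≈ 0.34
```

In the full system, each parameter node combines such incoming messages from several neighbors and the process repeats until the beliefs stabilize, which is how the eight word-order parameters in the experiment are inferred jointly.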
Next-generation image captioning: A survey of methodologies and emerging challenges from transformers to Multimodal Large Language Models
Natural Language Processing Journal, Pub Date: 2025-06-10, DOI: 10.1016/j.nlp.2025.100159
Huda Diab Abdulgalil, Otman A. Basir

Abstract: The widespread availability of visual data on the Internet has fueled significant interest in image-to-text captioning systems. Automated image captioning remains a challenging multimodal analytics task, integrating advances in both Computer Vision (CV) and Natural Language Processing (NLP) to understand image content and generate semantically meaningful textual descriptions. Modern deep learning approaches have supplanted traditional approaches to image captioning, leading to more efficient and sophisticated models, and the development of attention mechanisms and transformer-based architectures has further enhanced the modeling of both language and visual data. Despite these gains, challenges such as long-tailed object recognition, bias in training data, and shortcomings in evaluation metrics constrain the capabilities of current models. An important breakthrough has come with the recent emergence of Multimodal Large Language Models (MLLMs): by incorporating textual and visual data, MLLMs offer improved captioning flexibility, generative capabilities, and reasoning, but they also introduce new challenges, including faithfulness, grounding, and computational cost. Although relatively few studies have comprehensively surveyed these developments, this paper provides a thorough analysis of Transformer-based captioning approaches, investigates the shift to MLLMs, and discusses the associated challenges and opportunities. We also present a performance comparison of the latest models on the MS-COCO benchmark and conclude with perspectives on potential future research directions.

Citations: 0