Natural Language Processing Journal: Latest Articles

Cyberbullying detection of resource constrained language from social media using transformer-based approach
Natural Language Processing Journal. Pub Date: 2024-09-16. DOI: 10.1016/j.nlp.2024.100104
Syed Sihab-Us-Sakib, Md. Rashadur Rahman, Md. Shafiul Alam Forhad, Md. Atiq Aziz

Abstract: The rise of the internet and social media has facilitated diverse interactions among individuals, but it has also led to an increase in cyberbullying, a phenomenon with detrimental effects on mental health, including the potential to induce suicidal thoughts. To combat this issue, we have developed the Cyberbullying Bengali Dataset (CBD), a novel resource containing 2751 manually labeled texts categorized into five classes, including various forms of cyberbullying and non-bullying instances. In our study on cyberbullying detection, we conducted an extensive evaluation of machine learning and deep learning models. Among traditional machine learning models, we examined Support Vector Machine (SVM), Multinomial Naive Bayes (MNB), and Random Forest (RF). Among deep learning models, we explored Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM). We also experimented with state-of-the-art transformer architectures, including m-BERT, BanglaBERT, and XLM-RoBERTa. After rigorous experimentation, XLM-RoBERTa emerged as the most effective model, achieving an F1-score of 0.83 and an accuracy of 82.61%, outperforming all other models. Our work provides insights into effective cyberbullying detection on platforms like Facebook, YouTube, and Instagram.

Citations: 0
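The reported numbers (F1 of 0.83, accuracy of 82.61%) correspond to standard multi-class evaluation. A minimal sketch of how such metrics are computed with scikit-learn; the class names and the tiny label arrays below are illustrative placeholders, not data from the CBD dataset:

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical gold labels and model predictions over five classes
# (class names here are invented stand-ins for the CBD categories).
y_true = ["threat", "sexual", "troll", "religious", "not_bully", "troll"]
y_pred = ["threat", "troll",  "troll", "religious", "not_bully", "troll"]

acc = accuracy_score(y_true, y_pred)          # fraction of exact matches
macro_f1 = f1_score(y_true, y_pred, average="macro")  # unweighted mean of per-class F1
print(f"accuracy={acc:.4f}, macro-F1={macro_f1:.4f}")
```

Macro-averaging treats all five classes equally, which matters for imbalanced cyberbullying data where non-bullying posts typically dominate.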
Initial exploration into sarcasm and irony through machine translation
Natural Language Processing Journal. Pub Date: 2024-09-12. DOI: 10.1016/j.nlp.2024.100106
Zheng Lin Chia, Michal Ptaszynski, Marzena Karpinska, Juuso Eronen, Fumito Masui

Abstract: In this paper, we investigate sarcasm and irony through the novel lens of machine translation. We employ various translation techniques, comparing manually and automatically translated datasets of irony and sarcasm. We first clarify the definitions of irony and sarcasm and present an exhaustive review of studies on irony from both purely linguistic and computational linguistic perspectives. We also propose a novel evaluation metric for translations of figurative language, with a focus on machine-translated irony and sarcasm. The constructed English-Chinese parallel dataset includes polarized content from tweets as well as forum posts, categorized by irony type. The preferred translation model, mBART-50, is identified through a thorough experimental process. Optimal translation settings and the best-finetuned model for irony are explored, with the most effective model being finetuned on both ironic and non-ironic data. We also examined which type of irony data is better suited for training in this specific task: short microblogging messages or longer forum posts. Moreover, we compare a well-finetuned mBART to a prompt-based method using the recently popular ChatGPT model, concluding that the former still outperforms the latter, although ChatGPT without any training can be considered a "good enough" ad hoc solution when training data is lacking. Finally, we verify whether the translated data, produced either manually or with an MT model, can be used as training data for irony detection. We believe that the presented research can be extended to languages beyond the Chinese and English covered here, which, together with the ability to detect various categories of irony, could contribute to a deeper understanding of figurative language, especially irony and sarcasm.

Citations: 0
Personality and emotion—A comprehensive analysis using contextual text embeddings
Natural Language Processing Journal. Pub Date: 2024-09-12. DOI: 10.1016/j.nlp.2024.100105
Md. Ali Akber, Tahira Ferdousi, Rasel Ahmed, Risha Asfara, Raqeebir Rab, Umme Zakia

Abstract: Personality and emotion have been closely intertwined throughout human evolution, and each is indicative of the other. This paper investigates the complex relationship between these two fundamental aspects of human behavior using machine learning and statistical analysis. The objective is to automate the process of determining the relationship between MBTI (Myers-Briggs Type Indicator) personality traits and Ekman's emotions based on the context of user-written social media posts, using contextual embeddings. A robust mechanism involving two main phases is employed to identify emotions in the posts. The first phase determines cosine similarity scores between each MBTI personality trait and predefined emotions. The second phase introduces a cross-dataset learning approach in which several machine learning models are trained on a dataset labeled with emotions to learn emotional patterns in text; after training, these models predict emotions in a target dataset. With an overall accuracy of 85.23%, the Support Vector Machine (SVM) is chosen as the most effective model for the emotion prediction task. We employed a vetting mechanism combining the two approaches to improve the accuracy, reliability, and trustworthiness of the final emotion predictions. Finally, using statistical quantification, the paper identifies patterns linking each MBTI personality trait with Ekman emotions. It reveals that extroverted (E), sensing (S), and feeling (F) personality types are more likely to share joyful and surprising emotional posts, while individuals with extroversion (E), intuition (N), thinking (T), and perception (P) traits tend to express negative emotions such as anger and disgust. Conversely, introverted (I), intuitive (N), thinking (T), and judging (J) personalities are more inclined to share posts reflecting fear and sadness. This comprehensive study provides valuable insights into how individuals with different personality types typically express emotions on social media.

Citations: 0
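The first phase described in the abstract scores each trait-emotion pair by cosine similarity between their contextual embeddings. A minimal numpy sketch of that step, assuming the embeddings have already been produced by some encoder; the 4-dimensional vectors below are toy values, not real embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" for one MBTI trait and two Ekman emotions
# (illustrative numbers only; real embeddings would be hundreds of dims).
trait_E = np.array([0.9, 0.1, 0.3, 0.2])
emotions = {
    "joy":   np.array([0.8, 0.2, 0.4, 0.1]),
    "anger": np.array([0.1, 0.9, 0.0, 0.3]),
}

scores = {name: cosine_similarity(trait_E, vec) for name, vec in emotions.items()}
best = max(scores, key=scores.get)  # emotion most similar to the trait
print(scores, best)
```

With contextual embeddings from the same encoder, higher cosine similarity indicates the trait text and emotion text occupy closer regions of the embedding space.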
Auto-DSM: Using a Large Language Model to generate a Design Structure Matrix
Natural Language Processing Journal. Pub Date: 2024-09-06. DOI: 10.1016/j.nlp.2024.100103
Edwin C.Y. Koh

Abstract: The Design Structure Matrix (DSM) is an established method for dependency modelling, especially in the design of complex engineering systems. DSMs are traditionally generated manually, often by interviewing experts to elicit critical system elements and the relationships between them. Such manual approaches can be time-consuming and costly. This paper presents a workflow that uses a Large Language Model (LLM) to support DSM generation and improve productivity. A prototype of the workflow was developed and applied to a previously published diesel engine DSM. The prototype reproduced 357 of the 462 published DSM entries (77.3%), suggesting that the approach can aid DSM generation. A no-code version of the prototype is available online to support future research.

Citations: 0
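A DSM can be represented as a square binary matrix over system elements, where entry (i, j) marks a dependency of element i on element j. A small sketch of how a reproduction rate like the 77.3% figure above could be computed by comparing an LLM-generated DSM against a reference one; the element names and both matrices are invented for illustration, not taken from the paper's diesel engine case:

```python
import numpy as np

elements = ["piston", "crankshaft", "injector", "camshaft"]  # hypothetical

# Reference (expert-built) DSM and a hypothetical LLM-generated one:
# dsm[i, j] == 1 means element i depends on element j.
reference = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
])
generated = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 0],   # one reference dependency missed here
    [1, 0, 0, 0],
    [0, 1, 0, 0],
])

# Fraction of reference dependencies that the generated DSM reproduces.
mask = reference == 1
reproduced = int((generated[mask] == 1).sum())
total = int(mask.sum())
print(f"reproduced {reproduced}/{total} = {reproduced / total:.1%}")
```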
Job description parsing with explainable transformer based ensemble models to extract the technical and non-technical skills
Natural Language Processing Journal. Pub Date: 2024-09-03. DOI: 10.1016/j.nlp.2024.100102
Abbas Akkasi

Abstract: The rapid digitization of the economy is transforming the job market, creating new roles and reshaping existing ones. As skill requirements evolve, identifying essential competencies becomes increasingly critical. This paper introduces a novel ensemble model that combines traditional and transformer-based neural networks to extract both technical and non-technical skills from job descriptions. A substantial dataset of job descriptions from reputable platforms was meticulously annotated for 22 IT roles. The model outperformed conventional CRF and hybrid deep learning baselines in extracting both non-technical skills (67% F-score) and technical skills (72% F-score), by average margins of 10% and 6%, respectively, for non-technical skills, and 29% and 6.8% for technical skills. A 5x2cv paired t-test confirmed the statistical significance of these improvements. In addition, Local Interpretable Model-Agnostic Explanations (LIME) were employed in the experiments to enhance model interpretability.

Citations: 0
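The 5x2cv paired t-test mentioned above (Dietterich, 1998) runs five replications of 2-fold cross-validation and tests whether the per-fold score differences between two models are significant. A sketch of the statistic itself; the score differences below are made-up numbers, not results from the paper:

```python
import math

# Hypothetical per-fold accuracy differences (model A minus model B)
# over five replications of 2-fold cross-validation.
diffs = [(0.06, 0.04), (0.05, 0.07), (0.04, 0.06), (0.07, 0.05), (0.06, 0.06)]

# Variance estimate per replication: s_i^2 = (p1 - mean)^2 + (p2 - mean)^2.
variances = []
for p1, p2 in diffs:
    mean = (p1 + p2) / 2
    variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)

# Dietterich's statistic: first-fold difference over the pooled variance.
t = diffs[0][0] / math.sqrt(sum(variances) / 5)
print(f"t = {t:.3f}  (compare against a t-distribution with 5 degrees of freedom)")
```

Large |t| relative to the t(5) critical value lets one reject the hypothesis that the two models perform equally.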
Towards effective teaching assistants: From intent-based chatbots to LLM-powered teaching assistants
Natural Language Processing Journal. Pub Date: 2024-09-01. DOI: 10.1016/j.nlp.2024.100101
Bashaer Alsafari, Eric Atwell, Aisha Walker, Martin Callaghan

Abstract: As chatbot technology undergoes a transformative phase in the era of artificial intelligence (AI), the integration of advanced AI models has become a focal point for reshaping conversational agents in the education sector. This paper explores the evolution of educational chatbot development, focusing on building a teaching assistant for Data Mining and Text Analytics courses at the University of Leeds. The primary objective is to compare traditional intent-based chatbot approaches with the more advanced retrieval-augmented generation (RAG) method, aiming to improve the efficiency and adaptability of teaching assistants in higher education. The study begins with the development of an Amazon Alexa teaching skill, assessing the efficacy of traditional chatbot development in higher education. To enrich the chatbot's knowledge base, the research then employs an automated question-answer generation (QAG) approach, using the QG Lumos Learning tool to extract contextually grounded question-answer datasets from course materials. Subsequently, a RAG-based system is proposed, leveraging LangChain with the OpenAI GPT-3.5 Turbo model. The findings highlight the limitations of intent-based approaches, emphasising the need for more adaptive solutions. The proposed RAG-based teaching assistant handles diverse queries significantly more effectively, representing a paradigm shift in educational chatbot capabilities. These findings provide an in-depth understanding of the development phase, illustrating the impact on chatbot performance by contrasting traditional methods with large language model-based approaches. The study contributes valuable perspectives on enhancing the adaptability and effectiveness of AI-powered educational tools, along with essential considerations for future developments in the field.

Citations: 0
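The core of a RAG pipeline like the one above is retrieve-then-prompt: find the course-material chunk most relevant to the student's query and prepend it to the LLM prompt. A minimal sketch of that step using TF-IDF retrieval in place of LangChain's vector stores; the course snippets and the query are invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical chunks of course material forming the knowledge base.
chunks = [
    "Association rule mining finds frequent itemsets with the Apriori algorithm.",
    "TF-IDF weighs terms by frequency in a document and rarity in the corpus.",
    "K-means clustering partitions data points into k groups by distance.",
]

query = "How does TF-IDF weight terms?"

# Embed chunks and query into the same TF-IDF space.
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(chunks)
query_vec = vectorizer.transform([query])

# Retrieve the most similar chunk and build the augmented prompt for the LLM.
scores = cosine_similarity(query_vec, doc_matrix)[0]
context = chunks[scores.argmax()]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

A production system would swap the TF-IDF index for dense embeddings and send `prompt` to the chat completion API; the grounding step shown here is what keeps answers tied to the course materials.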
A combined AraBERT and Voting Ensemble classifier model for Arabic sentiment analysis
Natural Language Processing Journal. Pub Date: 2024-09-01. DOI: 10.1016/j.nlp.2024.100100
Dhaou Ghoul, Jérémy Patrix, Gaël Lejeune, Jérôme Verny

Abstract: For sentiment analysis of short texts (e.g. movie reviews, tweets), one approach is to build machine learning models that determine their tone (positive, negative, neutral). However, such natural language processing (NLP) studies are scarce when high-quality, large-scale training data is lacking for specific languages such as Arabic. In this paper, we present three machine learning models designed to classify the sentiment of Arabic tweets, developed for a Kaggle competition. We present a Voting Ensemble classifier that takes advantage of both character-level and word-level features. We also propose an AraBERT (Arabic Bidirectional Encoder Representations from Transformers) model with preprocessing using the Farasa segmenter. Finally, we combine these first two approaches into a third: a Voting Ensemble classifier using AraBERT embeddings. Performance measures show improvement over previous efforts for all models; the third model performs strongest, with a 73.98% F-score. The work presented here could be useful for future studies and for new Arabic sentiment analysis online services or competitions.

Citations: 0
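A Voting Ensemble over character-level and word-level features, as described above, can be sketched with scikit-learn. The tiny Latin-script examples below stand in for actual Arabic tweets, and the choice of logistic regression as base learner is an assumption for illustration:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for labelled tweets (positive / negative).
texts = ["great movie loved it", "awful film hated it",
         "really loved this", "really hated this"]
labels = ["pos", "neg", "pos", "neg"]

# One classifier over word-level features, one over character n-grams
# (character features are robust to spelling variation, useful for dialects).
word_clf = make_pipeline(TfidfVectorizer(analyzer="word"), LogisticRegression())
char_clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                         LogisticRegression())

ensemble = VotingClassifier(
    estimators=[("word", word_clf), ("char", char_clf)],
    voting="soft",  # average the predicted class probabilities
)
ensemble.fit(texts, labels)
pred = ensemble.predict(["loved this movie"])[0]
print(pred)
```

Soft voting averages probabilities, so a confident word-level signal can outvote an ambiguous character-level one and vice versa.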
A novel prompting method for few-shot NER via LLMs
Natural Language Processing Journal. Pub Date: 2024-08-24. DOI: 10.1016/j.nlp.2024.100099
Qi Cheng, Liqiong Chen, Zhixing Hu, Juan Tang, Qiang Xu, Binbin Ning

Abstract: Large Language Models (LLMs) have made significant strides across natural language processing tasks, and researchers use prompting to guide LLMs through specific tasks under few-shot conditions. However, prevalent prompting methods mainly target generative tasks, and applying existing prompts can yield poor performance on Named Entity Recognition (NER) tasks. To tackle this challenge, we propose a novel prompting method for few-shot NER. Building on existing prompting methods, we devise standardized prompts tailored to the use of LLMs for NER. Specifically, we structure the prompts into three components: task definition, few-shot demonstration, and output format. The task definition guides LLMs in performing the NER task, the few-shot demonstrations help LLMs understand the task objective through concrete output examples, and the output format constrains LLMs' output to prevent the generation of unnecessary results. The content of each component is tailored specifically to NER. Moreover, for the few-shot demonstrations within the prompts, we propose a selection strategy that uses feedback from LLMs' outputs to identify more suitable demonstrations. Additionally, to enhance entity recognition performance, we enrich the prompts by summarizing error examples from the LLMs' output process and integrating them as additional prompts.

Citations: 0
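The three-part prompt structure described above (task definition, few-shot demonstration, output format) amounts to disciplined string assembly. A sketch of what such a prompt builder might look like; the example sentence, entity types, and JSON output convention are invented for illustration, not the paper's exact templates:

```python
def build_ner_prompt(sentence: str, entity_types: list[str],
                     demonstrations: list[tuple[str, str]]) -> str:
    """Assemble a NER prompt from the three components: task definition,
    few-shot demonstrations, and output format."""
    task_definition = (
        "Extract all named entities of the types "
        f"{', '.join(entity_types)} from the input sentence."
    )
    demo_block = "\n".join(
        f"Input: {text}\nOutput: {entities}" for text, entities in demonstrations
    )
    # Constrain the output so the LLM does not generate extra commentary.
    output_format = 'Respond only with JSON mapping each entity to its type.'
    return (f"{task_definition}\n\n{demo_block}\n\n{output_format}\n\n"
            f"Input: {sentence}\nOutput:")

prompt = build_ner_prompt(
    "Marie Curie worked in Paris.",
    ["PERSON", "LOCATION"],
    demonstrations=[("Alan Turing was born in London.",
                     '{"Alan Turing": "PERSON", "London": "LOCATION"}')],
)
print(prompt)
```

The feedback-driven selection strategy from the abstract would then score candidate demonstrations by the quality of the LLM outputs they elicit and keep the best ones in `demonstrations`.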
HarmonyNet: Navigating hate speech detection
Natural Language Processing Journal. Pub Date: 2024-08-20. DOI: 10.1016/j.nlp.2024.100098
Shaina Raza, Veronica Chatrath

Abstract: In the digital era, social media platforms have become central to communication across many domains, but the vast spread of unregulated content often leads to prevalent hate speech and toxicity. Existing detection methods struggle with context sensitivity, accommodating diverse dialects, and adapting to varied communication styles. To tackle these challenges, we introduce an ensemble classifier that leverages the strengths of language models and traditional deep neural network architectures for more effective hate speech detection on social media. Our evaluations show that this hybrid approach outperforms the individual models and is robust against adversarial attacks. Future work will enhance the model's architecture to further boost its efficiency and extend its capability to recognize hate speech across an even wider range of languages and dialects.

Citations: 0
Kurdish end-to-end speech synthesis using deep neural networks
Natural Language Processing Journal. Pub Date: 2024-08-13. DOI: 10.1016/j.nlp.2024.100096
Sabat Salih Muhamad, Hadi Veisi, Aso Mahmudi, Abdulhady Abas Abdullah, Farhad Rahimi

Abstract: This article introduces an end-to-end text-to-speech (TTS) system for the low-resourced language of Central Kurdish (CK, also known as Sorani) and tackles the challenges of limited data availability. We compiled a dataset suitable for end-to-end TTS comprising 21 hours of CK female speech paired with corresponding texts. To identify the best-performing system, we employed Tacotron2, an end-to-end deep neural network for speech synthesis, in three training experiments: training Tacotron2 starting from a pre-trained English system, and training two models from scratch on the full dataset and on an intonationally balanced subset, respectively. We evaluated these models using the Mean Opinion Score (MOS), a subjective evaluation metric. Our findings show that the model trained from scratch on the full CK dataset surpasses both the model trained on the intonationally balanced dataset and the model initialized from the pre-trained English system in naturalness and intelligibility, achieving a MOS of 4.78 out of 5.

Citations: 0
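The MOS evaluation mentioned above simply averages listeners' ratings (typically 1-5) of synthesized speech. A trivial sketch of the computation; the ratings below are hypothetical, not the paper's listening-test data:

```python
import statistics

# Hypothetical naturalness ratings (1-5 scale) from eight listeners
# for one set of synthesized utterances.
ratings = [5, 5, 4, 5, 5, 4, 5, 5]

mos = statistics.mean(ratings)            # the Mean Opinion Score
spread = statistics.stdev(ratings)        # how much listeners disagreed
print(f"MOS = {mos:.2f} / 5 (stdev {spread:.2f})")
```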