{"title":"Entity and relationship extraction based on span contribution evaluation and focusing framework","authors":"Qibin Li, Nianmin Yao, Nai Zhou, Jian Zhao","doi":"10.1016/j.csl.2024.101744","DOIUrl":"10.1016/j.csl.2024.101744","url":null,"abstract":"<div><div>Entity and relationship extraction involves identifying named entities and extracting relationships between them. Existing research focuses on enhancing span representations, yet overlooks the impact of non-target spans (i.e., the span is a non-entity or the span pair has no relationship) on model training. In this work, we propose a span contribution evaluation and focusing framework named CEFF, which assigns a contribution score to each non-target span in a sentence through pre-training; this score reflects the contribution of the span to model performance improvement. To a certain extent, this method considers the impact of different spans on model training, making the training more targeted. Additionally, leveraging the contribution scores of non-target spans, we introduce a simplified variant of the model, termed CEFF<span><math><msub><mrow></mrow><mrow><mi>s</mi></mrow></msub></math></span>, which achieves comparable performance to models trained with all spans while utilizing fewer spans. This approach reduces training costs and improves training efficiency. 
Through extensive validation, we demonstrate that our contribution scores accurately reflect span contributions and achieve state-of-the-art results on five benchmark datasets.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142572765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Taking relations as known conditions: A tagging based method for relational triple extraction","authors":"Guanqing Kong, Qi Lei","doi":"10.1016/j.csl.2024.101734","DOIUrl":"10.1016/j.csl.2024.101734","url":null,"abstract":"<div><div>Relational triple extraction refers to extracting entities and relations from natural texts, which is a crucial task in the construction of knowledge graphs. Recently, tagging based methods have received increasing attention because of their simple and effective structural form. Among them, two-step extraction methods are prone to category imbalance. To address this issue, we propose a novel two-step extraction method, which first extracts subjects, generates a fixed-size embedding for each relation, and then regards these relations as known conditions to extract the objects directly with the identified subjects. In order to eliminate the influence of irrelevant relations when predicting objects, we use a relation-specific attention mechanism and a gate unit to select appropriate relations. In addition, most current models do not account for two-way interaction between tasks, so we design a feature interactive network to achieve bidirectional interaction between the subject and object extraction tasks and enhance their connection. 
Experimental results on NYT, WebNLG, NYT<span><math><msup><mrow></mrow><mrow><mo>⋆</mo></mrow></msup></math></span> and WebNLG<span><math><msup><mrow></mrow><mrow><mo>⋆</mo></mrow></msup></math></span> datasets show that our model is competitive among joint extraction models.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142572764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What’s so complex about conversational speech? A comparison of HMM-based and transformer-based ASR architectures","authors":"Julian Linke, Bernhard C. Geiger, Gernot Kubin, Barbara Schuppler","doi":"10.1016/j.csl.2024.101738","DOIUrl":"10.1016/j.csl.2024.101738","url":null,"abstract":"<div><div>High-performing speech recognition is important for more fluent human–machine interaction (e.g., dialogue systems). Modern ASR architectures achieve human-level recognition performance on read speech but still perform sub-par on conversational speech, which arguably is or, at least, will be instrumental for human–machine interaction. Understanding the factors behind this shortcoming of modern ASR systems may suggest directions for improving them. In this work, we compare the performances of HMM- vs. transformer-based ASR architectures on a corpus of Austrian German conversational speech. Specifically, we investigate how strongly utterance length, prosody, pronunciation, and utterance complexity as measured by perplexity affect different ASR architectures. Among other findings, we observe that single-word utterances – which are characteristic of conversational speech and constitute roughly 30% of the corpus – are recognized more accurately if their F0 contour is flat; for longer utterances, the effects of the F0 contour tend to be weaker. 
We further find that zero-shot systems require longer utterance lengths and are less robust to pronunciation variation, which indicates that pronunciation lexicons and fine-tuning on the respective corpus are essential ingredients for the successful recognition of conversational speech.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142572767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining replay and LoRA for continual learning in natural language understanding","authors":"Zeinab Borhanifard, Heshaam Faili, Yadollah Yaghoobzadeh","doi":"10.1016/j.csl.2024.101737","DOIUrl":"10.1016/j.csl.2024.101737","url":null,"abstract":"<div><div>Large language models have significantly improved dialogue systems through enhanced capabilities in understanding queries and generating responses. Despite these enhancements, task-oriented dialogue systems – which power many intelligent assistants – face challenges when adapting to new domains and applications. This challenge arises from a phenomenon known as catastrophic forgetting, where models forget previously acquired knowledge when learning new tasks. This paper addresses this issue through continual learning techniques to preserve previously learned knowledge while seamlessly integrating new tasks and domains. We propose <strong>E</strong>xperience <strong>R</strong>eplay <strong>I</strong>nformative-<strong>Lo</strong>w <strong>R</strong>ank <strong>A</strong>daptation or ERI-LoRA, a hybrid continual learning method for natural language understanding in dialogue systems that effectively combines replay-based methods with parameter-efficient techniques. Our experiments on intent detection and slot-filling tasks demonstrate that ERI-LoRA significantly outperforms competitive baselines in continual learning. 
The results of our catastrophic forgetting experiments show that ERI-LoRA maintains robust memory stability in the model, demonstrating its effectiveness in mitigating forgetting.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142553128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing pipeline task-oriented dialogue systems using post-processing networks","authors":"Atsumoto Ohashi, Ryuichiro Higashinaka","doi":"10.1016/j.csl.2024.101742","DOIUrl":"10.1016/j.csl.2024.101742","url":null,"abstract":"<div><div>Many studies have proposed methods for optimizing the dialogue performance of an entire pipeline task-oriented dialogue system by jointly training modules in the system using reinforcement learning. However, these methods are limited in that they can only be applied to modules implemented using trainable neural-based methods. To solve this problem, we propose a method for optimizing the dialogue performance of a pipeline system that consists of modules implemented with arbitrary methods for dialogue. With our method, neural-based components called post-processing networks (PPNs) are installed inside such a system to post-process the output of each module. All PPNs are updated to improve the overall dialogue performance of the system using reinforcement learning, without requiring that each module be differentiable. Through dialogue simulations and human evaluations on two well-studied task-oriented dialogue datasets, CamRest676 and MultiWOZ, we show that our method can improve the dialogue performance of pipeline systems consisting of various modules. 
In addition, a comprehensive analysis of the results of the MultiWOZ experiments reveals the patterns of post-processing by PPNs that contribute to the overall dialogue performance of the system.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142572768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A knowledge-Aware NLP-Driven conversational model to detect deceptive contents on social media posts","authors":"Deepak Kumar Jain, S. Neelakandan, Ankit Vidyarthi, Anand Mishra, Ahmed Alkhayyat","doi":"10.1016/j.csl.2024.101743","DOIUrl":"10.1016/j.csl.2024.101743","url":null,"abstract":"<div><div>The widespread dissemination of deceptive content on social media presents a substantial challenge to preserving authenticity and trust. The epidemic growth of false news stems from the greater use of social media to transmit news, rather than conventional mass media such as newspapers, magazines, radio, and television. Humans' inability to differentiate between true and false facts exposes fake news as a threat to logical truth, democracy, journalism, and government credibility. Using a combination of advanced methodologies, Deep Learning (DL) methods, and Natural Language Processing (NLP) approaches, researchers and technology developers attempt to build robust systems proficient in discerning the subtle nuances that betray deceptive intent. By analysing the conversational linguistic patterns of misleading data, these techniques aim to improve the resilience of social platforms against the spread of deceptive content, ultimately contributing to a more informed and trustworthy online platform. This paper proposes a Knowledge-Aware NLP-Driven AlBiruni Earth Radius Optimization Algorithm with Deep Learning Tool for Enhanced Deceptive Content Detection (BER-DLEDCD) on Social Media. The purpose of the BER-DLEDCD system is to identify and classify deceptive content using NLP with an optimal DL model. In the BER-DLEDCD technique, data pre-processing first converts the input data into a compatible format. Furthermore, the BER-DLEDCD approach applies a hybrid DL technique combining a Convolutional Neural Network with Long Short-Term Memory (CNN-LSTM) for deceptive content detection. 
Moreover, the BER approach is deployed to optimize the hyperparameter selection of the CNN-LSTM technique, leading to enhanced detection performance. The simulation outcomes of the BER-DLEDCD system were examined on a benchmark database. The extensive results show that the BER-DLEDCD system achieves excellent performance, with 94 % accuracy, 94.83 % precision, and a 94.30 % F-score, compared with other recent approaches.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142572766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ECDG-DST: A dialogue state tracking model based on efficient context and domain guidance for smart dialogue systems","authors":"Meng Zhu, Xiaolong Xu","doi":"10.1016/j.csl.2024.101741","DOIUrl":"10.1016/j.csl.2024.101741","url":null,"abstract":"<div><div>Dialogue state tracking (DST) is an important component of smart dialogue systems, with the goal of predicting the current dialogue state at each conversation turn. However, most previous works must store a large amount of data, much of it noisy, when the conversation takes many turns. They also overlooked the effect of the domain in the task of dialogue state tracking. In this paper, we propose ECDG-DST <sup>1</sup> (A dialogue state tracking model based on efficient context and domain guidance) for smart dialogue systems, which preserves key information while retaining less dialogue history, and masks the domain effectively in dialogue state tracking. Our model utilizes the efficient conversation context, the previous conversation state and the relationship between domains and slots to narrow the range of slots to be updated, and also constrains value generation to reduce the production of irrelevant words. The ECDG-DST model consists of four main components: an encoder, a domain guide, an operation predictor, and a value generator. We conducted experiments on three popular task-oriented dialogue datasets, Wizard-of-Oz2.0, MultiWOZ2.0, and MultiWOZ2.1, and the empirical results demonstrate that ECDG-DST improved joint goal accuracy by 0.45 % on Wizard-of-Oz2.0, 2.44 % on MultiWOZ2.0, and 2.05 % on MultiWOZ2.1 compared to the baselines. 
In addition, we analyzed the scope of the efficient context through experiments and validated the effectiveness of our proposed domain guide mechanism through an ablation study.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142534368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese Named Entity Recognition based on adaptive lexical weights","authors":"Yaping Xu, Mengtao Ying, Kunyu Fang, Ruixing Ming","doi":"10.1016/j.csl.2024.101735","DOIUrl":"10.1016/j.csl.2024.101735","url":null,"abstract":"<div><div>Currently, many researchers use weights to merge self-matched words obtained through dictionary matching in order to enhance the performance of Named Entity Recognition (NER). However, these studies overlook the relationship between words and sentences when calculating lexical weights, resulting in fused word information that often does not align with the intended meaning of the sentence. To address this issue and enhance prediction performance, we propose an adaptive approach for determining lexical weights. Given a sentence, we utilize an enhanced global attention mechanism to compute the correlation between self-matching words and sentences, thereby focusing attention on crucial words while disregarding unreliable portions. Experimental results demonstrate that our proposed model outperforms existing state-of-the-art methods for Chinese NER on the MSRA, Weibo, and Resume datasets.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142533843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring and implementing lexical alignment: A systematic literature review","authors":"Sumit Srivastava, Suzanna D. Wentzel, Alejandro Catala, Mariët Theune","doi":"10.1016/j.csl.2024.101731","DOIUrl":"10.1016/j.csl.2024.101731","url":null,"abstract":"<div><div>Lexical Alignment is a phenomenon often found in human–human conversations, where the interlocutors converge during a conversation to use the same terms and phrases for the same underlying concepts. Linguistic alignment is a mechanism used by humans for better communication between interlocutors; it operates at various levels of linguistic knowledge and features, one of which is lexical. The existing literature suggests that alignment has a significant role in communication between humans, and is also beneficial in human–agent communication. Various methods have been proposed in the past to measure lexical alignment in human–human conversations, and also to implement it in conversational agents. In this research, we analyze the existing methods for measuring lexical alignment and dissect methods for implementing it in a conversational agent to personalize human–agent interactions. We propose a new set of criteria that such methods should meet and discuss possible improvements to existing methods.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142553127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid approach to Natural Language Inference for the SICK dataset","authors":"Rodrigo Souza, Marcos Lopes","doi":"10.1016/j.csl.2024.101736","DOIUrl":"10.1016/j.csl.2024.101736","url":null,"abstract":"<div><div>Natural Language Inference (NLI) can be described as the task of answering whether a short text called <em>Hypothesis</em> (H) can be inferred from another text called <em>Premise</em> (P) (Poliak, 2020; Dagan et al., 2013). Affirmative answers are considered semantic entailments, and negative ones are either contradictions or semantically “neutral” statements. In the last three decades, many Natural Language Processing (NLP) methods have been put to use for solving this task. As with almost every other NLP task, Deep Learning (DL) techniques in general (and Transformer neural networks in particular) have been achieving the best results in this task in recent years, progressively improving their results compared to classical, symbolic Knowledge Representation models in solving NLI.</div><div>Nevertheless, however successful DL models are in measurable results like accuracy and F-score, their outcomes are far from being explicable, and this is an undesirable feature especially in a task such as NLI, which is meant to deal with language understanding together with the rational reasoning inherent to entailment and contradiction judgements. It is therefore tempting to evaluate how more explainable models would perform in NLI and to compare their performance with that of DL models later on.</div><div>This paper puts forth a pipeline that we call IsoLex. It provides explainable, transparent NLP models for NLI. 
It has been tested on a partial version of the SICK corpus (Marelli et al., 2014) called SICK-CE, containing only the contradiction and the entailment pairs (4245 in total), thus leaving aside the neutral pairs, as an attempt to concentrate on unambiguous semantic relationships, which arguably favor the intelligibility of the results.</div><div>The pipeline consists of three commonly used NLP models applied in sequence: first, an Isolation Forest module is used to filter out highly dissimilar Premise-Hypothesis pairs; second, a WordNet-based Lexical Relations module is employed to check whether the Premise and the Hypothesis textual contents are related to each other in terms of synonymy, hyperonymy, or holonymy; finally, similarities between Premise and Hypothesis texts are evaluated by a simple cosine similarity function based on Word2Vec embeddings.</div><div>IsoLex has achieved 92% accuracy and 94% F-1 on SICK-CE. This is close to SOTA models for this kind of task, such as RoBERTa with a 98% accuracy and 99% F-1 on the same dataset.</div><div>The small performance gap between IsoLex and SOTA DL models is largely compensated by intelligibility at every step of the proposed pipeline. At any time it is possible to evaluate the role of similarity, lexical relatedness, and so forth in the overall process of inference.</div></div>","PeriodicalId":50638,"journal":{"name":"Computer Speech and Language","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142441799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}