Computer Speech and Language: Latest Articles

Knowledge-aware audio-grounded generative slot filling for limited annotated data
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-08-05 · DOI: 10.1016/j.csl.2024.101707
Guangzhi Sun, Chao Zhang, Ivan Vulić, Paweł Budzianowski, Philip C. Woodland

Manually annotating fine-grained slot-value labels for task-oriented dialogue (ToD) systems is an expensive and time-consuming endeavour. This motivates research into slot-filling methods that operate with limited amounts of labelled data. Moreover, the majority of current work on ToD is based solely on text as the input modality, neglecting the additional challenges of imperfect automatic speech recognition (ASR) when working with spoken language. In this work, we propose a Knowledge-Aware Audio-Grounded generative slot-filling framework, termed KA2G, that focuses on few-shot and zero-shot slot filling for ToD with speech input. KA2G achieves robust and data-efficient slot filling for speech-based ToD by (1) framing it as a text generation task, (2) grounding text generation additionally in the audio modality, and (3) conditioning on available external knowledge (e.g. a predefined list of possible slot values). We show that combining both modalities within the KA2G framework improves robustness against ASR errors. Further, the knowledge-aware slot-value generator in KA2G, implemented via a pointer-generator mechanism, particularly benefits few-shot and zero-shot learning. Experiments, conducted on the standard speech-based single-turn SLURP dataset and a multi-turn dataset extracted from a commercial ToD system, show strong and consistent gains over prior work, especially in few-shot and zero-shot setups.

Volume 89, Article 101707 · Open Access · Citations: 0
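The pointer-generator mechanism mentioned in the abstract can be sketched in miniature: the final output distribution mixes a vocabulary distribution with a copy distribution derived from attention over source tokens, so out-of-vocabulary slot values (e.g. from an ASR hypothesis or a knowledge list) remain generable. The names and numbers below are illustrative, not KA2G's actual implementation.

```python
def pointer_generator_mix(p_vocab, attention, source_tokens, p_gen):
    """Toy pointer-generator step: blend a vocabulary distribution with
    a copy distribution induced by attention over source tokens."""
    # Copy distribution: attention mass accumulated per source token.
    p_copy = {}
    for tok, a in zip(source_tokens, attention):
        p_copy[tok] = p_copy.get(tok, 0.0) + a
    # Final distribution: p_gen * P_vocab(w) + (1 - p_gen) * P_copy(w)
    words = set(p_vocab) | set(p_copy)
    return {w: p_gen * p_vocab.get(w, 0.0) + (1 - p_gen) * p_copy.get(w, 0.0)
            for w in words}

# Example: "jazz" is absent from the decoder vocabulary but can still be
# produced by copying it from the (hypothetical) source utterance.
p_vocab = {"music": 0.6, "genre": 0.4}
attention = [0.1, 0.7, 0.2]
source = ["play", "jazz", "play"]
mixed = pointer_generator_mix(p_vocab, attention, source, p_gen=0.5)
```

Because both input distributions sum to one, the mixture is itself a valid probability distribution over the union of vocabulary and source tokens.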
Speech self-supervised representations benchmarking: A case for larger probing heads
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-08-03 · DOI: 10.1016/j.csl.2024.101695
Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) leverages large datasets of unlabelled speech to reach impressive performance with reduced amounts of annotated data. The high number of proposed approaches has fostered the emergence of comprehensive benchmarks that evaluate their performance on a set of downstream tasks exploring various aspects of the speech signal. However, while the number of considered tasks has been growing, most proposals rely on a single downstream architecture that maps the frozen SSL representations to the task labels. This study examines how benchmarking results are affected by changes in the probing-head architecture. Interestingly, we found that altering the downstream architecture leads to significant fluctuations in the performance ranking of the evaluated models. Against common practice in speech SSL benchmarking, we evaluate larger-capacity probing heads, showing their impact on performance, inference costs, generalization, and multi-level feature exploitation.

Volume 89, Article 101695 · Open Access · Citations: 0
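The capacity gap the paper probes is easy to make concrete: a linear probe on frozen SSL features has far fewer parameters than even a one-hidden-layer MLP head. A minimal sketch with hypothetical dimensions (768-d features, 10 classes), not tied to any specific benchmark:

```python
import numpy as np

def linear_probe_params(d_in, n_classes):
    # Weight matrix plus bias vector.
    return d_in * n_classes + n_classes

def mlp_probe_params(d_in, d_hidden, n_classes):
    # One hidden layer: (in -> hidden) and (hidden -> out), each with bias.
    return (d_in * d_hidden + d_hidden) + (d_hidden * n_classes + n_classes)

def mlp_probe_forward(x, W1, b1, W2, b2):
    """Map a frozen SSL feature vector x to class probabilities."""
    h = np.maximum(0.0, x @ W1 + b1)      # ReLU hidden layer
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max())     # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
d_in, d_hidden, n_classes = 768, 256, 10  # hypothetical sizes
x = rng.normal(size=d_in)
W1 = rng.normal(size=(d_in, d_hidden)) * 0.01
b1 = np.zeros(d_hidden)
W2 = rng.normal(size=(d_hidden, n_classes)) * 0.01
b2 = np.zeros(n_classes)
probs = mlp_probe_forward(x, W1, b1, W2, b2)
```

At these sizes the MLP head has roughly 26 times the parameters of the linear probe, which is the kind of capacity difference whose effect on rankings the study measures.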
Improved relation extraction through key phrase identification using community detection on dependency trees
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-08-02 · DOI: 10.1016/j.csl.2024.101706
Shuang Liu, Xunqin Chen, Jiana Meng, Niko Lukač

This paper presents a method for extracting relations from sentences by using their dependency trees to identify key phrases. Dependency trees are commonly used in natural language processing to represent the grammatical structure of a sentence, and this approach builds on that representation to extract meaningful relations between phrases. Identifying key phrases is crucial in relation extraction, as they often indicate the entities and actions involved in a relation. The method applies community detection algorithms to the dependency tree to identify groups of related words that form key phrases, such as subject-verb-object structures. Experiments on the SemEval-2010 Task 8 dataset and the TACRED dataset demonstrate that the proposed method outperforms existing baseline methods.

Volume 89, Article 101706 · Open Access · Citations: 0
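The abstract does not specify which community detection algorithm the authors use; as a toy illustration of the general idea, the sketch below scores every single-edge cut of a small dependency tree by Newman modularity (one Girvan-Newman-style step), which here separates a noun phrase from the rest of the sentence:

```python
from collections import deque

def components(nodes, edges):
    """Connected components via BFS."""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, comps = set(), []
    for n in nodes:
        if n in seen:
            continue
        comp, queue = {n}, deque([n])
        seen.add(n)
        while queue:
            for m in adj[queue.popleft()]:
                if m not in seen:
                    seen.add(m)
                    comp.add(m)
                    queue.append(m)
        comps.append(comp)
    return comps

def modularity(nodes, edges, comps):
    """Newman modularity of a partition, computed on the full graph."""
    m = len(edges)
    deg = {n: 0 for n in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    q = 0.0
    for c in comps:
        internal = sum(1 for u, v in edges if u in c and v in c)
        d = sum(deg[n] for n in c)
        q += internal / m - (d / (2 * m)) ** 2
    return q

def best_tree_cut(nodes, edges):
    """Remove the single edge whose induced split maximizes modularity."""
    best = (-1.0, None, None)
    for e in edges:
        rest = [x for x in edges if x != e]
        comps = components(nodes, rest)
        q = modularity(nodes, edges, comps)
        if q > best[0]:
            best = (q, e, comps)
    return best

# Toy dependency tree for "The quick cat chased the small mouse".
nodes = ["The", "quick", "cat", "chased", "the", "small", "mouse"]
edges = [("chased", "cat"), ("chased", "mouse"), ("cat", "The"),
         ("cat", "quick"), ("mouse", "the"), ("mouse", "small")]
q, cut, comps = best_tree_cut(nodes, edges)
```

The best-scoring cut detaches the subject noun phrase "The quick cat" as one community, which is the kind of key-phrase grouping the paper exploits.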
Assessing language models’ task and language transfer capabilities for sentiment analysis in dialog data
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-07-31 · DOI: 10.1016/j.csl.2024.101704
Vlad-Andrei Negru, Vasile Suciu, Alex-Mihai Lăpuşan, Camelia Lemnaru, Mihaela Dînşoreanu, Rodica Potolea

Our work explores the differences between GRU-based and transformer-based approaches to sentiment analysis on text dialog. In addition to overall performance on the downstream task, we assess the models’ knowledge transfer capabilities through a thorough zero-shot analysis at the task level and through cross-lingual evaluation across five European languages. The ability to generalize over different tasks and languages is highly important, as the data needed for a particular application may be scarce or non-existent. We perform evaluations on both known benchmark datasets and a novel synthetic dialog dataset containing Romanian call-center conversations. We study the most appropriate combination of synthetic and real data for fine-tuning on the downstream task, enabling our models to perform in low-resource environments. We leverage the informative power of conversational context, showing that appending the previous four utterances of the same speaker to the input sequence yields the greatest benefit to inference performance. The cross-lingual and cross-task evaluations show that the transformer-based models possess superior transfer abilities to the GRU model, especially in the zero-shot setting. Owing to its prior intensive fine-tuning on multiple labeled datasets for various tasks, FLAN-T5 excels in the zero-shot task experiments, obtaining a zero-shot accuracy of 51.27% on the IEMOCAP dataset, while the classical BERT obtained the highest zero-shot accuracy on the MELD dataset with 55.08%.

Volume 89, Article 101704 · Open Access · Citations: 0
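The context-window finding (appending the previous four utterances of the same speaker) corresponds to a simple input-construction step. The function name and separator token below are assumptions for illustration, not the authors' code:

```python
def build_input(dialog, index, k=4):
    """Prepend the previous k utterances by the same speaker to utterance `index`.

    `dialog` is a list of (speaker, text) pairs; "[SEP]" is an assumed separator."""
    speaker, current = dialog[index]
    history = [text for spk, text in dialog[:index] if spk == speaker][-k:]
    return " [SEP] ".join(history + [current])

# Toy call-center exchange: only same-speaker history is attached.
dialog = [
    ("agent", "Hello, how can I help?"),
    ("caller", "My internet is down."),
    ("agent", "Since when?"),
    ("caller", "Since this morning."),
    ("caller", "And the router light is red."),
]
inp = build_input(dialog, 4)
```

The first utterance of a dialog has no history, so it is passed through unchanged.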
COfEE: A comprehensive ontology for event extraction from text
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-07-31 · DOI: 10.1016/j.csl.2024.101702
Ali Balali, Masoud Asadpour, Seyed Hossein Jafari

Large volumes of data are constantly being published on the web; however, the majority of this data is unstructured, making it difficult to comprehend and interpret. To extract meaningful and structured information from such data, researchers and practitioners have turned to Information Extraction (IE) methods. One of the most challenging IE tasks is Event Extraction (EE), which involves extracting information related to specific incidents and their associated actors from text. EE has broad applications, including building knowledge bases, information retrieval, summarization, and online monitoring systems. Over the past few decades, various event ontologies, such as ACE, CAMEO, and ICEWS, have been developed to define event forms, actors, and dimensions of events observed in text. However, these ontologies have limitations: they cover only a few topics, such as political events, have inflexible structures for defining argument roles, lack analytical dimensions, and offer insufficient gold-standard data. To address these concerns, we propose a new event ontology, COfEE, which integrates expert domain knowledge, previous ontologies, and a data-driven approach to identifying events from text. COfEE comprises two hierarchy levels (event types and event sub-types), including new categories related to environmental issues, cyberspace, criminal activity, and natural disasters that require real-time monitoring. In addition, dynamic roles are defined for each event sub-type to capture the various dimensions of events. The proposed ontology is evaluated on Wikipedia events and shown to be comprehensive and general. Furthermore, to facilitate the preparation of gold-standard data for event extraction, we present a language-independent online tool based on COfEE. A gold-standard dataset of 24,000 Persian news articles, annotated by ten human experts according to the COfEE ontology, is also prepared. To diversify the data, news articles from the Wikipedia event portal and the 100 most popular Persian news agencies between 2008 and 2021 were collected. Finally, we introduce a supervised method based on deep learning techniques to automatically extract relevant events and their corresponding actors.

Volume 89, Article 101702 · Open Access · Citations: 0
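The two-level hierarchy with dynamic roles per event sub-type can be sketched as a small schema. The category and role names below are illustrative guesses, not COfEE's actual inventory:

```python
# Two-level hierarchy: event type -> sub-type -> role slots (dynamic per sub-type).
ONTOLOGY = {
    "natural-disaster": {
        "earthquake": ["location", "magnitude", "time", "casualties"],
        "flood": ["location", "time", "affected-area", "casualties"],
    },
    "cyberspace": {
        "data-breach": ["organization", "records-exposed", "time"],
    },
}

def validate_event(event):
    """Check that an extracted event uses only roles defined for its sub-type."""
    roles = ONTOLOGY.get(event["type"], {}).get(event["subtype"])
    if roles is None:
        return False
    return all(r in roles for r in event["arguments"])

ok = validate_event({"type": "natural-disaster", "subtype": "earthquake",
                     "arguments": {"location": "Tabriz", "magnitude": "7.1"}})
bad = validate_event({"type": "cyberspace", "subtype": "data-breach",
                      "arguments": {"magnitude": "7.1"}})
```

Defining roles per sub-type rather than globally is what makes the role structure "dynamic": a magnitude slot is valid for an earthquake but meaningless for a data breach.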
Conversations in the wild: Data collection, automatic generation and evaluation
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-07-30 · DOI: 10.1016/j.csl.2024.101699
Nimra Zaheer, Agha Ali Raza, Mudassir Shabbir

The aim of conversational speech processing is to analyze human conversations in natural settings. It has numerous applications, including personality-trait identification, speech therapy, speaker identification and verification, speech emotion detection, and speaker diarization. However, the large-scale annotated datasets required for feature extraction and conversational model training exist only for a handful of languages (e.g. English, Mandarin, and French), as gathering, cleaning, and annotating such datasets is tedious, time-consuming, and expensive. We propose two scalable, language-agnostic algorithms for automatically generating multi-speaker, variable-length, spontaneous conversations. These algorithms synthesize conversations using existing non-conversational speech datasets. We also contribute the resulting datasets (283 hours, 50 speakers). For comparison, we gathered the first spontaneous conversational dataset for Urdu (24 hours, 212 speakers) from public talk shows. Using speaker diarization as an example, we evaluate our datasets and report the first baseline diarization error rates (DER) for Urdu: 25% for models based on the synthetic datasets and 29% for natural conversations. Our conversational speech generation technique allows speaker diarization pipelines to be trained without preparing huge conversational corpora.

Volume 89, Article 101699 · Open Access · Citations: 0
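The reported DER baselines follow the standard decomposition into missed speech, false alarms, and speaker confusion. A simplified frame-level sketch (real DER scoring also applies a forgiveness collar and an optimal reference-to-hypothesis speaker mapping, both omitted here):

```python
def frame_der(reference, hypothesis):
    """Simplified frame-level diarization error rate.

    Each list holds one speaker label per frame, or None for silence.
    DER = (missed speech + false alarm + confusion) / reference speech frames."""
    assert len(reference) == len(hypothesis)
    miss = false_alarm = confusion = 0
    ref_speech = 0
    for r, h in zip(reference, hypothesis):
        if r is not None:
            ref_speech += 1
        if r is not None and h is None:
            miss += 1                      # speech labelled as silence
        elif r is None and h is not None:
            false_alarm += 1               # silence labelled as speech
        elif r is not None and r != h:
            confusion += 1                 # wrong speaker label
    return (miss + false_alarm + confusion) / ref_speech

# Toy example: one confusion frame and one missed frame out of 7 speech frames.
ref = ["A", "A", "A", "B", "B", None, "B", "A"]
hyp = ["A", "A", "B", "B", None, None, "B", "A"]
der = frame_der(ref, hyp)
```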
Prompting large language models for user simulation in task-oriented dialogue systems
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-07-26 · DOI: 10.1016/j.csl.2024.101697
Atheer Algherairy, Moataz Ahmed

Large Language Models (LLMs) have gained widespread popularity due to their instruction-following abilities. In this study, we evaluate their ability to simulate user interactions for task-oriented dialogue (TOD) systems. Our findings demonstrate that prompting LLMs reveals promising capabilities for training and testing dialogue policies, eliminating the need for domain expertise to craft complex rules or for large annotated datasets, as required by traditional simulators. The results show that the dialogue system trained with the ChatGPT simulator achieves a success rate of 59%, comparable to the 62% success rate of the dialogue system trained with the manual-rules, agenda-based user simulator (ABUS). Furthermore, the dialogue system trained with the ChatGPT simulator demonstrates better generalization than the system trained with the ABUS: its success rate is higher by 4% on GenTUS, 5% on the ChatGPT simulator, and 3% on the Llama simulator. Moreover, LLM-based user simulators provide a challenging environment, with lexically rich, diverse, and random responses. The Llama simulator outperforms the human reference on all lexical diversity metrics, with margins of 0.66 in SE, 0.39 in CE, 0.01 in MSTTR, 0.04 in HDD, and 0.55 in MTLD, while the ChatGPT simulator achieves comparable results. This ultimately contributes to enhancing the system’s ability to generalize more effectively.

Volume 89, Article 101697 · Open Access · Citations: 0
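Of the lexical diversity metrics cited (SE, CE, MSTTR, HDD, MTLD), MSTTR is the simplest to illustrate: the mean type-token ratio over consecutive fixed-length segments. The segment length below is an arbitrary choice for the toy example, not the paper's setting:

```python
def msttr(tokens, segment=5):
    """Mean segmental type-token ratio: average TTR over consecutive
    full segments of fixed length (a trailing partial segment is dropped)."""
    ttrs = []
    for i in range(0, len(tokens) - segment + 1, segment):
        seg = tokens[i:i + segment]
        ttrs.append(len(set(seg)) / segment)
    return sum(ttrs) / len(ttrs)

# Repeated words ("for", "two") lower the ratio in the second segment.
tokens = "i want to book a table for two people for two".split()
score = msttr(tokens, segment=5)
```

Segmenting makes the measure robust to text length, which is why MSTTR is preferred over a raw type-token ratio when comparing utterances of different sizes.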
Demystifying large language models in second language development research
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-07-26 · DOI: 10.1016/j.csl.2024.101700
Yan Cong

Evaluating students’ textual responses is a common and critical task in language research and education practice. However, manual assessment can be tedious and may lack consistency, posing challenges for both scientific discovery and frontline teaching. Leveraging state-of-the-art large language models (LLMs), we define and operationalize LLM-Surprisal, a numeric representation of the interplay between lexical diversity and syntactic complexity, and empirically and theoretically demonstrate its relevance for automatic writing assessment and for the English writing development of Chinese L2 (second-language) learners. We developed an LLM-based natural language processing pipeline that automatically computes text Surprisal scores. By comparing Surprisal metrics with the classic indices widely used in L2 studies, we extend the use of computational metrics in Chinese learners’ L2 English writing. Our analyses suggest that LLM-Surprisal can distinguish L2 from L1 (first-language) writing, index L2 development stages, and predict scores provided by human professionals, indicating that the Surprisal dimension may manifest critical aspects of L2 development. The relative advantages and disadvantages of these approaches are discussed in depth. We conclude that LLMs are promising tools that can enhance L2 research. Our showcase paves the way for more nuanced approaches to computationally assessing and understanding L2 development. Our pipelines and findings will inspire language teachers, learners, and researchers to operationalize LLMs in an innovative and accessible manner.

Volume 89, Article 101700 · Open Access · Citations: 0
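Surprisal itself is just the negative log-probability a model assigns to each token given its context; with an LLM, the probabilities are read off the model's next-token distribution. A toy sketch with made-up probabilities (the paper's actual LLM-Surprisal pipeline is not reproduced here):

```python
import math

def surprisal_bits(p):
    """Surprisal of a token in bits: -log2 P(token | context)."""
    return -math.log2(p)

def mean_surprisal(probs):
    """Average surprisal over a text, given one probability per token."""
    return sum(surprisal_bits(p) for p in probs) / len(probs)

# Made-up per-token probabilities standing in for an LLM's output:
# predictable continuations yield low surprisal, unexpected ones high.
probs = [0.5, 0.25, 0.125]   # 1, 2, and 3 bits respectively
score = mean_surprisal(probs)
```

A text full of conventional, high-probability word choices thus scores low, while unusual lexical or syntactic choices push the mean up, which is the intuition behind using surprisal as a development index.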
The effect of preference elicitation methods on the user experience in conversational recommender systems
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-07-25 · DOI: 10.1016/j.csl.2024.101696
Liv Ziegfeld, Daan Di Scala, Anita H.M. Cremers

The prevalence of conversational interfaces is rising rapidly, as improved algorithms allow for remarkable proficiency in understanding and generating natural language. This also holds for Conversational Recommender Systems (CRS), which benefit from information provided by the user in the course of the dialogue to offer personalized recommendations. However, the challenge remains to elicit the user’s characteristics and preferences in a way that leads to the best user experience. Hence, the current research investigated the effect of different Preference Elicitation (PE) methods on the user experience of a CRS. We introduce two axes along which PE methods can be classified, namely the degree of system-prompt guidance and the level of user-input restriction. We built three versions of a CRS to conduct a between-subjects experiment comparing three conditions: high guidance with high restriction, high guidance with low restriction, and low guidance with low restriction. We tested their effect on ten user experience constructs with 66 European participants, all working in agriculture or forestry.

The study found no significant effects of the three preference elicitation methods on any of the user experience constructs collected through questionnaires. However, we did find significant differences in the objective measures of chat duration (Speed), response time (Cognitive Demand), and recommendation performance (Accuracy of Recommended Items). Regarding recommendation performance, the preference elicitation methods with high guidance led to a higher match score than the condition with low guidance. The certainty score was highest in the condition with high guidance and high input restriction. Finally, a question at the end of the conversation showed that users who were satisfied with the recommendation responded more positively on six out of ten user experience constructs. This suggests that satisfaction with the recommendation performance is a crucial factor in the user experience of CRSs.

Volume 89, Article 101696 · Open Access · Citations: 0
Theory of mind performance of large language models: A comparative analysis of Turkish and English
IF 3.1 · CAS Tier 3 · Computer Science
Computer Speech and Language · Pub Date: 2024-07-25 · DOI: 10.1016/j.csl.2024.101698
Burcu Ünlütabak, Onur Bal

Theory of mind (ToM), the understanding of others’ mental states, is a defining human skill. Research assessing LLMs’ ToM performance yields conflicting findings, leading to debate about whether and how they could show ToM understanding. Psychological research indicates that the characteristics of a specific language can influence how mental states are represented and communicated. It is therefore reasonable to expect language characteristics to influence how LLMs communicate with humans, especially when the conversation involves references to mental states. This study examines how these characteristics affect LLMs’ ToM performance by evaluating GPT-3.5 and GPT-4 in English and Turkish. Turkish provides an excellent contrast to English, since Turkish has a different syntactic structure and special verbs, san- and zannet-, meaning “falsely believe.” Using OpenAI’s Chat Completion API, we collected responses from GPT models for first- and second-order ToM scenarios in English and Turkish. Our approach combined completion prompts and open-ended questions within the same chat session, offering deep insight into the models’ reasoning processes. Our data showed that while GPT models can respond accurately to standard ToM tasks (100% accuracy), their performance deteriorates to below chance level under slight modifications. This high sensitivity suggests a lack of robustness in ToM performance. GPT-4 outperformed its predecessor, GPT-3.5, showing some improvement in ToM performance. The models generally performed better when tasks were presented in English than in Turkish. These findings indicate that GPT models cannot yet reliably pass first-order and second-order ToM tasks in either language. They have significant implications for the explainability of LLMs, highlighting the challenges and biases these models face when simulating human-like ToM understanding in different languages.

Volume 89, Article 101698 · Open Access · Citations: 0