Language Resources and Evaluation: latest articles, page 9

RUN-AS: a novel approach to annotate news reliability for disinformation detection

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-08-06 DOI: 10.1007/s10579-023-09678-9

Alba Bonet-Jover, Robiert Sepúlveda-Torres, E. Saquete, P. Martínez-Barco, Mario Nieto-Pérez

Citations: 0

The limitations of irony detection in Dutch social media

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-07-23 DOI: 10.1007/s10579-023-09656-1

Aaron Maladry, Els Lefever, Cynthia Van Hee, Veronique Hoste

Citations: 2

Fine-tuning language models to recognize semantic relations

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-07-23 DOI: 10.1007/s10579-023-09677-w

D. Roussinov, S. Sharoff, Nadezhda Puchnina

Citations: 0

Assessment of pragmatic abilities and cognitive substrates (APACS) brief remote: a novel tool for the rapid and tele-evaluation of pragmatic skills in Italian

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-07-23 DOI: 10.1007/s10579-023-09667-y

L. Bischetti, C. Pompei, Biagio Scalingi, F. Frau, M. Bosia, G. Arcara, V. Bambini

Citations: 0

MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-07-23 DOI: 10.1007/s10579-023-09670-3

Ismael Garrido-Muñoz, F. Martínez-Santiago, Arturo Montejo-Ráez

Citations: 1

FullStop: punctuation and segmentation prediction for Dutch with transformers

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-07-14 DOI: 10.1007/s10579-023-09676-x

Vincent Vandeghinste, Oliver Guhr

Abstract: When applying automated speech recognition (ASR) for Belgian Dutch, the output consists of an unsegmented stream of words, without any punctuation. A next step is to perform segmentation and insert punctuation, making the ASR output more readable and easy to manually correct. We present the first (as far as we know) publicly available punctuation insertion system for Dutch that functions at a usable level and that is publicly available. The model we present here is an extension of the approach of Guhr et al. (In: Swiss Text Analytics Conference. Shared task on Sentence End and Punctuation Prediction in NLG Text, 2021) for Dutch: we finetuned the Dutch language model RobBERT on a punctuation prediction sequence classification task. The model was finetuned on two datasets: the Dutch side of Europarl and the SoNaR corpus. For every word in the input sequence, the model predicts a punctuation marker that follows the word. In cases where the language is unknown or where code switching applies, we have extended an existing multilingual model with Dutch. Previous work showed that such a multilingual model, based on “xlm-roberta-base” performs on par or sometimes even better than the monolingual cases. The system was evaluated on in-domain data as a classifier and on out-of-domain data as a sentence segmentation system through full stop prediction. The evaluations on sentence segmentation on out of domain data show that models finetuned on SoNaR show the best results, which can be attributed to SoNaR being a reference corpus containing different language registers. The multilingual models show an even better precision (at the cost of a lower recall) compared to the monolingual models.

Citations: 0
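
The FullStop abstract describes a per-word scheme: for each word in an unpunctuated stream, a model predicts the punctuation marker that follows it, and full stop predictions double as sentence boundaries. The sketch below illustrates only the post-processing side of that idea. It is not the authors' implementation: the function names, the label set (with "0" meaning "no punctuation"), and the hard-coded labels standing in for model output are all assumptions made for illustration.

```python
from typing import List

# Assumed label convention: "0" means no punctuation follows the word.
NO_PUNCT = "0"
# Assumed set of labels that close a sentence.
SENTENCE_END = {".", "?"}


def restore_punctuation(words: List[str], labels: List[str]) -> str:
    """Attach each predicted marker to the word it follows."""
    if len(words) != len(labels):
        raise ValueError("need exactly one label per word")
    pieces = []
    for word, label in zip(words, labels):
        pieces.append(word if label == NO_PUNCT else word + label)
    return " ".join(pieces)


def split_sentences(words: List[str], labels: List[str]) -> List[List[str]]:
    """Segment the word stream at every sentence-ending label."""
    sentences: List[List[str]] = []
    current: List[str] = []
    for word, label in zip(words, labels):
        current.append(word)
        if label in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:  # keep trailing words that never received an end marker
        sentences.append(current)
    return sentences


# Hypothetical ASR output and hypothetical per-word predictions.
words = ["dit", "is", "een", "test", "werkt", "het"]
labels = ["0", "0", "0", ".", "0", "?"]

print(restore_punctuation(words, labels))
print(split_sentences(words, labels))
```

In a real system the `labels` list would come from a finetuned token-level classifier rather than being written by hand; the two helper functions would stay the same.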

adaptNMT: an open-source, language-agnostic development environment for neural machine translation

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-07-14 DOI: 10.1007/s10579-023-09671-2

Séamus Lankford, Haithem Afli, Andy Way

Citations: 2

The Visual Language Research Corpus (VLRC): an annotated corpus of comics from Asia, Europe, and the United States

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-07-14 DOI: 10.1007/s10579-023-09673-0

Neil Cohn, Bruno Cardoso, Bien Klomberg, Irmak Hacımusaoğlu

Citations: 2

Evaluation of a rule-based approach to automatic factual question generation using syntactic and semantic analysis

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-07-10 DOI: 10.1007/s10579-023-09672-1

A. Gašpar, Ani Grubišić, Ines Šarić-Grgić

Citations: 1

Sentiment analysis in Portuguese tweets: an evaluation of diverse word representation models

IF 2.7, Zone 3 (Computer Science)

Language Resources and Evaluation Pub Date: 2023-06-28 DOI: 10.1007/s10579-023-09661-4

Daniela Vianna, Fernando Carneiro, Jonnathan Carvalho, Alexandre Plastino, A. Paes

Citations: 0