Proces. del Leng. Natural: Latest Articles

Overview of the EmoEvalEs task on emotion detection for Spanish at IberLEF 2021
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-13
F. Plaza-Del-Arco, S. M. J. Zafra, Arturo Montejo Ráez, M. González, L. A. U. López, M. T. M. Valdivia
Abstract: This work has been partially supported by a grant from Fondo Social Europeo, Administration of the Junta de Andalucia (DOC 01073 and P20 00956-PAIDI 2020), Fondo Europeo de Desarrollo Regional (FEDER), LIVING-LANG project (RTI2018-094653-B-C21) and the Ministry of Science, Innovation and Universities (scholarship [FPI-PRE2019-089310]) from the Spanish Government.
Citations: 21
Overview of DETOXIS at IberLEF 2021: DEtection of TOXicity in comments In Spanish
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-18
M. Taulé, Alejandro Ariza-Casabona, Montserrat Nofre, Enrique Amigó, Paolo Rosso
Abstract: In this paper we present the DETOXIS task, DEtection of TOxicity in comments In Spanish, which took place as part of the IberLEF 2021 Workshop on Iberian Languages Evaluation Forum at the SEPLN 2021 Conference. We describe the NewsCom-TOX dataset used for training and testing the systems, the metrics applied for their evaluation and the results obtained by the submitted approaches. We also provide an error analysis of the results of these systems.
Citations: 14
Inducción automática de una taxonomía multilingüe de marcadores discursivos: primeros resultados en castellano, inglés, francés, alemán y catalán (Automatic induction of a multilingual taxonomy of discourse markers: first results in Spanish, English, French, German and Catalan)
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-11
Rogelio Nazar
Abstract: This research was funded by the Government of Chile through Fondecyt Regular Project 1191481: Inducción automática de taxonomías de marcadores discursivos a partir de corpus multilingües (2019-2021).
Citations: 1
VaxxStance@IberLEF 2021: Overview of the Task on Going Beyond Text in Cross-Lingual Stance Detection
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-15
Rodrigo Agerri Gascón, Roberto Centeno, María S. Espinosa, Joseba Fernandez de Landa, Álvaro Rodrigo Yuste
Abstract: This work has been partially supported by the European Social Fund through the Youth Employment Initiative (YEI 2019) and the Spanish Ministry of Science, Innovation and Universities (DeepReading RTI2018-096846-B-C21, MCIU/AEI/FEDER, UE), and by the DeepText project (KK-2020/00088), funded by the Basque Government. Rodrigo Agerri is also funded by the RYC-2017-23647 fellowship.
Citations: 9
Un enfoque semántico en la selección de características basadas en léxico para la detección de emociones (A semantic approach to lexicon-based feature selection for emotion detection)
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-10
Harold González-Guerra, Alfredo Simón-Cuevas, J. Ortega, J. A. Olivas
Abstract: This work was partially funded by the European Regional Development Fund (FEDER), the Junta de Extremadura (GR18135), and the Spanish Ministry of Science, Innovation and Universities, through the SAFER project (PID2019-104735RB-C42).
Citations: 0
Overview of the IDPT Task on Irony Detection in Portuguese at IberLEF 2021
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-23
U. Corrêa, Leonardo Coelho, Leonardo Pereira dos Santos, L. Freitas
Abstract: This paper presents the Task on Irony Detection in Portuguese (IDPT), held within Iberian Languages Evaluation Forum (IberLEF 2021). We asked the participants to develop systems capable of identifying irony in texts. We created two corpora containing tweets and news articles. Twelve teams registered to the task, among which six submitted both predictions and technical reports. The best performing system achieved a Balanced Accuracy (Bacc) value of 0.52 for tweets (Team PiLN) and 0.92 for news (Team BERT4EVER).
Citations: 17
AutoPunct: A BERT-based Automatic Punctuation and Capitalisation System for Spanish and Basque
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-5
Ander González-Docasal, Aitor García-Pablos, Haritz Arzelus, Aitor Álvarez
Abstract: The raw output of an Automatic Speech Recognition system usually consists in a stream of words without any casing nor punctuation. In order to improve the readability and enable further uses of this output, punctuation and capitalisation have to be included. In this context, we present AutoPunct, a Transformers-based automatic punctuation and capitalisation model that combines both acoustic (i.e. silences duration) and lexical information (the words themselves). We compared its performance with a system based on Bidirectional Recurrent Neural Networks (BRNN) on Basque (a low-resource language) and Spanish, both individually and simultaneously. The result is a system that achieves high accuracy for punctuation and capitalisation in both languages at the same time, with a throughput of several thousand words per second using a standard GPU.
Citations: 4
Procesamiento de Expresiones Multipalabra en gallego mediante Aprendizaje Profundo (Processing of Multiword Expressions in Galician using Deep Learning)
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-4
Víctor Manuel Darriba Bilbao, Yerai Doval, Elmurod Kuriyozov
Abstract: This work was partially funded by the Xunta de Galicia, through the multi-year collaboration agreement between the Centro Ramón Piñeiro para la Investigación en Humanidades and the Universidad de Vigo and the grant for the Consolidation and Structuring of Competitive Research Units ED431C 2018/50, and by the Ministerio de Economía, Industria y Competitividad through project TIN2017-85160-C2-2-R.
Citations: 1
MarIA: Spanish Language Models
Proces. del Leng. Natural Pub Date : 2021-07-15 DOI: 10.26342/2022-68-3
Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marc Pàmies, Joan Llop-Palao, Joaquín Silveira-Ocampo, C. Carrino, Carme Armentano-Oller, C. R. Penagos, Aitor Gonzalez-Agirre, Marta Villegas
Abstract: This work presents MarIA, a family of Spanish language models and associated resources made available to the industry and the research community. Currently, MarIA includes RoBERTa-base, RoBERTa-large, GPT2 and GPT2-large Spanish language models, which can arguably be presented as the largest and most proficient language models in Spanish. The models were pretrained using a massive corpus of 570GB of clean and deduplicated texts with 135 billion words extracted from the Spanish Web Archive crawled by the National Library of Spain between 2009 and 2019. We assessed the performance of the models with nine existing evaluation datasets and with a novel extractive Question Answering dataset created ex novo. Overall, MarIA models outperform the existing Spanish models across a variety of NLU tasks and training settings.
Citations: 60
Bertinho: Galician BERT Representations
Proces. del Leng. Natural Pub Date : 2021-03-25 DOI: 10.26342/2021-66-1
David Vilares, Marcos Garcia, Carlos Gómez-Rodríguez
Abstract: This paper presents a monolingual BERT model for Galician. We follow the recent trend that shows that it is feasible to build robust monolingual BERT models even for relatively low-resource languages, while performing better than the well-known official multilingual BERT (mBERT). More particularly, we release two monolingual Galician BERT models, built using 6 and 12 transformer layers, respectively; trained with limited resources (~45 million tokens on a single GPU of 24GB). We then provide an exhaustive evaluation on a number of tasks such as POS-tagging, dependency parsing and named entity recognition. For this purpose, all these tasks are cast in a pure sequence labeling setup in order to run BERT without the need to include any additional layers on top of it (we only use an output classification layer to map the contextualized representations into the predicted label). The experiments show that our models, especially the 12-layer one, outperform the results of mBERT in most tasks.
Citations: 15