Proces. del Leng. Natural: Latest Articles

Overview of FakeDeS at IberLEF 2021: Fake News Detection in Spanish Shared Task
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-19
Helena Gómez-Adorno, J. Posadas-Durán, Gemma Bel Enguix, Claudia Porto Capetillo
Abstract: This research was funded by CONACyT project CB A1-S-27780 and DGAPA-UNAM PAPIIT grants TA400121 and TA100520. The authors also thank CONACYT for the computer resources provided through the INAOE Supercomputing Laboratory's Deep Learning Platform for Language Technologies.
Citations: 4
Overview of ADoBo 2021: Automatic Detection of Unassimilated Borrowings in the Spanish Press
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-24
Elena Álvarez Mellado, Luis Espinosa Anke, Julio Gonzalo Arroyo, Constantine Lignos, Jordi Porta-Zamorano
Abstract: This paper summarizes the main findings of the ADoBo 2021 shared task, proposed in the context of IberLEF 2021. In this task, we invited participants to detect lexical borrowings (coming mostly from English) in Spanish newswire texts. This task was framed as a sequence classification problem using BIO encoding. We provided participants with an annotated corpus of lexical borrowings, which we split into training, development and test sets. We received submissions from 4 teams with 9 different system runs overall. The results, which range from F1 scores of 37 to 85, suggest that this is a challenging task, especially when out-of-domain or OOV words are considered, and that traditional methods informed with lexicographic information would benefit from taking advantage of current NLP trends.
Citations: 2
Overview of Rest-Mex at IberLEF 2021: Recommendation System for Text Mexican Tourism
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-14
Miguel A. Alvarez-Carmona, Ramón Aranda, Samuel Arce-Cardenas, Daniel Fajardo-Delgado, Rafael Guerrero-Rodriguez, Adrian Pastor Lopez-Monroy, J. Martínez-Miranda, Humberto Pérez Espinosa, Ansel Y. Rodríguez González
Abstract: This paper presents the framework and results from the Rest-Mex track at IberLEF 2021. This track considered two tasks: Recommendation System and Sentiment Analysis, using texts from Mexican touristic places. The Recommendation System task consists in predicting the degree of satisfaction that a tourist may have when recommending a destination of Nayarit, Mexico, based on places visited by the tourists and their opinions. On the other hand, the Sentiment Analysis task predicts the polarity of an opinion issued by a tourist who traveled to the most representative places in Guanajuato, Mexico. For both tasks, we have built new corpora considering Spanish opinions from the TripAdvisor website. This paper compares and discusses the results of the participants for both tasks.
Citations: 36
Overview of MeOffendEs at IberLEF 2021: Offensive Language Detection in Spanish Variants
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-16
F. Plaza-Del-Arco, Marco Casavantes, H. Escalante, M. T. M. Valdivia, Arturo Montejo Ráez, M. M. Y. Gómez, H. Jarquín-Vásquez, Luis Villaseñor-Pineda
Abstract: We would like to thank CONACyT for partially supporting this work under grants CB-2015-01-257383 and the Thematic Networks program (Language Technologies Thematic Network). Hugo Jair Escalante is supported by CONACyT under project grant CONACYT CB-S-26314. This work is also partially supported by the grant P20 00956 (PAIDI 2020) from Andalusian Regional Government, a grant from European Regional Development Fund (FEDER), the LIVING-LANG project [RTI2018-094653-B-C21], and the Ministry of Science, Innovation and Universities (scholarship [FPI-PRE2019-089310]) from the Spanish Government.
Citations: 10
Unimodal Feature-level improvement on Multimodal CMU-MOSEI Dataset: Uncorrelated and Convolved Feature Sets
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-6
Daniel Mora Melanchthon
Abstract: This work was supported by the Government of Chile through "Proyecto Fondecyt Regular 1191481: Inducción automática de taxonomías de marcadores discursivos a partir de corpus multilingües (2019-2021)", lead investigator Rogelio Nazar.
Citations: 0
Overview of HAHA at IberLEF 2021: Detecting, Rating and Analyzing Humor in Spanish
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-22
Luis Chiruzzo, Santiago Castro, Santiago Góngora, Aiala Rosá, J. A. Meaney, Rada Mihalcea
Abstract: We present the results of HAHA at IberLEF 2021: Humor Analysis based on Human Annotation. This year's edition of the competition includes the two classic tasks of humor detection and rating, plus two novel tasks of humor logic mechanism and target classification. We describe the corpus created for the challenge, the competition phases, the submitted systems and the main results obtained.
Citations: 23
Masking and BERT-based Models for Stereotype Identification
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-7
Javier Sánchez-Junquera, Paolo Rosso, M. Montes-y-Gómez, Berta Chulvi
Abstract: The work of the authors from the Universitat Politecnica of Valencia was funded by the Spanish Ministry of Science and Innovation under the research project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31). Experiments were carried out on the GPU cluster at PRHLT thanks to the PROMETEO/2019/121 (DeepPattern) research project funded by the Generalitat Valenciana.
Citations: 12
Reconocimiento y clasificación de entidades nombradas en textos legales en español (Named Entity Recognition and Classification in Spanish Legal Texts)
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-9
Doa Samy
Abstract: Named entity recognition and classification (NER/NERC) is a core task in Natural Language Processing (NLP) and Information Extraction. NERC plays an essential role in the legal domain for the development of intelligent legal systems. This work aims to take a first step toward establishing a baseline for the NERC task in legal Spanish. The main objective is to provide a linguistic resource by annotating five basic types of named entities in legislative texts in Peninsular Spanish. The five types of named entities are: Persons, Organizations, Locations, Absolute dates, and References to laws, decrees, orders, regulations and articles. A hybrid methodology is adopted that combines three main techniques: regular expression patterns, lists from external sources, and the training of three NERC models using the open-source library spaCy v3. Of the three trained models, the best obtained an F-score of 0.93, reaching 0.98 and 0.97 for some types such as mentions of laws and dates, respectively. The worst of the models reached an average F-score of 0.85, which is still a satisfactory result compared with the state of the art.
Citations: 2
Sarcasm Detection with BERT
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-1
Elsa Scola, Isabel Segura-Bedmar
Abstract: Sarcasm is often used to humorously criticize something or hurt someone's feelings. Humans often have difficulty in recognizing sarcastic comments since we say the opposite of what we really mean. Thus, automatic sarcasm detection in textual data is one of the most challenging tasks in Natural Language Processing (NLP). It has also become a relevant research area due to its importance in the improvement of sentiment analysis. In this work, we explore several deep learning models such as Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT) to address the task of sarcasm detection. While most research has been conducted using social media data, we evaluate our models using a news headlines dataset. To the best of our knowledge, this is the first study that applies BERT to detect sarcasm in texts that do not come from social media. Experiment results show that the BERT-based approach outperforms the state of the art on this type of dataset.
Citations: 4
Constructing Corpus and Word Embedding for Spanish Covid-19 Data
Proces. del Leng. Natural Pub Date : 2021-09-06 DOI: 10.26342/2021-67-3
Kyungjin Hwang
Abstract: Severe acute respiratory syndrome coronavirus 2 (COVID-19), colloquially referred to as coronavirus, escalated into a global pandemic with severe transmission and mortality rates in 2019. Despite the escalation of the virus' worldwide impact in 2020, numerous studies on Natural Language Processing in Spanish have neglected corpus construction or word embedding; the absence of corpora involving coronavirus or infectious diseases is especially conspicuous. Additionally, corpus construction or word embedding conducted in the medical field do not display efficacy in production pertaining to coronavirus or infectious diseases. To supplement this potentially detrimental insufficiency, this study collects Spanish language data to build a relevant coronavirus corpus through appropriate preprocessing and then obtains a word embedding. Performance of the corpus and word embedding are then tested through word similarity evaluations, a cosine similarity evaluation, and a visualization evaluation with the existing Spanish corpus. After comparison, corpus and word embedding suitable for coronavirus will be suggested.
Citations: 0