CAPISCO @ CONcreTEXT 2020: (Un)supervised Systems to Contextualize Concreteness with Norming Data

EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020. DOI: 10.4000/BOOKS.AACCADEMIA.7475

Alessandro Bondielli, Gianluca E. Lebani, Lucia C. Passaro, Alessandro Lenci


Citations: 1

Abstract

This paper describes several approaches to the automatic rating of the concreteness of concepts in context, developed for the EVALITA 2020 “CONcreTEXT” task. Our systems focus on the interplay between words and their surrounding context by (i) exploiting annotated resources, (ii) using BERT masking to find potential substitutes of the target in specific contexts and measuring their average similarity with concrete and abstract centroids, and (iii) automatically generating labelled datasets to fine-tune transformer models for regression. All the approaches have been tested on both English and Italian data. The best system for each language ranked second in the task.
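Approach (ii) can be illustrated with a minimal sketch. The abstract does not say how the two similarities are combined, so the difference score used here is an assumption, as are the function names and the toy 3-dimensional vectors. In the system described, the substitutes would come from BERT masking and the vectors from a real embedding model; neither step is reproduced here.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-d vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def centroid(vectors):
    """Mean vector of a set of word embeddings."""
    return np.mean(np.asarray(vectors, dtype=float), axis=0)

def concreteness_score(substitute_vecs, concrete_centroid, abstract_centroid):
    """Average similarity of the substitutes to the concrete centroid
    minus their average similarity to the abstract centroid.
    NOTE: taking the difference is an assumption of this sketch."""
    conc = np.mean([cosine(v, concrete_centroid) for v in substitute_vecs])
    abst = np.mean([cosine(v, abstract_centroid) for v in substitute_vecs])
    return float(conc - abst)

# Hypothetical toy embeddings (not taken from any real model or norm list).
EMB = {
    "table": [0.9, 0.1, 0.0], "stone": [0.8, 0.2, 0.1],    # concrete seed words
    "idea": [0.1, 0.9, 0.2], "justice": [0.0, 0.8, 0.3],   # abstract seed words
    "desk": [0.85, 0.15, 0.05], "bench": [0.7, 0.3, 0.0],  # substitutes, context A
    "notion": [0.1, 0.8, 0.3], "belief": [0.2, 0.9, 0.1],  # substitutes, context B
}

concrete_c = centroid([EMB["table"], EMB["stone"]])
abstract_c = centroid([EMB["idea"], EMB["justice"]])

# Substitutes that a masked language model might propose for the target
# in two different contexts (hard-coded here for illustration).
score_a = concreteness_score([EMB["desk"], EMB["bench"]], concrete_c, abstract_c)
score_b = concreteness_score([EMB["notion"], EMB["belief"]], concrete_c, abstract_c)
print(score_a, score_b)
```

With these toy vectors, the context whose substitutes lie near the concrete seed words receives a positive score and the other context a negative one. A real system would still need to map such scores onto the rating scale used by the task.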


Source journal

EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020

Self-citation rate

0.00%

Number of articles published