FinSim-2 2021共享任务:学习金融领域的语义相似性

Companion Proceedings of the Web Conference 2021 Pub Date : 2021-04-19 DOI:10.1145/3442442.3451381

Youness Mansar, Juyeon Kang, Ismaïl El Maarouf

{"title":"FinSim-2 2021共享任务:学习金融领域的语义相似性","authors":"Youness Mansar, Juyeon Kang, Ismaïl El Maarouf","doi":"10.1145/3442442.3451381","DOIUrl":null,"url":null,"abstract":"The FinSim-2 is a second edition of FinSim Shared Task on Learning Semantic Similarities for the Financial Domain, colocated with the FinWeb workshop. FinSim-2 proposed the challenge to automatically learn effective and precise semantic models for the financial domain. The second edition of the FinSim offered an enriched dataset in terms of volume and quality, and interested in systems which make creative use of relevant resources such as ontologies and lexica, as well as systems which make use of contextual word embeddings such as BERT[4]. Going beyond the mere representation of words is a key step to industrial applications that make use of Natural Language Processing (NLP). This is typically addressed using either unsupervised corpus-derived representations like word embeddings, which are typically opaque to human understanding but very useful in NLP applications or manually created resources such as taxonomies and ontologies, which typically have low coverage and contain inconsistencies, but provide a deeper understanding of the target domain. Finsim is inspired from previous endeavours in the Semeval community, which organized several competitions on semantic/lexical relation extraction between concepts/words. This year, 18 system runs were submitted by 7 teams and systems were ranked according to 2 metrics, Accuracy and Mean rank. All the systems beat our baseline 1 model by over 15 points and the best systems beat the baseline 2 by over 1 ∼ 3 points in accuracy.","PeriodicalId":129420,"journal":{"name":"Companion Proceedings of the Web Conference 2021","volume":"72 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":"{\"title\":\"The FinSim-2 2021 Shared Task: Learning Semantic Similarities for the Financial Domain\",\"authors\":\"Youness Mansar, Juyeon Kang, Ismaïl El Maarouf\",\"doi\":\"10.1145/3442442.3451381\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The FinSim-2 is a second edition of FinSim Shared Task on Learning Semantic Similarities for the Financial Domain, colocated with the FinWeb workshop. FinSim-2 proposed the challenge to automatically learn effective and precise semantic models for the financial domain. The second edition of the FinSim offered an enriched dataset in terms of volume and quality, and interested in systems which make creative use of relevant resources such as ontologies and lexica, as well as systems which make use of contextual word embeddings such as BERT[4]. Going beyond the mere representation of words is a key step to industrial applications that make use of Natural Language Processing (NLP). This is typically addressed using either unsupervised corpus-derived representations like word embeddings, which are typically opaque to human understanding but very useful in NLP applications or manually created resources such as taxonomies and ontologies, which typically have low coverage and contain inconsistencies, but provide a deeper understanding of the target domain. Finsim is inspired from previous endeavours in the Semeval community, which organized several competitions on semantic/lexical relation extraction between concepts/words. This year, 18 system runs were submitted by 7 teams and systems were ranked according to 2 metrics, Accuracy and Mean rank. All the systems beat our baseline 1 model by over 15 points and the best systems beat the baseline 2 by over 1 ∼ 3 points in accuracy.\",\"PeriodicalId\":129420,\"journal\":{\"name\":\"Companion Proceedings of the Web Conference 2021\",\"volume\":\"72 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"19\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Companion Proceedings of the Web Conference 2021\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3442442.3451381\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Companion Proceedings of the Web Conference 2021","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3442442.3451381","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 19

摘要

FinSim-2是FinSim关于学习金融领域语义相似性的共享任务的第二版，与FinWeb研讨会同步进行。FinSim-2提出了自动学习金融领域有效而精确的语义模型的挑战。FinSim的第二版在数量和质量方面提供了丰富的数据集，并对创造性地使用相关资源(如本体和词典)的系统以及使用上下文词嵌入(如BERT)的系统感兴趣[4]。超越单纯的单词表示是利用自然语言处理(NLP)的工业应用的关键一步。这通常是使用无监督的语料库派生表示来解决的，比如词嵌入，它通常对人类的理解是不透明的，但在NLP应用程序中非常有用，或者手动创建的资源，比如分类法和本体，它们通常覆盖率低，包含不一致性，但提供了对目标领域的更深入的理解。Finsim的灵感来自Semeval社区之前的努力，Semeval社区组织了几次关于概念/单词之间语义/词汇关系提取的比赛。今年，有7个团队提交了18个系统运行，系统根据准确性和平均排名这两个指标进行排名。所有系统都比我们的基线1模型高出15分以上，最好的系统在精度上比基线2高出1 ~ 3分以上。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

The FinSim-2 2021 Shared Task: Learning Semantic Similarities for the Financial Domain

The FinSim-2 is a second edition of FinSim Shared Task on Learning Semantic Similarities for the Financial Domain, colocated with the FinWeb workshop. FinSim-2 proposed the challenge to automatically learn effective and precise semantic models for the financial domain. The second edition of the FinSim offered an enriched dataset in terms of volume and quality, and interested in systems which make creative use of relevant resources such as ontologies and lexica, as well as systems which make use of contextual word embeddings such as BERT[4]. Going beyond the mere representation of words is a key step to industrial applications that make use of Natural Language Processing (NLP). This is typically addressed using either unsupervised corpus-derived representations like word embeddings, which are typically opaque to human understanding but very useful in NLP applications or manually created resources such as taxonomies and ontologies, which typically have low coverage and contain inconsistencies, but provide a deeper understanding of the target domain. Finsim is inspired from previous endeavours in the Semeval community, which organized several competitions on semantic/lexical relation extraction between concepts/words. This year, 18 system runs were submitted by 7 teams and systems were ranked according to 2 metrics, Accuracy and Mean rank. All the systems beat our baseline 1 model by over 15 points and the best systems beat the baseline 2 by over 1 ∼ 3 points in accuracy.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Companion Proceedings of the Web Conference 2021

自引率

0.00%

发文量