The FinSim-2 2021 Shared Task: Learning Semantic Similarities for the Financial Domain
Youness Mansar, Juyeon Kang, Ismaïl El Maarouf
Companion Proceedings of the Web Conference 2021, published 2021-04-19
DOI: 10.1145/3442442.3451381 (https://doi.org/10.1145/3442442.3451381)
FinSim-2 is the second edition of the FinSim Shared Task on Learning Semantic Similarities for the Financial Domain, co-located with the FinWeb workshop. FinSim-2 challenged participants to automatically learn effective and precise semantic models for the financial domain. This second edition offered a dataset enriched in both volume and quality, and was interested in systems that make creative use of relevant resources such as ontologies and lexica, as well as systems that use contextual word embeddings such as BERT[4]. Going beyond the mere representation of words is a key step towards industrial applications of Natural Language Processing (NLP). This is typically addressed in one of two ways: with unsupervised corpus-derived representations such as word embeddings, which are opaque to human understanding but very useful in NLP applications, or with manually created resources such as taxonomies and ontologies, which tend to have low coverage and contain inconsistencies but provide a deeper understanding of the target domain. FinSim is inspired by previous endeavours in the SemEval community, which organized several competitions on semantic and lexical relation extraction between concepts or words. This year, 7 teams submitted 18 system runs, and systems were ranked according to two metrics, Accuracy and Mean rank. All systems beat our baseline 1 model by over 15 points, and the best systems beat baseline 2 by 1 to 3 points in accuracy.
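The two ranking metrics named above can be illustrated with a minimal sketch. The abstract does not define them, so the definitions here are assumptions: each system returns a ranked list of candidate labels per term, Accuracy is the share of terms whose top-ranked label matches the gold label, and Mean rank is the average 1-based position of the gold label in that list. The function names, the handling of a missing gold label, and the example labels are all hypothetical.

```python
from typing import Sequence


def accuracy(gold: Sequence[str], ranked: Sequence[Sequence[str]]) -> float:
    """Share of terms whose top-ranked predicted label equals the gold label."""
    hits = sum(1 for g, r in zip(gold, ranked) if r and r[0] == g)
    return hits / len(gold)


def mean_rank(gold: Sequence[str], ranked: Sequence[Sequence[str]]) -> float:
    """Average 1-based position of the gold label in each ranked prediction list.

    Assumption: a gold label missing from the list is placed one position
    past the end of that list.
    """
    total = 0
    for g, r in zip(gold, ranked):
        total += r.index(g) + 1 if g in r else len(r) + 1
    return total / len(gold)


if __name__ == "__main__":
    # Illustrative labels only, not taken from the shared-task data.
    gold = ["Bonds", "Funds", "Swap"]
    ranked = [
        ["Bonds", "Funds", "Swap"],  # gold label at rank 1
        ["Swap", "Funds", "Bonds"],  # gold label at rank 2
        ["Bonds", "Funds", "Swap"],  # gold label at rank 3
    ]
    print(f"accuracy  = {accuracy(gold, ranked):.3f}")
    print(f"mean rank = {mean_rank(gold, ranked):.3f}")
```

Under these assumptions, higher Accuracy and lower Mean rank both indicate a better system, which is why the two metrics can order systems differently.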