Improving Multilingual Frame Identification by Estimating Frame Transferability
Jennifer Sikos, Michael Roth, Sebastian Padó
Linguistic Issues in Language Technology, published 2022-07-18
DOI: 10.33011/lilt.v19i.939 (https://doi.org/10.33011/lilt.v19i.939)
Citations: 0
Abstract
A recent research direction in computational linguistics involves efforts to make the field, which used to focus primarily on English, more multilingual and inclusive. However, resource creation often remains a bottleneck for many languages, in particular at the semantic level. In this article, we consider the case of frame-semantic annotation. We investigate how to perform frame selection for annotation in a target language by taking advantage of existing annotations in different, supplementary languages, with the goal of reducing the required annotation effort in the target language. We measure success by training and testing frame identification models for the target language. We base our selection methods on measuring frame transferability in the supplementary language, where we estimate which frames will transfer poorly, and therefore should receive more annotation, in the target language. We apply our approach to English, German, and French – three languages which have annotations that are similar in size as well as frames with overlapping lexicographic definitions. We find that transferability is indeed a useful indicator and supports a setup where a limited amount of target language data is sufficient to train frame identification systems.
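The selection idea in the abstract, giving more target-language annotation to frames that are expected to transfer poorly, can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the function name `allocate_annotation_budget`, the representation of transferability as a score in [0, 1] per frame, the proportional allocation rule, and the example scores. The abstract does not say how transferability is estimated or how annotation effort is distributed, so this should be read as one possible reading, not as the authors' method.

```python
import math


def allocate_annotation_budget(transferability, budget):
    """Split an annotation budget across frames (hypothetical helper).

    transferability: dict mapping frame name -> score in [0, 1], where a
        low score means the frame is expected to transfer poorly.
        (Assumed representation; the article may define this differently.)
    budget: total number of target-language instances to annotate.

    Frames receive a share proportional to (1 - score), so poorly
    transferring frames get more annotation.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    for frame, score in transferability.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {frame!r} must lie in [0, 1]")

    frames = sorted(transferability)
    if not frames:
        return {}

    # Annotation "need" is the complement of the transferability score.
    need = {f: 1.0 - transferability[f] for f in frames}
    total = sum(need.values())
    if total <= 0:
        # Every frame is assumed to transfer perfectly: split evenly.
        need = {f: 1.0 for f in frames}
        total = float(len(frames))

    # Largest-remainder rounding keeps the allocations summing to `budget`.
    shares = {f: budget * need[f] / total for f in frames}
    allocation = {f: math.floor(shares[f]) for f in frames}
    leftover = budget - sum(allocation.values())
    by_remainder = sorted(
        frames, key=lambda f: shares[f] - allocation[f], reverse=True
    )
    for f in by_remainder[:leftover]:
        allocation[f] += 1
    return allocation


if __name__ == "__main__":
    # Illustrative scores only; these are not results from the article.
    scores = {"Commerce_buy": 0.9, "Motion": 0.5, "Judgment": 0.2}
    print(allocate_annotation_budget(scores, budget=100))
```

With these made-up scores, the frame with the lowest score receives the largest share of the budget and the allocations sum to the full budget. A proportional rule is only one option; a threshold that annotates only frames below some score would fit the abstract's description equally well.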