Word Transduction for Addressing the OOV Problem in Machine Translation for Similar Resource-Scarce Languages
Anssi Yli-Jyrä
Finite-State Methods and Natural Language Processing, 2017-09-01
DOI: https://doi.org/10.18653/v1/W17-4009
Citations: 7
Abstract
Wiktionary provides lexical information for an increasing number of languages, including morphological inflection tables. It is a good resource for automatically learning rule-based analysis of the inflectional morphology of a language. This paper extensively evaluates a method for extracting generalized paradigms from morphological inflection tables; the paradigms can be converted to weighted and unweighted finite transducers for morphological parsing and generation. The inflection tables of 55 languages from the English edition of Wiktionary are converted to such generalized paradigms, and the performance of the probabilistic parsers based on these paradigms is tested.