Morphisto - An Open Source Morphological Analyzer for German
Authors: Andrea Zielinski, Christian Simon
Published in: Finite-State Methods and Natural Language Processing, 2009-07-11
DOI: 10.3233/978-1-58603-975-2-224 (https://doi.org/10.3233/978-1-58603-975-2-224)
Citations: 45
Abstract
This paper presents the development of an open-source morphology tool for German that is integrated into a grid-based environment. Starting from the SFST-based SMOR tools (Schmid et al. [1]), we have implemented a minimal lexicon component that works in tandem with the morphological tool. Tests on a list of 30,000 high-frequency German words show that the recognition rate is comparable to that of other systems with even larger lexicons. Additional tools for the management of lexical data, and services built on top of the finite-state transducer, are also integrated as web services in the grid, so that all resources can be shared easily among lexicographers, linguists, and finite-state developers.