A Hybrid Model for Computational Morphology Application
Xu Yang, Wang Hou-feng
Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007), 2007-07-30
DOI: https://doi.org/10.1109/SNPD.2007.31
Citations: 2
Abstract
Computational morphology is a core component of many kinds of natural language processing, such as alignment techniques. This paper describes a method for morphological processing. A lemmatizer is built on both rules and statistical models to analyze English inflectional morphology and automatically derive the lemmas of words. The rule model incorporates data from various corpora, machine-readable dictionaries, and an empirical metamorphose rule set, while the statistical model mainly applies the maximum entropy principle to handle unknown words and ambiguous cases effectively. The knowledge used in the lemmatizer is easy to update, which supports the ongoing development of natural language processing. Experiments show that the lemmatizer has wide coverage and high accuracy.
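The abstract does not give the rule set, dictionaries, or feature design, so the following is only a minimal sketch of how a rule model and a maximum entropy model might be combined in one lemmatizer. Everything in it is an assumption made for illustration: the toy `EXCEPTIONS`, `LEXICON`, `RULES`, and `TRAIN` data, the suffix n-gram features, and the plain gradient-ascent training loop. The maximum entropy model is written here as a conditional log-linear model over the candidate rewrite rules, which is one common formulation and not necessarily the authors'.

```python
import math
from collections import defaultdict

# Hypothetical toy resources; the paper's real dictionaries and rules are not given.
EXCEPTIONS = {"went": "go", "mice": "mouse"}            # irregular forms
LEXICON = {"study", "make", "walk", "box", "hope", "hop", "take"}
RULES = [("ies", "y"), ("es", ""), ("s", ""),           # (suffix, replacement)
         ("ing", ""), ("ing", "e"), ("ed", ""), ("ed", "e")]
TRAIN = [("hoping", "hope"), ("making", "make"), ("walking", "walk"),
         ("boxes", "box"), ("makes", "make"), ("walked", "walk"),
         ("hoped", "hope"), ("studies", "study"), ("taking", "take")]

def candidates(word):
    """Rule model: apply every matching suffix rule; the word itself is a candidate."""
    out = [(word, "none")]
    for suf, rep in RULES:
        if word.endswith(suf) and len(word) > len(suf) + 1:
            out.append((word[:-len(suf)] + rep, f"{suf}>{rep}"))
    return out

def features(word, rule):
    """Conjoin a bias and word-final character n-grams with the rule label."""
    return [f"bias|{rule}"] + [f"{word[-n:]}|{rule}" for n in (2, 3, 4)]

def probs(weights, word, rules):
    """Softmax over the summed feature weights of each candidate rule."""
    scores = [sum(weights[f] for f in features(word, r)) for r in rules]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def train(pairs, epochs=200, lr=0.5):
    """Fit the log-linear model by stochastic gradient ascent on log-likelihood."""
    weights = defaultdict(float)
    data = []
    for word, lemma in pairs:
        cands = candidates(word)
        gold = [r for l, r in cands if l == lemma]
        if gold:  # keep only pairs that some rule can actually produce
            data.append((word, [r for _, r in cands], gold[0]))
    for _ in range(epochs):
        for word, rules, gold in data:
            for r, p in zip(rules, probs(weights, word, rules)):
                grad = (1.0 if r == gold else 0.0) - p  # observed minus expected
                for f in features(word, r):
                    weights[f] += lr * grad
    return weights

def lemmatize(word, weights):
    """Exceptions first, then lexicon-checked rules, then the statistical model."""
    word = word.lower()
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    cands = candidates(word)
    known = [(l, r) for l, r in cands if l in LEXICON]
    if len(known) == 1:           # the rules and lexicon agree on one lemma
        return known[0][0]
    pool = known or cands         # ambiguous case, or unknown word
    p = probs(weights, word, [r for _, r in pool])
    best = max(range(len(pool)), key=p.__getitem__)
    return pool[best][0]

WEIGHTS = train(TRAIN)

if __name__ == "__main__":
    for w in ["went", "studies", "walking", "hoping", "baking"]:
        print(w, "->", lemmatize(w, WEIGHTS))
```

In this sketch the statistical model is consulted only when the rule model cannot decide: either several candidates are in the lexicon (as with "hoping", where both "hop" and "hope" are listed) or none are (as with "baking"). Keeping the exceptions, lexicon, and rules as plain data structures reflects the abstract's point that the knowledge should be easy to update.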