Authors: Heyan Huang, Xiao-fei Zhang
Published in: 2009 2nd IEEE International Conference on Computer Science and Information Technology, 2009-09-11
DOI: 10.1109/ICCSIT.2009.5234787 (https://doi.org/10.1109/ICCSIT.2009.5234787)
Part-of-speech tagger based on maximum entropy model
Maximum entropy (ME) conditional models are not bound by the independence assumptions of generative Hidden Markov Models (HMMs). An ME-based part-of-speech (POS) tagger can therefore depend on arbitrary, non-independent features that benefit POS tagging, without having to model the distribution of those dependencies. Because ME models can flexibly use a wide variety of features, the sparseness problem of the training data is handled efficiently. Experiments show that, relative to the HMM-based baseline, the POS tagging error rate is reduced by 54.25% in the closed test and 40.56% in the open test, with accuracies of 98.01% in the closed test and 95.56% in the open test.
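To make the idea of overlapping, non-independent features concrete, the following is a minimal sketch of how a conditional ME model scores tags for one token. It is not the authors' implementation: the feature templates, the function names `extract_features` and `tag_distribution`, the tag set, and the hand-set weights are all hypothetical, and training (estimating the weights from data) is omitted. It only illustrates the standard conditional form p(t | context) = exp(sum of weights of active features for t) / Z(context).

```python
import math


def extract_features(words, i, prev_tag):
    """Return overlapping contextual features for the token at position i.

    The templates are illustrative assumptions, not the paper's feature set.
    Several of them describe the same word, so they are clearly not independent.
    """
    w = words[i]
    prev_word = words[i - 1].lower() if i > 0 else "<s>"
    return [
        f"word={w.lower()}",
        f"suffix2={w[-2:].lower()}",
        f"is_capitalized={w[0].isupper()}",
        f"prev_tag={prev_tag}",
        f"prev_word={prev_word}",
    ]


def tag_distribution(features, weights, tags):
    """Compute p(tag | context) as a normalized exponential of summed weights."""
    scores = {t: sum(weights.get((f, t), 0.0) for f in features) for t in tags}
    top = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exp_scores.values())  # normalizer Z(context)
    return {t: v / z for t, v in exp_scores.items()}


# Hand-set weights for illustration only; a real tagger would learn them.
example_tags = ["NOUN", "VERB", "ADV"]
example_weights = {
    ("word=quickly", "ADV"): 1.5,
    ("suffix2=ly", "ADV"): 2.0,
    ("prev_tag=VERB", "ADV"): 1.0,
    ("is_capitalized=False", "NOUN"): 0.2,
}

example_features = extract_features(["She", "ran", "quickly"], 2, "VERB")
example_dist = tag_distribution(example_features, example_weights, example_tags)
print(max(example_dist, key=example_dist.get))
```

In this sketch the word identity and its suffix both fire for the same token and simply add their weights. A generative HMM would instead need an explicit model of how such correlated observations are jointly distributed, which is the contrast the abstract draws.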