Authors: Xixin Wu, Zhiyong Wu, Jia Jia, Lianhong Cai
Venue: 2012 8th International Symposium on Chinese Spoken Language Processing
Published: 2012-12-01
DOI: 10.1109/ISCSLP.2012.6423495
Adaptive named entity recognition based on conditional random fields with automatic updated dynamic gazetteers
This paper presents a hybrid model that combines conditional random fields (CRFs) with dynamic gazetteers (DGs) for Chinese named entity recognition (NER). Gazetteers were widely used in previous NER work, but they were all static and could not adapt to new domains or to new out-of-vocabulary named entities (OOVNEs). In this work, we build and maintain DGs to address these problems and propose a method that automatically updates the DGs as named entities (NEs) are recognized. With this method, the DGs grow to include new NEs and NE features that are absent from the training data. These newly added items give the DGs knowledge of new domains, making the model more adaptive when recognizing OOVNEs in those domains. Experiments on the People's Daily corpus show that the method is effective, improving the average F-score by 1% to 2%.
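The abstract does not give the update rule, so the following is only a minimal sketch of the general idea of a gazetteer that is updated during recognition. Everything in it is an assumption made for illustration: the `DynamicGazetteer` class, the `min_count` promotion threshold, the `process` loop, and the `toy_recognize` function, which is a rule-based placeholder standing in for a trained CRF tagger.

```python
from collections import Counter


class DynamicGazetteer:
    """Toy gazetteer that grows as entities are recognized (hypothetical design)."""

    def __init__(self, seed_entities, min_count=2):
        # Seed entries are trusted immediately; new ones need min_count sightings.
        self.entries = set(seed_entities)
        self.pending = Counter()
        self.min_count = min_count

    def feature(self, token):
        # Binary lookup feature a CRF could consume alongside its other features.
        return {"in_gazetteer": token in self.entries}

    def update(self, recognized):
        # Promote a candidate only after it has been recognized often enough,
        # to limit the effect of one-off recognition errors (an assumed policy).
        for entity in recognized:
            if entity in self.entries:
                continue
            self.pending[entity] += 1
            if self.pending[entity] >= self.min_count:
                self.entries.add(entity)
                del self.pending[entity]


def toy_recognize(tokens, feats):
    # Placeholder for a trained CRF: accepts gazetteer hits and tokens ending in "市".
    return [t for t, f in zip(tokens, feats) if f["in_gazetteer"] or t.endswith("市")]


def process(sentences, recognize, gazetteer):
    """Recognize entities sentence by sentence, updating the gazetteer as we go."""
    results = []
    for tokens in sentences:
        feats = [gazetteer.feature(t) for t in tokens]
        entities = recognize(tokens, feats)
        gazetteer.update(entities)
        results.append(entities)
    return results


if __name__ == "__main__":
    gaz = DynamicGazetteer({"北京"})
    docs = [["我", "在", "深圳市"], ["深圳市", "很", "大"], ["深圳市", "和", "北京"]]
    print(process(docs, toy_recognize, gaz))
    print(sorted(gaz.entries))
```

In this sketch an unseen name enters the gazetteer only after it has been recognized `min_count` times, after which the lookup feature fires for it in later sentences. Whether the paper uses a threshold of this kind, or a different confidence criterion, is not stated in the abstract.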