F. Barcala, Jesús Vilares, Miguel A. Alonso, J. G. Gil, M. Ferro
{"title":"信息检索中的标记化和专有名词识别","authors":"F. Barcala, Jesús Vilares, Miguel A. Alonso, J. G. Gil, M. Ferro","doi":"10.1109/DEXA.2002.1045906","DOIUrl":null,"url":null,"abstract":"In this paper we consider a set of natural language processing techniques that can be used to analyze large amounts of texts, focusing on the advanced tokenizer which accounts for a number of complex linguistic phenomena, as well as for pre-tagging tasks such as proper noun recognition. We also show the results of several experiments performed in order to study the impact of the strategy chosen for the recognition of proper nouns.","PeriodicalId":254550,"journal":{"name":"Proceedings. 13th International Workshop on Database and Expert Systems Applications","volume":"106 2","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"30","resultStr":"{\"title\":\"Tokenization and proper noun recognition for information retrieval\",\"authors\":\"F. Barcala, Jesús Vilares, Miguel A. Alonso, J. G. Gil, M. Ferro\",\"doi\":\"10.1109/DEXA.2002.1045906\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper we consider a set of natural language processing techniques that can be used to analyze large amounts of texts, focusing on the advanced tokenizer which accounts for a number of complex linguistic phenomena, as well as for pre-tagging tasks such as proper noun recognition. We also show the results of several experiments performed in order to study the impact of the strategy chosen for the recognition of proper nouns.\",\"PeriodicalId\":254550,\"journal\":{\"name\":\"Proceedings. 13th International Workshop on Database and Expert Systems Applications\",\"volume\":\"106 2\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2002-09-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"30\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings. 13th International Workshop on Database and Expert Systems Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DEXA.2002.1045906\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. 13th International Workshop on Database and Expert Systems Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DEXA.2002.1045906","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Tokenization and proper noun recognition for information retrieval
In this paper we consider a set of natural language processing techniques that can be used to analyze large amounts of texts, focusing on the advanced tokenizer which accounts for a number of complex linguistic phenomena, as well as for pre-tagging tasks such as proper noun recognition. We also show the results of several experiments performed in order to study the impact of the strategy chosen for the recognition of proper nouns.