A New Language Model Combining Single and Compound Terms
Authors: Arezki Hammache, R. Ahmed-Ouamer, M. Boughanem
Venue: 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology
Published: 2011-08-22
DOI: 10.1109/WI-IAT.2011.52
Most traditional information retrieval (IR) systems are based on single-term indexing. However, it is widely accepted that the semantic content of a document (or a query) cannot be accurately captured by a simple set of independent keywords. Although several works have incorporated phrases or other syntactic information into IR, such attempts have shown only slight benefit, at best. In language modeling approaches in particular, this is achieved through the use of bigram or n-gram models. However, in these models all bigrams/n-grams are considered and weighted uniformly. In this paper we introduce a new approach that considers and weights only certain types of n-grams, namely "compound terms". Experimental results on three test collections showed an improvement.
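The general idea of scoring with single terms plus a restricted set of n-grams can be illustrated with a minimal sketch. It is not the authors' model: the abstract gives no formula, so the Dirichlet smoothing, the linear interpolation weight `lam`, the add-one background estimate, and the hand-supplied `compounds` set of bigrams are all assumptions chosen only for illustration.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return adjacent token pairs, e.g. ['a', 'b', 'c'] -> [('a', 'b'), ('b', 'c')]."""
    return list(zip(tokens, tokens[1:]))


def log_likelihood(units, doc_units, coll_counts, mu):
    """Sum of log Dirichlet-smoothed probabilities of `units` in one document."""
    doc_counts = Counter(doc_units)
    coll_total = sum(coll_counts.values())
    vocab = len(coll_counts)
    total = 0.0
    for u in units:
        # Assumed add-one background estimate; keeps the probability positive.
        p_coll = (coll_counts.get(u, 0) + 1) / (coll_total + vocab + 1)
        p = (doc_counts.get(u, 0) + mu * p_coll) / (len(doc_units) + mu)
        total += math.log(p)
    return total


def score(query, doc, collection, compounds, lam=0.7, mu=10.0):
    """Interpolate a single-term score with a score over selected bigrams only.

    `compounds` is a hypothetical, externally supplied set of bigrams treated
    as compound terms; all other bigrams are ignored rather than weighted.
    """
    term_counts = Counter(t for d in collection for t in d)
    pair_counts = Counter(
        b for d in collection for b in bigrams(d) if b in compounds
    )
    query_pairs = [b for b in bigrams(query) if b in compounds]
    doc_pairs = [b for b in bigrams(doc) if b in compounds]
    single = log_likelihood(query, doc, term_counts, mu)
    compound = log_likelihood(query_pairs, doc_pairs, pair_counts, mu)
    return lam * single + (1 - lam) * compound
```

With two documents that contain the same words in a different order, the single-term component is identical, so any difference in score comes from the compound-term component alone. Setting `lam=1.0` reduces the sketch to a plain unigram query-likelihood model.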