Approximate search engine optimization for directory service
Kai-Hsiang Yang, Chi-Chien Pan, Tzao-Lin Lee
Proceedings International Parallel and Distributed Processing Symposium, 2003-04-22. DOI: 10.1109/IPDPS.2003.1213439
In many practical e-commerce systems, the stored data are mostly short strings such as names, addresses, and similar fields. Searching within short strings is a different problem from searching within long ones. General-purpose search engines aim to scan long strings (or whole articles) quickly and report the positions that match the search conditions. Several well-known online search algorithms (such as "agrep" as used inside glimpse, "cgrep" as used inside compressed indices, and "NR-grep") have been proposed for searching without any index in sub-linear time. For short strings (small n), however, linear-time and sub-linear-time algorithms perform much the same in practice, so suitable indices are necessary to optimize the performance of the search engine. At the same time, directory services are becoming more and more important because they are optimized for searching data, and the data stored in directory servers are almost all short strings. An approximate search engine for a directory service must therefore take the properties of short strings into consideration. In our previous research, we designed an approximate search engine specifically for short strings: it uses filters to select the candidate short strings and then checks those candidates for the answers. However, the performance of that search engine needs to be enhanced. In this paper, we propose a new architecture and algorithm to optimize search performance for directory services.
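The filter-then-check idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which filters or index the authors use, so everything here is an assumption made for illustration: the q-gram count filter, the length filter, the edit-distance check, and the names `build_index` and `approx_search` are not taken from the paper.

```python
from collections import Counter, defaultdict


def qgram_counts(s, q=2):
    """Count the overlapping q-grams of s (a multiset)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def build_index(strings, q=2):
    """Hypothetical index: q-gram -> {string id: occurrences in that string}."""
    index = defaultdict(dict)
    for sid, s in enumerate(strings):
        for gram, n in qgram_counts(s, q).items():
            index[gram][sid] = n
    return index


def edit_distance(a, b):
    """Levenshtein distance by row-wise dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def approx_search(query, strings, index, k=1, q=2):
    """Return the strings within edit distance k of query."""
    # Filter step: count q-grams shared with each indexed string.
    shared = Counter()
    for gram, n in qgram_counts(query, q).items():
        for sid, m in index.get(gram, {}).items():
            shared[sid] += min(n, m)

    results = []
    for sid, s in enumerate(strings):
        # Length filter: k edits change the length by at most k.
        if abs(len(s) - len(query)) > k:
            continue
        # Count filter: k edits destroy at most k*q of the q-grams.
        needed = max(len(s), len(query)) - q + 1 - k * q
        if shared[sid] < needed:
            continue
        # Check step: verify the surviving candidate exactly.
        if edit_distance(query, s) <= k:
            results.append(s)
    return results


names = ["john smith", "jon smith", "jane smyth", "joan smithe", "alice wong"]
index = build_index(names)
print(approx_search("john smith", names, index, k=1))
# prints ['john smith', 'jon smith']
```

For very short strings the count threshold `needed` drops to zero or below, so the filter prunes nothing and every string of similar length goes on to the exact check. This is one way to see why short strings call for index designs different from those used for long texts.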