{"title":"基于使用和领域知识的语义丰富关键词预取","authors":"Sonia Setia;Jyoti;Neelam Duhan;Aman Anand;Nikita Verma","doi":"10.13052/jwe1540-9589.2332","DOIUrl":null,"url":null,"abstract":"In intelligent web systems [2], web prefetching [27] plays a crucial role. In order to make accurate predictions for web prefetching, it is important but challenging to uncover valuable information from web use statistics [16]. Using statistics and domain expertise, this study presents a new approach dubbed SPUDK for efficient prefetching. In this paper, it is shown how web access logs can be used efficiently for browsing prediction. Our main focus is on the technique needed to manage the queries found in web access logs so that valuable information can be attained. We further process these access logs using a taxonomy and a thesaurus, WordNet, to find the semantics of queries. SPUDK, a system that organises use data into semantic clusters, is one example of this approach. Our contributions in this paper are as follows: (1) A technique to exploit query keywords from access logs. (2) An approach to enrich queries with semantic information. (3) A new similarity measure for finding similarity among URLs present in access logs. (4) A novel clustering technique to find semantic clusters of URLs. (5) Experimental evaluation of the proposed system. The proposed SPUDK system is evaluated using American Online (AOL) logs, which gives improvement of 39% in precision of prediction, 35% in hit ratio and reduction of 50.6% in latency on average as compared to other prediction techniques in the literature.","PeriodicalId":49952,"journal":{"name":"Journal of Web Engineering","volume":"23 3","pages":"341-375"},"PeriodicalIF":0.7000,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10547277","citationCount":"0","resultStr":"{\"title\":\"Semantically Enriched Keyword Prefetching Based on Usage and Domain Knowledge\",\"authors\":\"Sonia Setia;Jyoti;Neelam Duhan;Aman Anand;Nikita Verma\",\"doi\":\"10.13052/jwe1540-9589.2332\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In intelligent web systems [2], web prefetching [27] plays a crucial role. In order to make accurate predictions for web prefetching, it is important but challenging to uncover valuable information from web use statistics [16]. Using statistics and domain expertise, this study presents a new approach dubbed SPUDK for efficient prefetching. In this paper, it is shown how web access logs can be used efficiently for browsing prediction. Our main focus is on the technique needed to manage the queries found in web access logs so that valuable information can be attained. We further process these access logs using a taxonomy and a thesaurus, WordNet, to find the semantics of queries. SPUDK, a system that organises use data into semantic clusters, is one example of this approach. Our contributions in this paper are as follows: (1) A technique to exploit query keywords from access logs. (2) An approach to enrich queries with semantic information. (3) A new similarity measure for finding similarity among URLs present in access logs. (4) A novel clustering technique to find semantic clusters of URLs. (5) Experimental evaluation of the proposed system. The proposed SPUDK system is evaluated using American Online (AOL) logs, which gives improvement of 39% in precision of prediction, 35% in hit ratio and reduction of 50.6% in latency on average as compared to other prediction techniques in the literature.\",\"PeriodicalId\":49952,\"journal\":{\"name\":\"Journal of Web Engineering\",\"volume\":\"23 3\",\"pages\":\"341-375\"},\"PeriodicalIF\":0.7000,\"publicationDate\":\"2024-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10547277\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Web Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10547277/\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Web Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10547277/","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Semantically Enriched Keyword Prefetching Based on Usage and Domain Knowledge
In intelligent web systems [2], web prefetching [27] plays a crucial role. In order to make accurate predictions for web prefetching, it is important but challenging to uncover valuable information from web use statistics [16]. Using statistics and domain expertise, this study presents a new approach dubbed SPUDK for efficient prefetching. In this paper, it is shown how web access logs can be used efficiently for browsing prediction. Our main focus is on the technique needed to manage the queries found in web access logs so that valuable information can be attained. We further process these access logs using a taxonomy and a thesaurus, WordNet, to find the semantics of queries. SPUDK, a system that organises use data into semantic clusters, is one example of this approach. Our contributions in this paper are as follows: (1) A technique to exploit query keywords from access logs. (2) An approach to enrich queries with semantic information. (3) A new similarity measure for finding similarity among URLs present in access logs. (4) A novel clustering technique to find semantic clusters of URLs. (5) Experimental evaluation of the proposed system. The proposed SPUDK system is evaluated using American Online (AOL) logs, which gives improvement of 39% in precision of prediction, 35% in hit ratio and reduction of 50.6% in latency on average as compared to other prediction techniques in the literature.
期刊介绍:
The World Wide Web and its associated technologies have become a major implementation and delivery platform for a large variety of applications, ranging from simple institutional information Web sites to sophisticated supply-chain management systems, financial applications, e-government, distance learning, and entertainment, among others. Such applications, in addition to their intrinsic functionality, also exhibit the more complex behavior of distributed applications.