Deep Web Data Source Classification Based on Query Interface Context

2012 Fourth International Conference on Computational and Information Sciences. Pub Date: 2012-08-17. DOI: 10.1109/ICCIS.2012.117

Zilu Cui, Yuchen Fu


Citations: 5

Abstract

As the volume of information in the Deep Web grows, this paper presents a Deep Web data source classification algorithm based on query interface context. Two methods are combined to obtain the similarity between search interfaces. The first is based on the vector space model and uses classical TF-IDF statistics to measure the similarity between search interfaces. The second uses HowNet to compute the semantic similarity between two pages. A WDB classification algorithm is then built on the K-NN algorithm. Experimental results show that the algorithm generates high-quality clusters, as measured by both entropy and F-measure, which indicates its practical value.
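The pipeline outlined in the abstract (TF-IDF similarity, a second semantic similarity, and K-NN classification) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the two similarities are combined, so the linear weight `alpha` is an assumption, as is the smoothed IDF formula. HowNet is not used here: `semantic_similarity` is a hypothetical stand-in based on Jaccard overlap of interface terms, and the sample interfaces and labels are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, so that terms
    occurring in every document still carry some weight.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def semantic_similarity(a, b):
    """Hypothetical stand-in for the HowNet-based similarity.

    Uses Jaccard overlap of the two term sets; a real system would
    compare word senses through HowNet instead.
    """
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def knn_classify(query, labeled, k=3, alpha=0.5):
    """Label a query interface by a similarity-weighted vote of its k nearest neighbours.

    `labeled` is a list of (tokens, label) pairs. `alpha` weights the
    TF-IDF cosine against the semantic score (an assumed combination rule).
    """
    docs = [tokens for tokens, _ in labeled] + [query]
    vectors = tfidf_vectors(docs)
    query_vec = vectors[-1]
    scored = []
    for (tokens, label), vec in zip(labeled, vectors):
        sim = (alpha * cosine(query_vec, vec)
               + (1 - alpha) * semantic_similarity(query, tokens))
        scored.append((sim, label))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = defaultdict(float)
    for sim, label in scored[:k]:
        votes[label] += sim
    return max(votes, key=votes.get)


# Invented query-interface terms for two domains.
LABELED = [
    (["title", "author", "isbn", "publisher"], "books"),
    (["author", "title", "keyword", "price"], "books"),
    (["departure", "arrival", "date", "passengers"], "flights"),
    (["from", "to", "departure", "date", "airline"], "flights"),
]

if __name__ == "__main__":
    print(knn_classify(["title", "author", "price"], LABELED))
```

Weighting each neighbour's vote by its similarity, rather than counting votes equally, keeps a dissimilar interface that happens to fall within the top `k` from swaying the result. This is a design choice of the sketch, not something stated in the abstract.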
