对大型数据图进行信息完整和无冗余的关键字搜索

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI:10.1145/2396761.2398712

Byron J. Gao, Zhumin Chen, Qi Kang

{"title":"对大型数据图进行信息完整和无冗余的关键字搜索","authors":"Byron J. Gao, Zhumin Chen, Qi Kang","doi":"10.1145/2396761.2398712","DOIUrl":null,"url":null,"abstract":"Keyword search over graphs has a wide array of applications in querying structured, semi-structured and unstructured data. Existing models typically use minimal trees or bounded subgraphs as query answers. While such models emphasize relevancy, they would suffer from incompleteness of information and redundancy among answers, making it difficult for users to effectively explore query answers. To overcome these drawbacks, we propose a novel cluster-based model, where query answers are relevancy-connected clusters. A cluster is a subgraph induced from a maximal set of relevancy-connected nodes. Such clusters are coherent and relevant, yet complete and redundancy free. They can be of arbitrary shape in contrast to the sphere-shaped bounded subgraphs in existing models. We also propose an efficient search algorithm and a corresponding graph index for large, disk-resident data graphs.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Information-complete and redundancy-free keyword search over large data graphs\",\"authors\":\"Byron J. Gao, Zhumin Chen, Qi Kang\",\"doi\":\"10.1145/2396761.2398712\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Keyword search over graphs has a wide array of applications in querying structured, semi-structured and unstructured data. Existing models typically use minimal trees or bounded subgraphs as query answers. While such models emphasize relevancy, they would suffer from incompleteness of information and redundancy among answers, making it difficult for users to effectively explore query answers. To overcome these drawbacks, we propose a novel cluster-based model, where query answers are relevancy-connected clusters. A cluster is a subgraph induced from a maximal set of relevancy-connected nodes. Such clusters are coherent and relevant, yet complete and redundancy free. They can be of arbitrary shape in contrast to the sphere-shaped bounded subgraphs in existing models. We also propose an efficient search algorithm and a corresponding graph index for large, disk-resident data graphs.\",\"PeriodicalId\":313414,\"journal\":{\"name\":\"Proceedings of the 21st ACM international conference on Information and knowledge management\",\"volume\":\"40 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-10-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 21st ACM international conference on Information and knowledge management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2396761.2398712\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 21st ACM international conference on Information and knowledge management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2396761.2398712","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

图上的关键字搜索在查询结构化、半结构化和非结构化数据方面有着广泛的应用。现有模型通常使用最小树或有界子图作为查询答案。虽然这种模型强调相关性，但会存在答案之间信息不完整和冗余的问题，用户难以有效地探索查询答案。为了克服这些缺点，我们提出了一种新的基于聚类的模型，其中查询答案是关联连接的聚类。聚类是由关联连接节点的最大集合产生的子图。这样的集群是连贯和相关的，但完整和无冗余。它们可以是任意形状，而不是现有模型中的球形有界子图。我们还提出了一种高效的搜索算法和相应的图索引，用于大型磁盘驻留数据图。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Information-complete and redundancy-free keyword search over large data graphs

Keyword search over graphs has a wide array of applications in querying structured, semi-structured and unstructured data. Existing models typically use minimal trees or bounded subgraphs as query answers. While such models emphasize relevancy, they would suffer from incompleteness of information and redundancy among answers, making it difficult for users to effectively explore query answers. To overcome these drawbacks, we propose a novel cluster-based model, where query answers are relevancy-connected clusters. A cluster is a subgraph induced from a maximal set of relevancy-connected nodes. Such clusters are coherent and relevant, yet complete and redundancy free. They can be of arbitrary shape in contrast to the sphere-shaped bounded subgraphs in existing models. We also propose an efficient search algorithm and a corresponding graph index for large, disk-resident data graphs.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 21st ACM international conference on Information and knowledge management

自引率

0.00%

发文量