Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.): latest articles

Unraveling the duplicate-elimination problem in XML-to-SQL query translation
R. Krishnamurthy, R. Kaushik, J. Naughton
{"title":"Unraveling the duplicate-elimination problem in XML-to-SQL query translation","authors":"R. Krishnamurthy, R. Kaushik, J. Naughton","doi":"10.1145/1017074.1017088","DOIUrl":"https://doi.org/10.1145/1017074.1017088","url":null,"abstract":"We consider the scenario where existing relational data is exported as XML. In this context, we look at the problem of translating XML queries into SQL. XML query languages have two different notions of duplicates: node-identity based and value-based. Path expression queries have an implicit node-identity based duplicate elimination built into them. On the other hand, SQL only supports value-based duplicate elimination. In this paper, using a simple path expression query we illustrate the problems that arise when we attempt to simulate the node-identity based duplicate elimination using value-based duplicate elimination in the SQL queries. We show how a general solution for this problem covering the class of views considered in published literature requires a fairly complex mechanism.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76922060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Scalable dissemination: what's hot and what's not
J. Beaver, Nicholas Morsillo, K. Pruhs, Panos K. Chrysanthis, V. Liberatore
{"title":"Scalable dissemination: what's hot and what's not","authors":"J. Beaver, Nicholas Morsillo, K. Pruhs, Panos K. Chrysanthis, V. Liberatore","doi":"10.1145/1017074.1017084","DOIUrl":"https://doi.org/10.1145/1017074.1017084","url":null,"abstract":"A major problem in web database applications and on the Internet in general is the scalable delivery of data. One proposed solution for this problem is a hybrid system that uses multicast push to scalably deliver the most popular data, and reserves traditional unicast pull for delivery of less popular data. However, such a hybrid scheme introduces a variety of data management problems at the server. In this paper we examine three of these problems: the push popularity problem, the document classification problem, and the bandwidth division problem. The push popularity problem is to estimate the popularity of the documents in the web site. The document classification problem is to determine which documents should be pushed and which documents must be pulled. The bandwidth division problem is to determine how much of the server bandwidth to devote to pushed documents and how much of the server bandwidth should be reserved for pulled documents. We propose simple and elegant solutions for these problems. We report on experiments with our system that validate our algorithms.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81781715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Visualizing and discovering web navigational patterns
Jiyang Chen, Lisheng Sun, Osmar R Zaiane, R. Goebel
{"title":"Visualizing and discovering web navigational patterns","authors":"Jiyang Chen, Lisheng Sun, Osmar R Zaiane, R. Goebel","doi":"10.1145/1017074.1017079","DOIUrl":"https://doi.org/10.1145/1017074.1017079","url":null,"abstract":"Web site structures are complex to analyze. Cross-referencing the web structure with navigational behaviour adds to the complexity of the analysis. However, this convoluted analysis is necessary to discover useful patterns and understand the navigational behaviour of web site visitors, whether to improve web site structures, provide intelligent on-line tools or offer support to human decision makers. Moreover, interactive investigation of web access logs is often desired since it allows ad hoc discovery and examination of patterns not a priori known. Various visualization tools have been provided for this task but they often lack the functionality to conveniently generate new patterns. In this paper we propose a visualization tool to visualize web graphs, representations of web structure overlaid with information and pattern tiers. We also propose a web graph algebra to manipulate and combine web graphs and their layers in order to discover new patterns in an ad hoc manner.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86102695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 47
Best-match querying from document-centric XML
J. Kamps, Maarten Marx, M. de Rijke, Börkur Sigurbjörnsson
{"title":"Best-match querying from document-centric XML","authors":"J. Kamps, maarten marx, M. de Rijke, Börkur Sigurbjörnsson","doi":"10.1145/1017074.1017089","DOIUrl":"https://doi.org/10.1145/1017074.1017089","url":null,"abstract":"On the Web, there is a pervasive use of XML to give lightweight semantics to textual collections. Such document-centric XML collections require a query language that can gracefully handle structural constraints as well as constraints on the free text of the documents. Our main contributions are three-fold. First, we outline two fragments of XPath tailored to users that have varying degrees of understanding of the XML structure used, and give both syntactic and semantic characterizations of these fragments. Second, we extend XPath with an about function having a best-match semantics based on the relevance of the document component for the expressed information need. Third, we evaluate the resulting query language using the INEX 2003 test suite, and show that best-match approaches outperform exact-match approaches for evaluating content-and-structure queries.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75119231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Querying bi-level information
S. Murthy, D. Maier, L. Delcambre
{"title":"Querying bi-level information","authors":"S. Murthy, D. Maier, L. Delcambre","doi":"10.1145/1017074.1017078","DOIUrl":"https://doi.org/10.1145/1017074.1017078","url":null,"abstract":"In our research on superimposed information management, we have developed applications where information elements in the superimposed layer serve to annotate, comment, restructure, and combine selections from one or more existing documents in the base layer. Base documents tend to be unstructured or semi-structured (HTML pages, Excel spreadsheets, and so on) with marks delimiting selections. Selections in the base layer can be programmatically accessed via marks to retrieve content and context. The applications we have built to date allow creation of new marks and new superimposed elements (that use marks), but they have been browse-oriented and tend to expose the line between superimposed and base layers. Here, we present a new access capability, called bi-level queries, that allows an application or user to query over both layers as a whole. Bi-level queries provide an alternative style of data integration where only relevant portions of a base document are mediated (not the whole document) and the superimposed layer can add information not present in the base layer. We discuss our framework for superimposed information management, an initial implementation of a bi-level query system with an XML Query interface, and suggest mechanisms to improve scalability and performance.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73996505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
DTDs versus XML schema: a practical study
G. Bex, F. Neven, J. V. D. Bussche
{"title":"DTDs versus XML schema: a practical study","authors":"G. Bex, F. Neven, J. V. D. Bussche","doi":"10.1145/1017074.1017095","DOIUrl":"https://doi.org/10.1145/1017074.1017095","url":null,"abstract":"Among the various proposals answering the shortcomings of Document Type Definitions (DTDs), XML Schema is the most widely used. Although DTDs and XML Schema Definitions (XSDs) differ syntactically, they are still quite related on an abstract level. Indeed, freed from all syntactic sugar, XML Schemas can be seen as an extension of DTDs with a restricted form of specialization. In the present paper, we inspect a number of DTDs and XSDs harvested from the web and try to answer the following questions: (1) which of the extra features/expressiveness of XML Schema not allowed by DTDs are effectively used in practice; and, (2) how sophisticated are the structural properties (i.e. the nature of regular expressions) of the two formalisms. It turns out that at present real-world XSDs only sparingly use the new features introduced by XML Schema: on a structural level the vast majority of them can already be defined by DTDs. Further, we introduce a class of simple regular expressions and obtain that a surprisingly high fraction of the content models belong to this class. The latter result sheds light on the justification of simplifying assumptions that sometimes have to be made in XML research.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81163356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 180
Semantic multicast for content-based stream dissemination
Olga Papaemmanouil, U. Çetintemel
{"title":"Semantic multicast for content-based stream dissemination","authors":"Olga Papaemmanouil, U. Çetintemel","doi":"10.1145/1017074.1017085","DOIUrl":"https://doi.org/10.1145/1017074.1017085","url":null,"abstract":"We consider the problem of content-based routing and dissemination of highly-distributed, fast data streams from multiple sources to multiple receivers. Our target application domain includes real-time, stream-based monitoring applications and large-scale event dissemination. We introduce SemCast, a new semantic multicast approach that, unlike previous approaches, eliminates the need for content-based forwarding at interior brokers and facilitates fine-grained control over the construction of dissemination overlays. We present the initial design of SemCast and provide an outline of the architectural and algorithmic challenges as well as our initial solutions. Preliminary experimental results show that SemCast can significantly reduce overall bandwidth requirements compared to traditional event-dissemination approaches.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76776563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
Checking potential validity of XML documents
I. Iacob, Alex Dekhtyar, M. Dekhtyar
{"title":"Checking potential validity of XML documents","authors":"I. Iacob, Alex Dekhtyar, M. Dekhtyar","doi":"10.1145/1017074.1017097","DOIUrl":"https://doi.org/10.1145/1017074.1017097","url":null,"abstract":"The process of creation of document-centric XML documents often starts with a prepared textual content, into which the editor introduces markup. In such situations, intermediate XML is almost never valid with respect to the DTD/Schema used for the encoding. At the same time, it is important to ensure that at each moment of time, the editor is working with an XML document that can be enriched with further markup to become valid. In this paper we introduce the notion of potential validity of XML documents, which allows us to distinguish between XML documents that are invalid because the encoding is simply incomplete and XML documents that are invalid because some of the DTD rules guiding the structure of the encoding were violated during the markup process. We give a linear-time algorithm for checking potential validity for documents.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80683901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Challenges in selecting paths for navigational queries: trade-off of benefit of path versus cost of plan
Maria-Esther Vidal, L. Raschid, Julián Mestre
{"title":"Challenges in selecting paths for navigational queries: trade-off of benefit of path versus cost of plan","authors":"Maria-Esther Vidal, L. Raschid, Julián Mestre","doi":"10.1145/1017074.1017091","DOIUrl":"https://doi.org/10.1145/1017074.1017091","url":null,"abstract":"Life sciences sources are characterized by a complex graph of overlapping sources, and multiple alternate links between sources. A (navigational) query may be answered by traversing multiple alternate paths between a start source and a target source. Each of these paths may have dissimilar benefit, e.g., the cardinality of result objects that are reached in the target source. Paths may also have dissimilar costs of evaluation, i.e., the execution cost of a query evaluation plan for a path. In prior research, we developed ESearch, an algorithm based on a Deterministic Finite Automaton (DFA), which exhaustively enumerates all paths to answer a navigational query. The challenge is to develop heuristics that improve on the exhaustive ESearch solution and identify good utility functions that can rank the sources, the links between sources, and the sub-paths that are already visited, in order to quickly produce paths that have the highest benefit and the least cost. In this paper, we present a heuristic that uses local utility functions to rank sources, using either the benefit attributed to the source, the cost of a plan using the source, or both. The heuristic will limit its search to some Top XX% of the ranked sources. To compare ESearch and the heuristic, we construct a Pareto surface of all dominant solutions produced by ESearch, with respect to benefit and cost. We choose the Top 25% of the ESearch solutions that are in the Pareto surface. We compare the paths produced by the heuristic to this Top 25% of ESearch solutions with respect to precision and recall. This motivates the need for further research on developing a more efficient algorithm and better utility functions.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74769226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
One torus to rule them all: multi-dimensional queries in P2P systems
Prasanna Ganesan, Beverly Yang, H. Garcia-Molina
{"title":"One torus to rule them all: multi-dimensional queries in P2P systems","authors":"Prasanna Ganesan, Beverly Yang, H. Garcia-Molina","doi":"10.1145/1017074.1017081","DOIUrl":"https://doi.org/10.1145/1017074.1017081","url":null,"abstract":"Peer-to-peer systems enable access to data spread over an extremely large number of machines. Most P2P systems support only simple lookup queries. However, many new applications, such as P2P photo sharing and massively multi-player games, would benefit greatly from support for multidimensional range queries. We show how such queries may be supported in a P2P system by adapting traditional spatial-database technologies with novel P2P routing networks and load-balancing algorithms. We show how to adapt two popular spatial-database solutions - kd-trees and space-filling curves - and experimentally compare their effectiveness.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79209748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 270