Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.): Latest Articles

Exploring Pros and Cons of Ranked Entities with COMPETE

Pub Date: 2018-06-15; DOI: 10.1145/3214708.3214709

Kiril Panev, S. Michel

Citations: 2

Discovery and Creation of Rich Entities for Knowledge Bases

Pub Date: 2018-06-15; DOI: 10.1145/3214708.3214712

A. Quamar, Fatma Özcan, Konstantinos Xirogiannopoulos

Abstract: Businesses and professional organizations from a variety of domains, such as finance, weather, healthcare, and social networks, produce massive amounts of unstructured, semi-structured, and structured data. Knowledge bases enable querying and analysis of integrated content derived from such data, available as open, third-party, and proprietary data sets. Many knowledge bases today provide an entity-centric view over the integrated content by using domain-specific ontologies. These entity-centric views enable querying individual real-world entities, as well as exploring exact information (such as the address or net revenue of a company) through explicit querying using languages such as SQL or SPARQL. Although very useful for many business and commercial applications, this may not be sufficient for exploring the relevant, context-specific information associated with real-world entities stored in these knowledge bases. Users often need to resort to a manual and tedious process of exploration using ad hoc queries to gather the required information. To enhance user experience and ameliorate the problem of relevant data exploration, we propose the concept of Rich Entities. Rich entities comprise all the relevant and context-specific information grouped around real-world entities, and are served as efficient and meaningful responses to user queries against these entities in a knowledge base. They are created by grouping together information not only from a single entity represented as an ontology concept, but also from related concepts and properties as specified by the domain ontology.

In this paper we propose several novel techniques and algorithms to automatically detect, learn, and create domain-specific rich entities. We use inputs from query patterns in existing query workloads against knowledge bases, and leverage the structure and relationships between entities defined in the domain ontology. Our techniques are very effective and can be applied to a wide variety of application domains, thus adding great value to data exploration and information extraction from entity-centric real-world knowledge bases.

Citations: 3

Exploring Genomic Datasets: from Batch to Interactive and Back

Pub Date: 2018-06-15; DOI: 10.1145/3214708.3214710

Luca Nanni, Pietro Pinoli, Arif Canakoglu, S. Ceri

Citations: 6

Recommendations for Explorations based on Graphs

Pub Date: 2018-06-15; DOI: 10.1145/3214708.3214713

Marialena Kyriakidi, G. Koutrika, Y. Ioannidis

Citations: 0

Strategies for Detection of Correlated Data Streams

Pub Date: 2018-06-15; DOI: 10.1145/3214708.3214714

Rakan Alseghayer, Daniel Petrov, Panos K. Chrysanthis

Abstract: There is an increasing demand for real-time analysis of large volumes of data streams that are produced at high velocity. The most recent data needs to be processed within a specified delay target for the analysis to lead to actionable results. In this paper we present an effective solution for the analysis of such data streams, based on a three-fold approach that combines (1) incremental sliding-window computation of aggregates, to avoid unnecessary recomputations; (2) intelligent scheduling of computation steps and operations, driven by a utility function within a micro-batch; and (3) an exploration strategy that tunes the utility function. Specifically, we propose eight strategies that explore correlated pairs of live data streams across consecutive micro-batches. Our experimental evaluation on a real dataset shows that some strategies are better suited to identifying high numbers of correlated pairs of live data streams already known from previous micro-batches, while others are better suited to identifying previously unseen pairs of live data streams across consecutive micro-batches.
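
The first component named in the abstract above, incremental sliding-window computation of aggregates, can be illustrated with a small sketch. This is not the authors' implementation: the class name `SlidingCorrelation`, its methods, and the choice of Pearson correlation as the aggregate are assumptions made for illustration. The idea shown is only that running sums are updated in constant time per sample as the window slides, rather than recomputed over the whole window.

```python
from collections import deque
from math import sqrt


class SlidingCorrelation:
    """Pearson correlation of two streams over the last `window` samples.

    Hypothetical helper for illustration: running sums are updated in O(1)
    per new sample instead of being recomputed over the whole window.
    """

    def __init__(self, window: int) -> None:
        if window < 2:
            raise ValueError("window must be at least 2")
        self.window = window
        self.samples = deque()  # (x, y) pairs currently inside the window
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def push(self, x: float, y: float) -> None:
        # Add the newest sample to the running sums.
        self.samples.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y
        # Evict the oldest sample once the window is exceeded.
        if len(self.samples) > self.window:
            ox, oy = self.samples.popleft()
            self.sx -= ox
            self.sy -= oy
            self.sxx -= ox * ox
            self.syy -= oy * oy
            self.sxy -= ox * oy

    def correlation(self):
        """Return Pearson's r for the current window, or None if undefined."""
        n = len(self.samples)
        if n < 2:
            return None
        cov = n * self.sxy - self.sx * self.sy
        var_x = n * self.sxx - self.sx * self.sx
        var_y = n * self.syy - self.sy * self.sy
        if var_x <= 0 or var_y <= 0:
            return None  # a constant stream has no defined correlation
        return cov / sqrt(var_x * var_y)
```

Running sums of this kind can lose floating-point precision on long streams with large values, so a production version would likely need a more numerically stable update; that refinement is outside this sketch.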

Citations: 3

Any-k Algorithms for Exploratory Analysis with Conjunctive Queries

Pub Date: 2018-06-01; DOI: 10.1145/3214708.3214711

Xiaofeng Yang, Mirek Riedewald, Rundong Li, Wolfgang Gatterbauer

Citations: 7

Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web

Pub Date: 2016-06-26; DOI: 10.1145/3214708

Senjuti Basu Roy, K. Stefanidis, G. Koutrika, Mirek Riedewald, L. Lakshmanan

Citations: 0

Supporting Range Queries on Web Data Using k-Nearest Neighbor Search

Pub Date: 2007-11-28; DOI: 10.1007/978-3-540-76925-5_5

Wan D. Bae, Shayma Alkobaisi, S. H. Kim, Sada Narayanappa, C. Shahabi

Citations: 11

Spam, damn spam, and statistics: using statistical analysis to locate spam web pages

Pub Date: 2004-06-17; DOI: 10.1145/1017074.1017077

Dennis Fetterly, M. Manasse, Marc Najork

Abstract: The increasing importance of search engines to commercial web sites has given rise to a phenomenon we call "web spam", that is, web pages that exist only to mislead search engines into (mis)leading users to certain web sites. Web spam is a nuisance to users as well as search engines: users have a harder time finding the information they need, and search engines have to cope with an inflated corpus, which in turn causes their cost per query to increase. Therefore, search engines have a strong incentive to weed out spam web pages from their index.

We propose that some spam web pages can be identified through statistical analysis: certain classes of spam pages, in particular those that are machine-generated, diverge in some of their properties from the properties of web pages at large. We have examined a variety of such properties, including linkage structure, page content, and page evolution, and have found that outliers in the statistical distribution of these properties are highly likely to be caused by web spam.

This paper describes the properties we have examined, gives the statistical distributions we have observed, and shows which kinds of outliers are highly correlated with web spam.

Pages: 1-6

Citations: 350

Unraveling the duplicate-elimination problem in XML-to-SQL query translation

Pub Date: 2004-06-17; DOI: 10.1145/1017074.1017088

R. Krishnamurthy, R. Kaushik, J. Naughton

Citations: 6