Proceedings of the 2016 on SIGMOD'16 PhD Symposium: Latest Articles

What I Wish I Knew When I Finished my PhD
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929896
A. Halevy
Abstract: You're about to finish your Ph.D. and are looking forward to a bright career. You might have some plans for what that career will look like, but the truth is, you're about to embark on a fascinating journey you know little about. You think that in 5 or 10 years you'll be all set, but actually, careers take interesting twists at many stages. In this talk I will share a few of the lessons I learned in the first 20+ years of my journey.
Citations: 0
Temporal Data Exchange
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929900
Ladan Golshanara, J. Chomicki, W. Tan
Abstract: In this work, we study data exchange for temporal data. There are two views associated with temporal data: the concrete temporal view, which depicts how temporal data is compactly represented and on which implementations are based, and the abstract temporal view, which defines the semantics of temporal data. Based on the chase procedure, which is a fundamental tool in relational data exchange, two new kinds of chase are proposed in this paper: the abstract chase for the abstract temporal view and the concrete chase for the concrete temporal view. While labeled nulls are sufficient for relational data exchange, they have to be refined in temporal data exchange to keep the connection between the result produced by the concrete chase and the result of the abstract chase. We show that the concrete chase respects the semantics defined by the abstract chase and provides a foundation for query answering.
Citations: 8
Probabilistic Evaluation of Expressive Queries on Bounded-Treewidth Instances
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929905
Mikaël Monet
Abstract: Though data uncertainty naturally appears in many real-life situations, traditional database theory and systems tend to assume that the data is reliable and complete. The reason is one of complexity and performance: on arbitrary relational database instances annotated with probabilities, performing exact probabilistic query evaluation is hard. However, a criterion on the shape of the database has been shown in recent work to be sufficient and in some sense necessary for the tractability of this task. Databases whose treewidth is bounded by a constant k are exactly those that can be tractably queried, with respect to quantitative uncertainty estimation. But this is a data complexity result that does not take into account the cost in terms of the query or of k; in many cases, this cost is too high for real-world applications. The aim of our PhD research is to study in which circumstances the overall complexity of probabilistic query evaluation can become tractable, aiming at both theoretical and practical results.
Citations: 7
TrailMarker: Automatic Mining of Geographical Complex Sequences
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929903
Takato Honda
Abstract: Given a huge collection of vehicle sensor data consisting of d sensors for w trajectories of duration n, which are accompanied by geographical information, how can we find patterns, rules and outliers? How can we efficiently and effectively find typical patterns and points of variation? In this paper we present TRAILMARKER, a fully automatic mining algorithm for geographical complex sequences. Our method has the following properties: (a) effective: it finds important patterns and outliers in real datasets; (b) scalable: it is linear with respect to the data size; (c) parameter-free: it is fully automatic, and requires no prior training and no parameter tuning. Extensive experiments on real data demonstrate that TRAILMARKER finds interesting and unexpected patterns and groups accurately. In fact, TRAILMARKER consistently outperforms the best state-of-the-art methods in terms of both accuracy and execution speed.
Citations: 2
Techniques and Systems for Large Dynamic Graphs
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929897
Khaled Ammar
Abstract: Many applications regularly generate large graph data. Many of these graphs change dynamically, and analysis techniques for static graphs are not suitable in these cases. This thesis proposes an architecture to process and analyze dynamic graphs. It is based on a new computation model called Grab'n Fix. The architecture includes a novel distributed graph storage layer to support dynamic graph processing. These proposals were inspired by an extensive quantitative and qualitative analysis of existing graph analytics platforms.
Citations: 5
Scalable Microblogs Data Management
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929898
A. Magdy
Abstract: Microblogs, e.g., tweets, reviews, or comments on news websites and social media, have become so popular among web users that many applications are exploiting them for different types of analysis. The distinguishing characteristics of microblogs have motivated a lot of research on managing such data. However, the technology developed for microblogs still consists of scattered efforts, which leads to several data management gaps that limit end-to-end support for microblog-centric applications. Our research aims to provide a holistic system approach to managing microblogs data, so that whoever builds new functionality on microblogs can seamlessly exploit a single data management system to power their applications. In this paper, we present a full proposal for Kite, the first holistic system that provides end-to-end management for microblogs data. Kite aims to fill the gap in existing systems to support scalable queries with selective search criteria on data that arrives at high velocity and adds up to large volumes (billions of records). To this end, the system is going to exploit and extend the infrastructure of the Apache Spark system. Throughout the paper, we present a roadmap of the accomplished contributions, ongoing contributions towards the first-cut realization of Kite, and future contributions to iteratively improve the system's maturity and capabilities.
Citations: 1
Understanding User Behavior From Online Traces
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929901
Elad Kravi
Abstract: People nowadays share large amounts of data online, explicitly or implicitly. Analysis of such data can detect useful behavior patterns of varying natures and scales, from mass immigration between continents to trendy venues in a city. Detecting these patterns can be used for improving online services. However, capturing behavior patterns may be challenging, since such patterns are often of a specialized nature, no benchmark or labeled data exists, and it is not even clear how to formulate them to enable computation. Moreover, it is often unclear how recognition of these patterns can be translated into concrete service improvement. We analyzed major datasets of three common types of online traces: microblogging, social networking, and web search. We detected online behavior patterns and utilized them toward novel services and improvement of traditional services. In this paper we describe our studies and findings, and offer a vision for future development.
Citations: 1
Non-linear Time-series Analysis of Social Influence
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929902
Thinh Minh Do, Yasuko Matsubara, Yasushi Sakurai
Abstract: In this paper, we present Δ-SPOT, a non-linear model for analysing large-scale web search data, and its fitting algorithm. Δ-SPOT can forecast long-range future dynamics of the keywords/queries. We use the Google Search, Twitter and MemeTracker data sets for extensive experiments, which show that our method outperforms other non-linear mining methods. We also provide an online algorithm that addresses the need to monitor multiple co-evolving data sequences.
Citations: 1
Query Answering over Complete Data with Conceptual Constraints
Proceedings of the 2016 on SIGMOD'16 PhD Symposium Pub Date: 2016-06-14 DOI: 10.1145/2926693.2929899
Nhung Ngo
Abstract: Query answering over databases with conceptual constraints is an important problem in database theory. To deal with the problem, the ontology-based data access approach uses ontologies to capture both constraints and databases. In this approach, databases are considered under the open-world assumption, which creates many issues, including the necessity of restricting to only positive queries and the failure of query composition. In our research, we focus on a combined approach that allows data in databases to stay complete, as under the closed-world assumption, while the knowledge provided by conceptual constraints can be incomplete. We first study the complexity of the query answering problem under description logic constraints in the presence of complete data and show that complete data makes query answering harder than query answering over incomplete data only. We then provide a query rewriting technique that supports deciding the existence of a safe-range first-order equivalent reformulation of a query in terms of the database schema, and if so, it provides an effective approach to construct the reformulation. Since the reformulation is a safe-range formula, it is effectively executable as an SQL query. At the end, we study the definability abduction problem, which aims to characterize the least committing extensions of conceptual constraints to gain the exact rewritability of queries. We also apply this idea to data exchange, where we want to characterize the case of lossless transformations of data.
Citations: 0