PhD '12 latest publications - Book学术

Linking records in dynamic world

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213612

Pei Li

Citations: 2

Holistic indexing: offline, online and adaptive indexing in the same kernel

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213604

E. Petraki

Abstract: Proper physical design is a momentous issue for the performance of modern database systems and applications. Nowadays, a growing number of applications require the execution of dynamic and exploratory workloads with unpredictable characteristics that change over time, e.g., social networks, scientific databases and multimedia databases. In addition, as most modern applications move to the big data era, investing time and resources in building the wrong set of indexes over large collections of data can severely affect performance. Offline, online and adaptive indexing are three distinct approaches to the problem of automating the physical design choices. Offline indexing is best in static environments with stable workloads. Online indexing is best in relatively dynamic environments where the query workload can be monitored. Adaptive indexing is best in fully dynamic environments where no idle time or workload knowledge may be assumed. We observe that these three approaches are complementary, while none of them can satisfy the needs of modern applications in isolation. We envision a new index selection approach, holistic indexing, that surpasses its predecessors by combining the best features of offline, online and adaptive indexing while overcoming their weaknesses. The main goal is the creation of a database kernel that can autonomously create partial indexes which are continuously refined during query processing, as in adaptive indexing, while the system continuously detects any opportunity to improve the physical design offline; whenever any idle time occurs it tries to exploit knowledge gathered during query processing to refine existing indexes further or create new ones. We sketch the research space and the new challenges such a direction brings.
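
The abstract describes partial indexes that are refined as a side effect of query processing. The sketch below illustrates that general idea in the style of database cracking. It is a minimal illustration under stated assumptions, not the author's kernel: the class `CrackedColumn` and its methods are hypothetical names, and the data lives in a plain Python list.

```python
import bisect


class CrackedColumn:
    """Hypothetical adaptive index: a copy of a column that every range
    query partitions a little further (database-cracking style)."""

    def __init__(self, values):
        self.data = list(values)  # cracker column, reordered in place
        self.pivots = []          # sorted pivot values seen so far
        self.positions = []       # positions[i]: first index holding a value >= pivots[i]

    def _crack(self, pivot):
        """Place all values < pivot before all values >= pivot and
        return the position where the two groups meet."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked on this value
        # Only the piece between the neighbouring pivots has to be touched.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        split = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return all values v with low <= v < high; the index is refined
        as a side effect of answering the query."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


column = CrackedColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3])
first = column.range_query(5, 13)   # cracks the column on 5 and 13
second = column.range_query(1, 4)   # only touches the piece holding values below 5
```

Each query pays only for partitioning the pieces its bounds fall into, so frequently queried ranges become finely partitioned while untouched ranges stay unindexed. In the direction the abstract sketches, a system could also spend idle time on extra refinement steps; in this sketch that would amount to calling `_crack` on pivots chosen by the system, which is an assumption of this illustration and not a description of the proposed kernel.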

Citations: 1

Clustering techniques for open relation extraction

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213607

F. Mesquita

Citations: 11

An adaptive event stream processing environment

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213613

Samujjwal Bhandari

Citations: 4

High performance spatial query processing for large scale scientific data

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213603

Ablimit Aji, Fusheng Wang

Abstract: Analyzing and querying large volumes of spatially derived data from scientific experiments has posed major challenges in the past decade. For example, the systematic analysis of imaged pathology specimens results in rich spatially derived information with GIS characteristics at cellular and sub-cellular scales, with nearly a million derived markups and a hundred million features per image. This provides critical information for evaluation of experimental results, support of biomedical studies and pathology image based diagnosis. However, the vast amount of spatially oriented morphological information poses major challenges for analytical medical imaging. The major challenges I attack include: i) How can we provide cost effective, scalable spatial query support for medical imaging GIS? ii) How can we provide fast response queries on analytical imaging data to support biomedical research and clinical diagnosis? and iii) How can we provide expressive queries to support spatial queries and spatial pattern discoveries for end users? In my thesis, I work towards developing a MapReduce based framework MIGIS to support expressive, cost effective and high performance spatial queries. The framework includes a real-time spatial query engine RESQUE consisting of a variety of optimized access methods, boundary and density aware spatial data partitioning, a declarative query language interface, a query translator which automates translation of the spatial queries into MapReduce programs and an execution engine which parallelizes and executes queries on Hadoop. Our preliminary experiments demonstrate that MIGIS is a cost effective architecture which achieves high performance spatial query execution. MIGIS is extensible and can be adapted to support similar complex spatial queries for large scale spatial data in other scientific domains.
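
The abstract mentions spatial data partitioning and the translation of spatial queries into MapReduce programs. The sketch below shows how a partition-based spatial join can be phrased as a map step and a reduce step. It is a plain-Python simulation under stated assumptions, not the MIGIS or RESQUE interface and not the Hadoop API: the fixed grid with edge `TILE`, the dataset labels "A" and "B", and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain, product

TILE = 10.0  # assumed edge length of a grid tile, in arbitrary units


def tiles_for(box):
    """Return the grid tiles (tx, ty) overlapped by box = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    xs = range(int(xmin // TILE), int(xmax // TILE) + 1)
    ys = range(int(ymin // TILE), int(ymax // TILE) + 1)
    return product(xs, ys)


def intersects(a, b):
    """True if two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def map_phase(dataset, boxes):
    """Map step: emit each box once per tile it overlaps, keyed by tile."""
    for box_id, box in boxes.items():
        for tile in tiles_for(box):
            yield tile, (dataset, box_id, box)


def reduce_phase(records):
    """Reduce step: join the boxes of dataset A and dataset B inside one tile."""
    left = [(i, b) for name, i, b in records if name == "A"]
    right = [(i, b) for name, i, b in records if name == "B"]
    for (id_a, box_a), (id_b, box_b) in product(left, right):
        if intersects(box_a, box_b):
            yield id_a, id_b


def spatial_join(boxes_a, boxes_b):
    """Simulate shuffle plus reduce in memory and return the intersecting pairs."""
    groups = defaultdict(list)
    for tile, record in chain(map_phase("A", boxes_a), map_phase("B", boxes_b)):
        groups[tile].append(record)
    pairs = set()  # a pair found in several tiles is reported once
    for records in groups.values():
        pairs.update(reduce_phase(records))
    return sorted(pairs)


markups_a = {"a1": (1, 1, 4, 4), "a2": (8, 8, 12, 12), "a3": (30, 30, 31, 31)}
markups_b = {"b1": (3, 3, 9, 9), "b2": (11, 11, 15, 15), "b3": (50, 50, 51, 51)}
joined = spatial_join(markups_a, markups_b)
```

Boxes that cross tile boundaries are replicated to every tile they touch, so the same pair can be found in more than one tile; collecting results in a set removes those duplicates. A uniform grid is the simplest possible partitioning and ignores data density, whereas the abstract refers to boundary and density aware partitioning.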

Citations: 23

Efficient optimization and processing for distributed monitoring and control applications

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213615

Mengmeng Liu

Citations: 4

RecDB: towards DBMS support for online recommender systems

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213608

Mohamed Sarwat

Citations: 3

Foundational aspects of semantic web optimization

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213611

Sebastian Skritek

Citations: 0

Data quality and integration in collaborative environments

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213606

Gregor Endler

Abstract: The trend to merge medical practices into cooperatively operating networks and organizational units like Medical Supply Centers generates new challenges for adequate IT support. In particular, new use cases for common economic planning, controlling and treatment coordination arise. This requires consolidation of data originating from heterogeneous and autonomous software systems. Heterogeneity and autonomy are core reasons for low data quality. The intuitive approach of initially integrating heterogeneous systems into a federated system creates a very high upfront effort before the system can become operable and does not adequately consider the fact that data quality requirements might change over time. To remedy this, we propose an approach for continuous data quality improvement which enables a demand-driven, step-by-step system integration. By adapting the generic Total Data Quality Management process to healthcare specific use cases, we are developing an extended model for continuous data quality management in cooperative healthcare settings. The IT tools which are needed to provide the information that drives this process are currently in development within a government supported project involving both industry and academia.

Citations: 2

Towards an extensible efficient event processing kernel

PhD '12 Pub Date : 2012-05-20 DOI: 10.1145/2213598.2213602

Mohammad Sadoghi

Abstract: The efficient processing of large collections of patterns (Boolean expressions, XPath queries, or continuous SQL queries) over data streams plays a central role in major data intensive applications ranging from user-centric processing and personalization to real-time data analysis. On the one hand, emerging user-centric applications, including computational advertising and selective information dissemination, demand determining and presenting to an end-user only the most relevant content that is both user-consumable and suitable for limited screen real estate of target (mobile) devices. We achieve these user-centric requirements through novel high-dimensional indexing structures and (parallel) algorithms. On the other hand, applications in real-time data analysis, including computational finance and intrusion detection, demand meeting stringent subsecond processing requirements and providing high-frequency and low-latency event processing over data streams. We achieve real-time data analysis requirements by leveraging reconfigurable hardware -- FPGAs -- to sustain line-rate processing by exploiting unprecedented degrees of parallelism and potential for pipelining, only available through custom-built, application-specific, and low-level logic design. Finally, we conduct a comprehensive evaluation to demonstrate the superiority of our proposed techniques in comparison with state-of-the-art algorithms designed for event processing.
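
The abstract concerns matching large collections of Boolean expressions against a stream of events. The sketch below shows the classic counting-based approach over an inverted index, restricted to conjunctions of equality predicates. It is a minimal illustration of the problem setting, not the author's indexing structure or FPGA design: the class `SubscriptionIndex`, its methods, and the example attributes are hypothetical.

```python
from collections import defaultdict


class SubscriptionIndex:
    """Hypothetical matcher for conjunctions of equality predicates,
    using an inverted index and per-event hit counting."""

    def __init__(self):
        self.postings = defaultdict(list)  # (attribute, value) -> subscription ids
        self.sizes = {}                    # subscription id -> number of predicates

    def add(self, sub_id, predicates):
        """Register a subscription given as {attribute: required value}."""
        if not predicates:
            raise ValueError("a subscription needs at least one predicate")
        self.sizes[sub_id] = len(predicates)
        for attribute, value in predicates.items():
            self.postings[(attribute, value)].append(sub_id)

    def match(self, event):
        """Return the ids of all subscriptions whose predicates the event satisfies."""
        hits = defaultdict(int)
        for pair in event.items():
            # Only subscriptions sharing a predicate with the event are visited.
            for sub_id in self.postings.get(pair, ()):
                hits[sub_id] += 1
        return sorted(s for s, count in hits.items() if count == self.sizes[s])


index = SubscriptionIndex()
index.add("s1", {"category": "shoes", "city": "Toronto"})
index.add("s2", {"category": "shoes"})
index.add("s3", {"category": "books", "city": "Toronto"})
matched = index.match({"category": "shoes", "city": "Toronto", "device": "mobile"})
```

A subscription matches when the number of its predicates satisfied by the event equals its total predicate count, so the cost of a match depends on the postings the event touches and not on the total number of registered subscriptions. Range predicates, disjunctions, and relevance ranking are outside the scope of this sketch.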

Citations: 3