{"title":"Incorporating a secure coprocessor in the database-as-a-service model","authors":"Einar Mykletun, G. Tsudik","doi":"10.1109/IWIA.2005.28","DOIUrl":"https://doi.org/10.1109/IWIA.2005.28","url":null,"abstract":"In this paper, we suggest an extension to the database-as-a-service (DAS) model that introduces a secure coprocessor (SC) at an untrusted database service provider in order to overcome drawbacks in the plain DAS model. The processor serves as a neutral party between the clients and service providers with the goal of increasing the security of outsourced data. Additionally, it supports a much broader range of queries and reduces both the bandwidth and computational burdens on the client. We expect these improvements to make the DAS model more viable and attractive from a client's perspective.","PeriodicalId":103456,"journal":{"name":"Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'05)","volume":"103 45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116227484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the use of bit filters in shared nothing partitioned systems","authors":"J. Aguilar-Saborit, V. Muntés-Mulero, C. Zuzarte, H. Pereyra, J. Larriba-Pey","doi":"10.1109/IWIA.2005.34","DOIUrl":"https://doi.org/10.1109/IWIA.2005.34","url":null,"abstract":"Parallel query processing is at the core of many business analysis environments. Such applications place high demands on the computer hardware to achieve results in reasonable times, especially when queries are launched against huge amounts of warehouse data. We look into the problem of parallel query processing on large data sets, focusing on a rational use of the network and memory resources. In this context, we propose a new protocol to make use of bit filters in parallel shared-nothing systems for non-collocated joins. We call our protocol remote bit filters with requests (RBF_R). We have implemented a prototype of RBF_R for the first time in a major commercial database, IBM® DB2 Universal Database™ (DB2 UDB). RBF_R has two important advantages over the previous usage of bit filters in the same context. First, it reduces the amount of memory used compared to previous solutions. This allows for the processing of more or larger queries. Second, the protocol itself has an insignificant impact on communication. This means that it is as efficient as the previous strategies, avoiding the saturation of the network in environments with intensive parallel network usage.","PeriodicalId":103456,"journal":{"name":"Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'05)","volume":"161 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116637570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding and comparing the performance of optimized JVMs","authors":"D. Nicolaescu, A. Veidenbaum","doi":"10.1109/IWIA.2005.44","DOIUrl":"https://doi.org/10.1109/IWIA.2005.44","url":null,"abstract":"Java virtual machines have different performance characteristics depending on their interpretation and just-in-time compilation strategies. These characteristics are even more complex when running on a modern out-of-order superscalar processor. This paper analyzes the behavior of the SPECjvm98 benchmarks on IBM's JikesRVM Java virtual machine executing on the IBM Power4 processor. Execution time parameters such as the number of instructions and cycles, the behavior of instruction and data caches, and the branching characteristics obtained from hardware performance counters are used to explain performance differences between interpreted, JIT-compiled and dynamically optimized JVMs. Our goal is to understand benchmark and processor behavior with different JIT optimization options and strategies and to use this knowledge in the design of future JVMs. The results show that the reduction in the number of executed instructions due to compiler optimizations is the main reason for improved performance. An increase in instruction-level parallelism (ILP) in compiled code provides further improvement. The increased ILP is in large part due to the elimination of dependences in the optimized code.","PeriodicalId":103456,"journal":{"name":"Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'05)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122833903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Steering and forwarding techniques for reducing memory communication on a clustered microarchitecture","authors":"H. Irie, N. Hattori, M. Takada, N. Hatta, T. Toyoshima, S. Sakai","doi":"10.1109/IWIA.2005.41","DOIUrl":"https://doi.org/10.1109/IWIA.2005.41","url":null,"abstract":"In a clustered microarchitecture design, the execution core, which has large RAMs, large CAMs and fully connected result bypass loops, is partitioned into smaller execution cores called clusters. A clustered microarchitecture allows a scalable core design because intra-cluster operation remains fast regardless of the overall execution width of the core. However, localization of critical memory transfers (store-load-consumer) is still a problem. In this work, we propose a technique named \"distributed speculative memory forwarding (DSMF)\" that localizes critical memory transfers into a cluster. DSMF learns memory dependences at the retire stage, steers each dependent pair of store and consumer to the same cluster, and transfers the data locally within the cluster. We show that this localization yields an IPC improvement of 15% on the baseline clustered microarchitecture.","PeriodicalId":103456,"journal":{"name":"Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'05)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130475770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal loop-unrolling mechanisms and architectural extensions for an energy-efficient design of shared register files in MPSoCs","authors":"J. Ayala, David Atienza Alonso, M. López-Vallejo, J. Mendias, R. Hermida, C. López-Barrio","doi":"10.1109/IWIA.2005.35","DOIUrl":"https://doi.org/10.1109/IWIA.2005.35","url":null,"abstract":"In this paper, we introduce a new hardware/software approach to reduce the energy of the shared register file in upcoming embedded architectures with several VLIW processors. This paper includes a set of architectural extensions and special loop unrolling techniques for the compilers of MPSoC platforms. This complete hardware/software support reduces the energy consumed in the register file of MPSoC architectures by up to 60% without introducing performance penalties.","PeriodicalId":103456,"journal":{"name":"Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'05)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128458229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The bimode++ branch predictor","authors":"Kenji Kise, T. Katagiri, H. Honda, T. Yuba","doi":"10.1109/IWIA.2005.43","DOIUrl":"https://doi.org/10.1109/IWIA.2005.43","url":null,"abstract":"Modern wide-issue superscalar processors tend to adopt deeper pipelines in order to attain high clock rates. This trend increases the number of on-the-fly instructions in processors, and a mispredicted branch can result in a substantial amount of wasted work. To mitigate this wasted work, accurate branch prediction is required for high-performance processors. To improve prediction accuracy, we propose the bimode++ branch predictor, an enhanced version of the bimode branch predictor. Some branch instructions have the same result every time they execute, from the start to the end of a program. These branches are defined as extremely biased branches. The bimode++ branch predictor is unique in predicting the outcome of an extremely biased branch with a simple hardware structure. In addition, the bimode++ branch predictor improves accuracy using refined indexing and a fusion function. Our experimental results with benchmarks from SpecFP, SpecINT, and the multimedia and server areas show that the bimode++ branch predictor can reduce the misprediction rate by 13.2% relative to the bimode predictor and by 32.5% relative to the gshare predictor.","PeriodicalId":103456,"journal":{"name":"Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'05)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133256555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}