Assessment of cloud-based health monitoring using Homomorphic Encryption
Övünç Kocabas, T. Soyata, J. Couderc, M. Aktas, J. Xia, Michael C. Huang
2013 IEEE 31st International Conference on Computer Design (ICCD). DOI: https://doi.org/10.1109/ICCD.2013.6657078

Abstract: Current financial and regulatory pressure has provided strong incentives to institute better disease prevention and improved patient monitoring, and to push U.S. healthcare into the digital era. This transition requires that data privacy be ensured for digital health data in three distinct phases: I. acquisition, II. storage, and III. computation. Each phase comes with unique implementation and privacy challenges. While the privacy of the data can be ensured with existing AES encryption techniques in phases I (acquisition) and II (storage), enabling healthcare organizations to take advantage of cloud computing resources such as Amazon Web Services also requires that data privacy be preserved in phase III (computation). Currently, no system exists that enables direct computation in the cloud while assuring data privacy. Fully Homomorphic Encryption (FHE) is an emerging cryptographic technique that permits computation on encrypted data directly in the cloud, without bringing the data back to the computational node. However, this promising technique comes with significant performance- and storage-related challenges. While it will take years before true FHE is mainstream, we provide a feasibility study of its application to a simple long-term patient ECG-data monitoring system.
Bayesian theory oriented Optimal Data-Provider Selection for CMP
Guohong Li, Zhenyu Liu, Sanchuan Guo, Chongmin Li, Dongsheng Wang
2013 IEEE 31st International Conference on Computer Design (ICCD). DOI: https://doi.org/10.1109/ICCD.2013.6657050

Abstract: With the number of cores and the working sets of parallel workloads soaring, shared L2 caches exhibit fewer misses than private L2 caches by making better use of all the available cache capacity. However, shared L2 caches incur higher overall L1 miss latencies because of the longer average distance between the requestor and the home node, and potential congestion at some nodes. We observe that there is a high probability that the data requested by an L1 miss resides in a neighboring node's L1 cache. In such cases, the long-distance access to the home node can potentially be avoided. To leverage this property, we propose Bayesian theory oriented Optimal Data-Provider Selection (ODPS). ODPS partitions the multi-core into clusters of 2×2 nodes and introduces the Proximity Data Prober (PDP) to detect whether an L1 miss can be served by an L1 cache within the same cluster. Furthermore, we devise the Bayesian Decision Classifier (BDC) to intelligently and adaptively select a remote L2 cache or a neighboring L1 node as the data provider, according to the minimal miss cost derived from Bayesian decision theory.
Accelerator-rich CMPs: From concept to real hardware
Yu-Ting Chen, J. Cong, M. Ghodrat, Muhuan Huang, Chunyue Liu, Bingjun Xiao, Yi Zou
2013 IEEE 31st International Conference on Computer Design (ICCD). DOI: https://doi.org/10.1109/ICCD.2013.6657039

Abstract: Application-specific accelerators provide a 10-100× improvement in power efficiency over general-purpose processors, and accelerator-rich architectures are especially promising. This work discusses a prototype of accelerator-rich CMPs (PARC). During our development of PARC in real hardware, we encountered a set of technical challenges and proposed corresponding solutions. First, we provided system IPs that serve a sea of accelerators by transferring data between user space and accelerator memories without cache overhead. Second, we designed a dedicated interconnect between accelerators and memories to enable memory sharing. Third, we implemented an accelerator manager to virtualize accelerator resources for users. Finally, we developed an automated flow with a number of IP templates and customizable interfaces to a C-based synthesis flow to enable rapid design and update of PARC. We implemented PARC on a Virtex-6 FPGA chip, integrating platform-specific peripherals and booting unmodified Linux. Experimental results show that PARC can fully exploit the energy benefits of accelerators with little system overhead.
{"title":"Performance-controllable shared cache architecture for multi-core soft real-time systems","authors":"Myoungjun Lee, Soontae Kim","doi":"10.1109/ICCD.2013.6657097","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657097","url":null,"abstract":"Multi-core processors with shared L2 caches can improve performance and integrate several functions of real-time systems on a single chip. However, tasks running on different cores increase interferences in the shared L2 cache, resulting in more deadline misses and, consequently, worse quality of real-time tasks. This is mainly because of the blind sharing of the L2 cache by multiple tasks running on different cores.We propose a novel performance-controllable shared L2 cache architecture that can alleviate these problems. First, our proposed L2 cache architecture is made to be aware of instructions/data belonging to real-time tasks by adding a real-time indication bit to each L2 cache block. Second, it can control the performance of real-time tasks and non-real-time tasks. Our experimental results show that our proposed L2 cache architecture reduces more deadline misses of real-time tasks than the conventional L2 cache architecture and partitioning schemes.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128318348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selecting critical implications with set-covering formulation for SAT-based Bounded Model Checking","authors":"Mahmoud Elbayoumi, M. Hsiao, Mustafa ElNainay","doi":"10.1109/ICCD.2013.6657070","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657070","url":null,"abstract":"The effectiveness of SAT-based Bounded Model Checking (BMC) critically relies on the deductive power of the BMC instance. Although implication relationships have been used to help SAT solver to make more deductions, frequently an excessive number of implications has been used. Too many such implications can result in a large number of clauses that could potentially degrade the underlying SAT solver performance. In this paper, we first propose a framework for a parallel deduction engine to reduce implication learning time. Secondly, we propose a novel set-cover technique for optimal selection of constraint clauses. This technique depends on maximizing the number of literals that can be deduced by the SAT solver during the BCP (Boolean Constraint Propagation) operation. Our parallel deduction engine can achieve a 5.7× speedup on a 36-core machine. In addition, by selecting only those critical implications, our strategy improves BMC by another 1.74× against the case where all extended implications were added to the BMC instance. Compared with the original BMC without any implication clauses, up to 55.32× speedup can be achieved.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121628197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Power capping of CPU-GPU heterogeneous systems through coordinating DVFS and task mapping
T. Komoda, S. Hayashi, Takashi Nakada, Shinobu Miwa, Hiroshi Nakamura
2013 IEEE 31st International Conference on Computer Design (ICCD). DOI: https://doi.org/10.1109/ICCD.2013.6657064

Abstract: Future computer systems will be built under much more stringent power budgets due to the limitations of power delivery and cooling systems, so sophisticated power management techniques are required. Power capping limits the power consumption of a system to a predetermined level and has been studied extensively for homogeneous systems; however, few studies have addressed power capping of CPU-GPU heterogeneous systems. In this paper, we propose an efficient power capping technique that coordinates DVFS and task mapping on a single computing node equipped with GPUs. In CPU-GPU heterogeneous systems, device frequency settings must be considered together with task mapping between the CPUs and the GPUs, because frequency scaling can incur load imbalance between them. To guide the DVFS and task-mapping settings so as to avoid power violations and load imbalance, we develop new empirical models of the performance and the maximum power consumption of a CPU-GPU heterogeneous system. The models enable us to choose near-optimal device frequencies and task mappings in advance of the application execution. We evaluate the proposed technique with five data-parallel applications on a machine equipped with a single CPU and a single GPU. The experimental results show that the performance achieved by the proposed power capping technique is comparable to the ideal one.
{"title":"A global router on GPU architecture","authors":"Yiding Han, Koushik Chakraborty, Sanghamitra Roy","doi":"10.1109/ICCD.2013.6657028","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657028","url":null,"abstract":"In the modern VLSI design flow, global router is often utilized to provide fast and accurate congestion analysis for upstream processes to improve the design routability. Global routing parallelization is a good candidate to speedup its runtime performance while delivering very competitive solution quality. In this paper, we first study the cause of insufficient exploitable concurrency of the existing net level concurrency model, which has become a major bottleneck for parallelizing the emerging design problems. Then, we mitigate this limitation with a novel fine grain parallel model, with which a GPU based multi-agent global router is designed. Our experimental results indicate that the parallel model can effectively support the GPU based global router, and deliver stable solutions. The runtime comparison with NCTUgr2 has shown that upto 3.9× speedup is achieved by the GPU based router.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132229646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Noise-based algorithms for functional equivalence and tautology checking","authors":"P. Lin, S. Khatri","doi":"10.1109/ICCD.2013.6657048","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657048","url":null,"abstract":"In this paper, we present noise-based algorithms for functional equivalence and tautology checking using noise-based logic (NBL). A key property of NBL is that literals are represented by independent noise sources, from which we can construct noise-based cubes, and superpositions of such noise-based cubes, to create a noise-based Boolean function on a single wire. In our algorithms, the Boolean sum-of-products (SOP) formula is expressed in NBL as a superposition of its minterms. This noise-based representation of the SOP can then be compared with that of another SOP formula for equivalence checking (or with the noise-based formula representing tautology, for tautology checking) using a single operation. We validate our approach using software simulation.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133593852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quipu: High-performance simulation of quantum circuits using stabilizer frames","authors":"Héctor J. García, I. Markov","doi":"10.1109/ICCD.2013.6657072","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657072","url":null,"abstract":"As quantum information processing gains traction, its simulation becomes increasingly significant for engineering purposes - evaluation, testing and optimization - as well as for theoretical research. Generic quantum-circuit simulation appears intractable for conventional computers. However, Gottesman and Knill identified an important subclass, called stabilizer circuits, which can be simulated efficiently using group-theory techniques. Practical circuits enriched with quantum error-correcting codes and fault-tolerant procedures are dominated by stabilizer subcircuits and contain a relatively small number of non-stabilizer components. Therefore, we develop new group-theory data structures and algorithms to simulate such circuits. Stabilizer frames offer more compact storage than previous approaches but requires more sophisticated bookkeeping. Our implementation, called Quipu, simulates certain quantum arithmetic circuits (e.g., ripple-carry adders) in polynomial time and space for equal superpositions of n-qubits. On such instances, known linear-algebraic simulation techniques, such as the (state-of-the-art) BDD-based simulator QuIDDPro, take exponential time. We simulate various quantum Fourier transform and quantum fault-tolerant circuits with Quipu, and the results demonstrate that our stabilizer-based technique outperforms QuIDDPro in all cases.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115801177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Increasing GPU throughput using kernel interleaved thread block scheduling","authors":"Mihir Awatramani, Joseph Zambreno, D. Rover","doi":"10.1109/ICCD.2013.6657093","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657093","url":null,"abstract":"The number of active threads required to achieve peak application throughput on graphics processing units (GPUs) depends largely on the ratio of time spent on computation to the time spent accessing data from memory. While compute-intensive applications can achieve peak throughput with a low number of threads, memory-intensive applications might not achieve good throughput even at the maximum supported thread count. In this paper, we study the effects of scheduling work from multiple applications on the same GPU core. We claim that interleaving workload from different applications on a GPU core can improve the utilization of computational units and reduce the load on memory subsystem. Experiments on 17 application pairs from the Rodinia benchmark suite show that overall throughput increases by 7% on average.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114664338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}