2013 IEEE 31st International Conference on Computer Design (ICCD): Latest Papers

Assessment of cloud-based health monitoring using Homomorphic Encryption
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657078
Övünç Kocabas, T. Soyata, J. Couderc, M. Aktas, J. Xia, Michael C. Huang
{"title":"Assessment of cloud-based health monitoring using Homomorphic Encryption","authors":"Övünç Kocabas, T. Soyata, J. Couderc, M. Aktas, J. Xia, Michael C. Huang","doi":"10.1109/ICCD.2013.6657078","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657078","url":null,"abstract":"Current financial and regulatory pressure has provided strong incentives to institute better disease prevention, improved patient monitoring, and push U.S. healthcare into the digital era. This transition requires that data privacy be ensured for digital health data in three distinct phases: I. acquisition, II. storage, and III. computation. Each phase comes with unique challenges in terms of proper implementation and privacy. While the privacy of the data can be ensured with existing AES encryption techniques in phases I (acquisition) and II (storage), to enable healthcare organizations to take advantage of cloud computing using resources such as Amazon Web Services, phase III (computation) must also enable the privacy of the data. Currently, there exists no system to enable direct computation in the cloud while assuring data privacy. Fully Homomorphic Encryption (FHE) is an emerging cryptographic technique to permit computation on encrypted data directly in the cloud without the need to bring the data back to the computational node. However, this promising technique comes with significant performance- and storage-related challenges. 
While it will take more years before true FHE is mainstream, we provide a feasibility study for its application to a simple long-term patient ECG-data monitoring system.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132197379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 61
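The idea of computing on encrypted data, as described in the abstract above, can be illustrated with a much simpler scheme than FHE. The sketch below uses textbook Paillier encryption, which is only additively homomorphic; it is not the scheme evaluated in the paper, and the tiny primes, the function names, and the heart-rate readings are all assumptions chosen for illustration.

```python
from math import gcd

# Toy parameters (assumption): real deployments use primes of 1024+ bits.
P, Q = 293, 433
N = P * Q
N2 = N * N
G = N + 1                                      # standard simple generator choice
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)   # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)                           # modular inverse (Python 3.8+)


def encrypt(m, r):
    """Encrypt 0 <= m < N using randomness r that is coprime to N."""
    assert 0 <= m < N and gcd(r, N) == 1
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Recover the plaintext from a Paillier ciphertext."""
    x = pow(c, LAM, N2)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2


if __name__ == "__main__":
    # Hypothetical heart-rate samples, encrypted by the patient device.
    readings = [(72, 17), (75, 23), (78, 29)]
    total = encrypt(0, 5)
    for value, randomness in readings:
        total = add_encrypted(total, encrypt(value, randomness))
    # Only the key holder can decrypt the sum computed "in the cloud".
    print(decrypt(total) / len(readings))
```

The server in this sketch never sees a plaintext reading, yet the key holder can still obtain the average. Supporting both addition and multiplication on ciphertexts is what distinguishes FHE and is the source of the performance and storage costs the abstract mentions.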
Bayesian theory oriented Optimal Data-Provider Selection for CMP
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657050
Guohong Li, Zhenyu Liu, Sanchuan Guo, Chongmin Li, Dongsheng Wang
{"title":"Bayesian theory oriented Optimal Data-Provider Selection for CMP","authors":"Guohong Li, Zhenyu Liu, Sanchuan Guo, Chongmin Li, Dongsheng Wang","doi":"10.1109/ICCD.2013.6657050","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657050","url":null,"abstract":"With the number of cores and working sets of parallel workloads soaring, shared L2 caches exhibit fewer misses than private L2 caches by making better use of all the available cache capacity. However, shared L2 caches induce higher overall L1 miss latencies because of longer average distance between requestor and home node, and potential congestion at some nodes. We observe that there is a high probability that the requested data of an L1 miss resides in a neighbor node's L1 cache. In such cases, these long-distance accesses to the home nodes can be potentially avoided. In order to successfully leverage the aforementioned property, we propose Bayesian theory oriented Optimal Data-Provider Selection (ODPS). ODPS partitions the multi-core into clusters of 2×2 nodes, and introduces the Proximity Data Prober (PDP) to detect whether an L1 miss can be served by one L1 cache within the same cluster. 
Furthermore, we devise the Bayesian Decision Classifier (BDC) to intelligently and adaptively select a remote L2 cache or a neighboring L1 node as the data provider according to the minimal miss cost based on the Bayesian decision theory.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115977382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
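The minimum-cost selection that the abstract above attributes to Bayesian decision theory can be sketched generically as follows. This is not the paper's Bayesian Decision Classifier: the function names, the "signal" used to update the hit probability, and the latency figures are hypothetical and serve only to show the expected-cost comparison.

```python
def posterior_hit(prior_hit, p_signal_given_hit, p_signal_given_miss):
    """Bayes' rule: P(neighbor L1 holds the line | observed signal)."""
    numerator = p_signal_given_hit * prior_hit
    denominator = numerator + p_signal_given_miss * (1.0 - prior_hit)
    return numerator / denominator if denominator > 0 else 0.0


def choose_provider(p_hit, neighbor_latency, probe_penalty, home_latency):
    """Pick the data provider with the lower expected miss cost.

    If the neighbor probe fails, the request pays the wasted probe
    and then still has to travel to the home L2 node.
    """
    cost_neighbor = (p_hit * neighbor_latency
                     + (1.0 - p_hit) * (probe_penalty + home_latency))
    cost_home = home_latency
    return "neighbor_l1" if cost_neighbor < cost_home else "home_l2"


if __name__ == "__main__":
    # Hypothetical latencies in cycles.
    p = posterior_hit(prior_hit=0.5, p_signal_given_hit=0.9,
                      p_signal_given_miss=0.1)
    print(p, choose_provider(p, neighbor_latency=10,
                             probe_penalty=10, home_latency=40))
```

Under these assumed numbers, probing a neighbor only pays off when the estimated hit probability is high enough to offset the penalty of a failed probe.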
Accelerator-rich CMPs: From concept to real hardware
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657039
Yu-Ting Chen, J. Cong, M. Ghodrat, Muhuan Huang, Chunyue Liu, Bingjun Xiao, Yi Zou
{"title":"Accelerator-rich CMPs: From concept to real hardware","authors":"Yu-Ting Chen, J. Cong, M. Ghodrat, Muhuan Huang, Chunyue Liu, Bingjun Xiao, Yi Zou","doi":"10.1109/ICCD.2013.6657039","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657039","url":null,"abstract":"Application-specific accelerators provide 10-100× improvement in power efficiency over general-purpose processors. The accelerator-rich architectures are especially promising. This work discusses a prototype of accelerator-rich CMPs (PARC). During our development of PARC in real hardware, we encountered a set of technical challenges and proposed corresponding solutions. First, we provided system IPs that serve a sea of accelerators to transfer data between userspace and accelerator memories without cache overhead. Second, we designed a dedicated interconnect between accelerators and memories to enable memory sharing. Third, we implemented an accelerator manager to virtualize accelerator resources for users. Finally, we developed an automated flow with a number of IP templates and customizable interfaces to a C-based synthesis flow to enable rapid design and update of PARC. We implemented PARC in a Virtex-6 FPGA chip with integration of platform-specific peripherals and booting of unmodified Linux. 
Experimental results show that PARC can fully exploit the energy benefits of accelerators at little system overhead.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114433022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 34
Performance-controllable shared cache architecture for multi-core soft real-time systems
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657097
Myoungjun Lee, Soontae Kim
{"title":"Performance-controllable shared cache architecture for multi-core soft real-time systems","authors":"Myoungjun Lee, Soontae Kim","doi":"10.1109/ICCD.2013.6657097","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657097","url":null,"abstract":"Multi-core processors with shared L2 caches can improve performance and integrate several functions of real-time systems on a single chip. However, tasks running on different cores increase interferences in the shared L2 cache, resulting in more deadline misses and, consequently, worse quality of real-time tasks. This is mainly because of the blind sharing of the L2 cache by multiple tasks running on different cores. We propose a novel performance-controllable shared L2 cache architecture that can alleviate these problems. First, our proposed L2 cache architecture is made to be aware of instructions/data belonging to real-time tasks by adding a real-time indication bit to each L2 cache block. Second, it can control the performance of real-time tasks and non-real-time tasks. Our experimental results show that our proposed L2 cache architecture reduces more deadline misses of real-time tasks than the conventional L2 cache architecture and partitioning schemes.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128318348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Selecting critical implications with set-covering formulation for SAT-based Bounded Model Checking
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657070
Mahmoud Elbayoumi, M. Hsiao, Mustafa ElNainay
{"title":"Selecting critical implications with set-covering formulation for SAT-based Bounded Model Checking","authors":"Mahmoud Elbayoumi, M. Hsiao, Mustafa ElNainay","doi":"10.1109/ICCD.2013.6657070","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657070","url":null,"abstract":"The effectiveness of SAT-based Bounded Model Checking (BMC) critically relies on the deductive power of the BMC instance. Although implication relationships have been used to help the SAT solver make more deductions, frequently an excessive number of implications has been used. Too many such implications can result in a large number of clauses that could potentially degrade the underlying SAT solver performance. In this paper, we first propose a framework for a parallel deduction engine to reduce implication learning time. Secondly, we propose a novel set-cover technique for optimal selection of constraint clauses. This technique depends on maximizing the number of literals that can be deduced by the SAT solver during the BCP (Boolean Constraint Propagation) operation. Our parallel deduction engine can achieve a 5.7× speedup on a 36-core machine. In addition, by selecting only those critical implications, our strategy improves BMC by another 1.74× against the case where all extended implications were added to the BMC instance. 
Compared with the original BMC without any implication clauses, up to 55.32× speedup can be achieved.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121628197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
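The set-covering view in the abstract above (pick a small set of implication clauses that together let the solver deduce as many literals as possible) can be sketched with the classic greedy set-cover heuristic. Greedy selection is an approximation and is named here as a stand-in; the abstract does not specify the paper's actual formulation, and the clause names and literal sets below are hypothetical.

```python
def greedy_cover(universe, candidates):
    """Greedy set cover.

    universe:   iterable of literals we would like to be deducible.
    candidates: dict mapping a clause name to the set of literals it
                makes deducible (hypothetical input format).
    Returns the chosen clause names in selection order.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the clause that covers the most still-uncovered literals.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # remaining literals cannot be covered by any clause
        chosen.append(best)
        uncovered -= gain
    return chosen


if __name__ == "__main__":
    implications = {
        "a": {1, 2, 3},
        "b": {2, 4},
        "c": {4, 5},
        "d": {3},
    }
    print(greedy_cover({1, 2, 3, 4, 5}, implications))
```

In this toy instance two of the four candidate clauses already cover every literal, which mirrors the abstract's point that adding all implications is unnecessary.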
Power capping of CPU-GPU heterogeneous systems through coordinating DVFS and task mapping
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657064
T. Komoda, S. Hayashi, Takashi Nakada, Shinobu Miwa, Hiroshi Nakamura
{"title":"Power capping of CPU-GPU heterogeneous systems through coordinating DVFS and task mapping","authors":"T. Komoda, S. Hayashi, Takashi Nakada, Shinobu Miwa, Hiroshi Nakamura","doi":"10.1109/ICCD.2013.6657064","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657064","url":null,"abstract":"Future computer systems are built under a much more stringent power budget due to the limitation of power delivery and cooling systems. To this end, sophisticated power management techniques are required. Power capping is a technique to limit the power consumption of a system to the predetermined level, and has been extensively studied in homogeneous systems. However, few studies about the power capping of CPU-GPU heterogeneous systems have been done yet. In this paper, we propose an efficient power capping technique through coordinating DVFS and task mapping in a single computing node equipped with GPUs. In CPU-GPU heterogeneous systems, settings of the device frequencies have to be considered with task mapping between the CPUs and the GPUs because the frequency scaling can incur load imbalance between them. To guide the settings of DVFS and task mapping for avoiding power violation and the load imbalance, we develop new empirical models of the performance and the maximum power consumption of a CPU-GPU heterogeneous system. The models enable us to set near-optimal settings of the device frequencies and the task mapping in advance of the application execution. We evaluate the proposed technique with five data-parallel applications on a machine equipped with a single CPU and a single GPU. 
The experimental result shows that the performance achieved by the proposed power capping technique is comparable to the ideal one.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126337678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 65
A global router on GPU architecture
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657028
Yiding Han, Koushik Chakraborty, Sanghamitra Roy
{"title":"A global router on GPU architecture","authors":"Yiding Han, Koushik Chakraborty, Sanghamitra Roy","doi":"10.1109/ICCD.2013.6657028","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657028","url":null,"abstract":"In the modern VLSI design flow, a global router is often utilized to provide fast and accurate congestion analysis for upstream processes to improve the design routability. Global routing parallelization is a good candidate to speed up its runtime performance while delivering very competitive solution quality. In this paper, we first study the cause of insufficient exploitable concurrency of the existing net level concurrency model, which has become a major bottleneck for parallelizing the emerging design problems. Then, we mitigate this limitation with a novel fine grain parallel model, with which a GPU based multi-agent global router is designed. Our experimental results indicate that the parallel model can effectively support the GPU based global router, and deliver stable solutions. The runtime comparison with NCTUgr2 has shown that up to 3.9× speedup is achieved by the GPU based router.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132229646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Noise-based algorithms for functional equivalence and tautology checking
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657048
P. Lin, S. Khatri
{"title":"Noise-based algorithms for functional equivalence and tautology checking","authors":"P. Lin, S. Khatri","doi":"10.1109/ICCD.2013.6657048","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657048","url":null,"abstract":"In this paper, we present noise-based algorithms for functional equivalence and tautology checking using noise-based logic (NBL). A key property of NBL is that literals are represented by independent noise sources, from which we can construct noise-based cubes, and superpositions of such noise-based cubes, to create a noise-based Boolean function on a single wire. In our algorithms, the Boolean sum-of-products (SOP) formula is expressed in NBL as a superposition of its minterms. This noise-based representation of the SOP can then be compared with that of another SOP formula for equivalence checking (or with the noise-based formula representing tautology, for tautology checking) using a single operation. We validate our approach using software simulation.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133593852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Quipu: High-performance simulation of quantum circuits using stabilizer frames
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657072
Héctor J. García, I. Markov
{"title":"Quipu: High-performance simulation of quantum circuits using stabilizer frames","authors":"Héctor J. García, I. Markov","doi":"10.1109/ICCD.2013.6657072","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657072","url":null,"abstract":"As quantum information processing gains traction, its simulation becomes increasingly significant for engineering purposes - evaluation, testing and optimization - as well as for theoretical research. Generic quantum-circuit simulation appears intractable for conventional computers. However, Gottesman and Knill identified an important subclass, called stabilizer circuits, which can be simulated efficiently using group-theory techniques. Practical circuits enriched with quantum error-correcting codes and fault-tolerant procedures are dominated by stabilizer subcircuits and contain a relatively small number of non-stabilizer components. Therefore, we develop new group-theory data structures and algorithms to simulate such circuits. Stabilizer frames offer more compact storage than previous approaches but require more sophisticated bookkeeping. Our implementation, called Quipu, simulates certain quantum arithmetic circuits (e.g., ripple-carry adders) in polynomial time and space for equal superpositions of n-qubits. On such instances, known linear-algebraic simulation techniques, such as the (state-of-the-art) BDD-based simulator QuIDDPro, take exponential time. 
We simulate various quantum Fourier transform and quantum fault-tolerant circuits with Quipu, and the results demonstrate that our stabilizer-based technique outperforms QuIDDPro in all cases.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115801177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Increasing GPU throughput using kernel interleaved thread block scheduling
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657093
Mihir Awatramani, Joseph Zambreno, D. Rover
{"title":"Increasing GPU throughput using kernel interleaved thread block scheduling","authors":"Mihir Awatramani, Joseph Zambreno, D. Rover","doi":"10.1109/ICCD.2013.6657093","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657093","url":null,"abstract":"The number of active threads required to achieve peak application throughput on graphics processing units (GPUs) depends largely on the ratio of time spent on computation to the time spent accessing data from memory. While compute-intensive applications can achieve peak throughput with a low number of threads, memory-intensive applications might not achieve good throughput even at the maximum supported thread count. In this paper, we study the effects of scheduling work from multiple applications on the same GPU core. We claim that interleaving workload from different applications on a GPU core can improve the utilization of computational units and reduce the load on the memory subsystem. Experiments on 17 application pairs from the Rodinia benchmark suite show that overall throughput increases by 7% on average.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114664338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18