2014 43rd International Conference on Parallel Processing: Latest Papers, Page 6

CASTA: CUDA-Accelerated Static Timing Analysis for VLSI Designs

2014 43rd International Conference on Parallel Processing Pub Date : 2014-09-01 DOI: 10.1109/ICPP.2014.28

Hunta H.-W. Wang, Louis Y.-Z. Lin, Ryan H.-M. Huang, Charles H.-P. Wen

Abstract: General-purpose computing on graphics processing units (GPGPU) makes parallel computing possible for Static Timing Analysis (STA) of VLSI designs. However, memory access and synchronization across a massive number of cores make parallelizing STA challenging. In this work, we developed a fast CUDA-accelerated STA engine, named CASTA, that incorporates four novel techniques to enable high parallelism: Table-Index Remapping (TIR), Texture-Accelerated Rendering (TAR), Cell Levelization & Type Sorting (CLTS), and Timing-Table Restructuring (TTR). CLTS levelizes cells and sorts them by type so that the same timing library can be accessed efficiently. TTR modifies the data structure for the timing signals of cells to increase memory throughput. TIR remaps the axes of timing tables to retrieve data more efficiently, while TAR expands look-up tables (LUTs) to avoid extrapolation and stores the LUTs in texture memory for speed. Experimental results indicate that CASTA successfully enables high parallelism and outperforms a commercial tool with a three-order speedup on average over several benchmark circuits.

Citations: 4
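
The Cell Levelization & Type Sorting (CLTS) step named in the CASTA abstract can be illustrated with a small CPU-side sketch. This is not the authors' implementation: the function name `levelize_and_sort`, the dictionary-based netlist representation, and the cell names are all hypothetical, chosen only to show the general idea of grouping cells by topological level and then ordering each level by cell type so that cells sharing a timing table sit next to each other.

```python
from collections import defaultdict, deque


def levelize_and_sort(cells, fanin):
    """Group cells by topological level, then sort each level by cell type.

    cells: dict mapping cell name -> cell type (hypothetical netlist format).
    fanin: dict mapping cell name -> list of driver names. Drivers that are
           not in `cells` (e.g. primary inputs) are ignored.
    Returns a list of levels; each level is a list of cell names.
    """
    fanout = defaultdict(list)
    pending = {name: 0 for name in cells}  # unresolved drivers per cell
    for name, drivers in fanin.items():
        for driver in drivers:
            if driver in cells:
                fanout[driver].append(name)
                pending[name] += 1

    # Cells with no internal drivers start at level 0.
    level = {name: 0 for name in cells if pending[name] == 0}
    queue = deque(level)
    processed = 0
    while queue:
        name = queue.popleft()
        processed += 1
        for sink in fanout[name]:
            # A cell's level is one more than its deepest driver.
            level[sink] = max(level.get(sink, 0), level[name] + 1)
            pending[sink] -= 1
            if pending[sink] == 0:
                queue.append(sink)

    if processed != len(cells):
        raise ValueError("netlist contains a combinational loop")

    depth = max(level.values(), default=-1) + 1
    levels = [[] for _ in range(depth)]
    for name, lvl in level.items():
        levels[lvl].append(name)
    for group in levels:
        # Sort by type first so same-type cells are contiguous in a level.
        group.sort(key=lambda n: (cells[n], n))
    return levels


cells = {"u1": "NAND2", "u2": "INV", "u3": "NAND2", "u4": "INV"}
fanin = {"u3": ["u1", "u2"], "u4": ["u3"]}
print(levelize_and_sort(cells, fanin))
```

In a GPU setting, each level could be processed as one parallel batch, since all cells in a level depend only on earlier levels. How CASTA maps levels and types onto threads is not stated in the abstract.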

HERCULES: Strong Patterns towards More Intelligent Predictive Modeling

2014 43rd International Conference on Parallel Processing Pub Date : 2014-09-01 DOI: 10.1109/ICPP.2014.26

Eunjung Park, Christos Kartsaklis, John Cavazos

Abstract: Recent work has shown that program analysis techniques for selecting meaningful code features are important when deciding the best compiler optimizations. Although there are many successful state-of-the-art program analysis techniques, they often do not provide a simple method to extract the most expressive information about loops, especially when a target program is computationally intensive with complex loops and data dependencies. In this paper, we introduce a static technique to characterize a program using a pattern-driven system named HERCULES. This characterization technique not only helps a user understand programs by searching for patterns of interest, but can also be used to build a predictive model that effectively selects the proper compiler optimizations. We formulated 35 loop patterns, then evaluated our characterization technique by comparing the predictive models constructed using HERCULES to three other state-of-the-art characterization methods. We show that our models outperform these three techniques on two multicore systems in selecting the best optimization combination from a given loop transformation space. We achieved up to 67% of the best possible speedup achievable within the optimization search space we evaluated.

Citations: 16

Output-Sensitive Parallel Algorithm for Polygon Clipping

2014 43rd International Conference on Parallel Processing Pub Date : 2014-09-01 DOI: 10.1109/ICPP.2014.33

S. Puri, S. Prasad

Citations: 15

Towards Perpetual Sensor Networks via Deploying Multiple Mobile Wireless Chargers

2014 43rd International Conference on Parallel Processing Pub Date : 2014-09-01 DOI: 10.1109/ICPP.2014.17

Wenzheng Xu, W. Liang, X. Lin, Guoqiang Mao, Xiaojiang Ren

Abstract: In this paper, we study the use of multiple mobile charging vehicles to charge sensors in a large-scale wireless sensor network for a given monitoring period, where sensors can be charged by the vehicles through wireless power transfer. Since each sensor may need multiple charges to avoid running out of energy during the period, we first consider the problem of scheduling the mobile vehicles to collaboratively charge sensors so that no sensor runs out of energy and the sum of the vehicles' traveling distances (referred to as the service cost) is minimized. Because the problem is NP-hard, we then propose a novel approximation algorithm for it, assuming that sensor energy consumption rates do not change over time. For the case where consumption rates do change, we devise a heuristic algorithm through minor modifications to the approximation algorithm. We finally evaluate the performance of the proposed algorithms via simulations. Experimental results show that the proposed algorithms are very promising, reducing the service cost by up to 45% compared with a greedy algorithm.

Citations: 42