Hunta H.-W. Wang, Louis Y.-Z. Lin, Ryan H.-M. Huang, Charles H.-P. Wen
{"title":"CASTA: CUDA-Accelerated Static Timing Analysis for VLSI Designs","authors":"Hunta H.-W. Wang, Louis Y.-Z. Lin, Ryan H.-M. Huang, Charles H.-P. Wen","doi":"10.1109/ICPP.2014.28","DOIUrl":"https://doi.org/10.1109/ICPP.2014.28","url":null,"abstract":"General-purpose computing on graphics processing unit (GPGPU) enables the possibility of parallel computing for Static Timing Analysis (STA) of VLSI designs. However, memory access and synchronization between massively many cores become challenges to parallelizing STA. In this work, we developed a fast CUDA-Accelerated STA engine (named CASTA) that incorporates four novel techniques including Table-Index Remapping (TIR), Texture-Accelerated Rendering (TAR), Cell Levelization & Type Sorting (CLTS) and Timing-Table Restructuring(TTR) to enable high parallelism. Cell Levelization & Type Sorting (CLTS) levelizes cells and sort their types in order to efficiently access the same timing library. Timing-Table Restructuring (TTR) modifies the data structure for timing signals of cells to increase memory throughput. Table-Index Remapping (TIR) re-maps the axes of timing tables to retrieve data more efficiently while Texture-Accelerated Rendering (TAR) expands look-up tables (LUTs) to avoid extrapolation and stores LUTs in the texture for speed. 
As a result, our experimental result indicates that CASTA successfully enables high parallelism and outperforms a commercial tool by a three-order speedup on average over several benchmark circuits.","PeriodicalId":441115,"journal":{"name":"2014 43rd International Conference on Parallel Processing","volume":"263 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123364391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
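The CLTS step named in the CASTA abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hypothetical netlist given as two plain dictionaries (`cells` mapping a cell name to its library type, `fanin` mapping a cell name to the names of its driving cells), levelizes it with a standard topological pass, and then sorts each level by cell type so that cells sharing a timing table sit next to each other. The function name `levelize_and_sort` and the input layout are assumptions made for illustration only.

```python
from collections import defaultdict, deque


def levelize_and_sort(cells, fanin):
    """Group cells by topological level, then sort each level by cell type.

    cells: dict mapping cell name -> library cell type (hypothetical layout)
    fanin: dict mapping cell name -> list of driver cell names
    Returns a list of levels; each level is a list of cell names.
    """
    fanout = defaultdict(list)
    pending = {}  # number of drivers not yet processed, per cell
    for name in cells:
        drivers = [d for d in fanin.get(name, []) if d in cells]
        pending[name] = len(drivers)
        for d in drivers:
            fanout[d].append(name)

    # Cells without drivers inside the netlist start at level 0.
    level = {name: 0 for name, n in pending.items() if n == 0}
    queue = deque(sorted(level))
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1
        for v in fanout[u]:
            # A cell sits one level after its deepest driver.
            level[v] = max(level.get(v, 0), level[u] + 1)
            pending[v] -= 1
            if pending[v] == 0:
                queue.append(v)

    if processed != len(cells):
        raise ValueError("netlist contains a combinational loop")

    by_level = defaultdict(list)
    for name, lv in level.items():
        by_level[lv].append(name)

    # Within a level, order by (type, name) so same-type cells are contiguous.
    return [
        sorted(by_level[lv], key=lambda n: (cells[n], n))
        for lv in sorted(by_level)
    ]


if __name__ == "__main__":
    cells = {"u1": "INV", "u2": "NAND2", "u3": "INV", "u4": "NAND2", "u5": "INV"}
    fanin = {"u4": ["u1", "u2"], "u5": ["u4", "u3"]}
    print(levelize_and_sort(cells, fanin))
    # prints [['u1', 'u3', 'u2'], ['u4'], ['u5']]
```

Cells in one level have no dependencies on each other, so a GPU engine could in principle evaluate a whole level in parallel; grouping by type within the level is one plausible way to keep neighbouring threads reading the same timing table.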
{"title":"HERCULES: Strong Patterns towards More Intelligent Predictive Modeling","authors":"Eunjung Park, Christos Kartsaklis, John Cavazos","doi":"10.1109/ICPP.2014.26","DOIUrl":"https://doi.org/10.1109/ICPP.2014.26","url":null,"abstract":"Recent work has shown that program analysis techniques to select meaningful code features of programs are important in the task of deciding the best compiler optimizations. Although, there are many successful state-of-the-art program analysis techniques, they often do not provide a simple method to extract the most expressive information about loops, especially when a target program is computationally intensive with complex loops and data dependencies. In this paper, we introduce a static technique to characterize a program using a pattern-driven system named HERCULES. This characterization technique not only helps a user to understand programs by searching pattern-of-interests, but also can be used for a predictive model that effectively selects the proper compiler optimizations. We formulated 35 loop patterns, then evaluated our characterization technique by comparing the predictive models constructed using HERCULES to three other state-of-the-art characterization methods. We show that our models outperform three state-of-the-art program characterization techniques on two multicore systems in selecting the best optimization combination from a given loop transformation space. 
We achieved up to 67% of the best possible speedup achievable with the optimization search space we evaluated.","PeriodicalId":441115,"journal":{"name":"2014 43rd International Conference on Parallel Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133820530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Output-Sensitive Parallel Algorithm for Polygon Clipping","authors":"S. Puri, S. Prasad","doi":"10.1109/ICPP.2014.33","DOIUrl":"https://doi.org/10.1109/ICPP.2014.33","url":null,"abstract":"Polygon clipping is one of the complex operations in computational geometry. It is a primitive operation in many fields such as Geographic Information Systems (GIS), Computer Graphics and VLSI CAD. Sequential algorithms for this problem are in abundance in literature but there are very few parallel algorithms solving it in its most general form. We present the first output-sensitive CREW PRAM algorithm, which can perform polygon clipping in O(logn) time using (n + k + k') processors, where n is the number of vertices, k is the number of edge intersections and k' is the additional temporary vertices introduced due to the partitioning of polygons. The current best algorithm by Karinthi, Srinivas, and Almasi [1] does not handle self-intersecting polygons, is not output-sensitive and must employ ⊝(n2) processors to achieve O(logn) time. Our algorithm is developed from the first principles and it is superior to [1] in cost. It yields a practical implementation on multicores and demonstrates 30x speedup for real-world dataset. Our algorithm can perform the typical clipping operations including intersection, union, and difference.","PeriodicalId":441115,"journal":{"name":"2014 43rd International Conference on Parallel Processing","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129268391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wenzheng Xu, W. Liang, X. Lin, Guoqiang Mao, Xiaojiang Ren
{"title":"Towards Perpetual Sensor Networks via Deploying Multiple Mobile Wireless Chargers","authors":"Wenzheng Xu, W. Liang, X. Lin, Guoqiang Mao, Xiaojiang Ren","doi":"10.1109/ICPP.2014.17","DOIUrl":"https://doi.org/10.1109/ICPP.2014.17","url":null,"abstract":"In this paper, we study the use of multiple mobile charging vehicles to charge sensors in a large-scale wireless sensor network for a given monitoring period, where sensors can be charged by the vehicles with wireless power transfer. Since each sensor may experience multiple charges to avoid its energy expiration for the period, we first consider a charging problem of scheduling the multiple mobile vehicles to collaboratively charge sensors so that none of the sensors will run out of its energy and the sum of traveling distance (referred to as the service cost) of these vehicles can be minimized. Due to NP-hardness of the problem, we then propose a novel approximation algorithm for it, assuming that sensor energy consumption rates do not change over time. Otherwise, we devise a heuristic algorithm through minor modifications to the approximation algorithm. We finally evaluate the performance of the proposed algorithms via simulations. 
Experimental results show that the proposed algorithms are very promising, which can reduce upto 45% of the service cost in comparison with the service cost delivered by a greedy algorithm.","PeriodicalId":441115,"journal":{"name":"2014 43rd International Conference on Parallel Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134087142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}