2014 23rd International Conference on Parallel Architecture and Compilation (PACT): Latest Publications

Keynote: Internet of mobile things: Challenges and opportunities
K. Nahrstedt
{"title":"Keynote: Internet of mobile things: Challenges and opportunities","authors":"K. Nahrstedt","doi":"10.1145/2628071.2635931","DOIUrl":"https://doi.org/10.1145/2628071.2635931","url":null,"abstract":"The Internet of Things (IoT) concept has been around for some time and applications such as transportation, health-care, education, travel, smart grid, retail, are and will be major benefactors of this concept. However, only recently, due to technological advances in sensor devices and rich wireless connectivity, Internet of Things at scale is becoming reality. For example, Cisco's Internet of Things Group predicts over 50 billion connected sensory devices by 2020. In this talk, we will discuss the Internet of Mobile Things (IoMT) since several game-changing technological advances happened on ‘mobile things’ such as mobile phones, trains, and cars, where rich sets of sensors, connected via diverse sets of wireless Internet technologies, are changing and influencing how people communicate, move, and download and distribute information. In this space, challenges come from the needs to determine (1) contextual information such as location, duration of contact, density of devices, utilizing networked sensory information; (2) higher level knowledge such as users' activity detection, mood detection, applications usage pattern detection and user interactions on ‘mobile things’, utilizing contextual information; and (3) adaptive and real-time parallel and distributed architectures that integrate context, activity, mood, usage patterns into mobile application services on mobile ‘things’. Solving these challenges will provide enormous opportunities to improve the utility of mobile ‘things’, optimizing scarce resources on mobile ‘things’ such as energy, memory, and bandwidth.","PeriodicalId":263670,"journal":{"name":"2014 23rd International Conference on Parallel Architecture and Compilation (PACT)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115389208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
ADHA: Automatic data layout framework for heterogeneous architectures
Deepak Majeti, Kuldeep S. Meel, R. Barik, Vivek Sarkar
{"title":"ADHA: Automatic data layout framework for heterogeneous architectures","authors":"Deepak Majeti, Kuldeep S. Meel, R. Barik, Vivek Sarkar","doi":"10.1145/2628071.2628122","DOIUrl":"https://doi.org/10.1145/2628071.2628122","url":null,"abstract":"Data layouts play a crucial role in determining the performance of a given application running on a given architecture. Existing parallel programming frameworks for both multicore and heterogeneous systems leave the onus of selecting a data layout to the programmer. Therefore, shifting the burden of data layout selection to optimizing compilers can greatly enhance programmer productivity and application performance. In this work, we introduce ADHA: a two-level hierarchal formulation of the data layout problem for modern heterogeneous architectures. We have created a reference implementation of ADHA in the Heterogeneous Habanero-C (H2C) parallel programming system. ADHA shows significant performance benefits of up to 6.92× compared to manually specified layouts for two benchmark programs running on a CPU+GPU heterogeneous platform.","PeriodicalId":263670,"journal":{"name":"2014 23rd International Conference on Parallel Architecture and Compilation (PACT)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116460378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
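To make the layout problem concrete, here is a minimal C++ sketch (an illustration only, not code from the paper and not H2C code): the same data kept as an array of structures (AoS) versus a structure of arrays (SoA). Which version is faster depends on the access pattern and the target device, and that is precisely the choice ADHA shifts from the programmer to the compiler.

    // Hypothetical illustration of the AoS vs. SoA layout choice that a
    // framework like ADHA automates; not code from the paper.
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    // Array of Structures: all fields of one element sit together. Favors
    // computations that touch every field of each element.
    struct ParticleAoS { float x, y, z, mass; };

    // Structure of Arrays: each field is contiguous across elements. Favors
    // vectorized CPU loops and coalesced GPU accesses to a single field.
    struct ParticlesSoA {
        std::vector<float> x, y, z, mass;
        explicit ParticlesSoA(std::size_t n) : x(n), y(n), z(n), mass(n) {}
    };

    // Summing one field strides over 16-byte records in the AoS layout ...
    float total_mass(const std::vector<ParticleAoS>& p) {
        float sum = 0.0f;
        for (const ParticleAoS& e : p) sum += e.mass;
        return sum;
    }

    // ... but is a unit-stride scan over a single array in the SoA layout.
    float total_mass(const ParticlesSoA& p) {
        float sum = 0.0f;
        for (float m : p.mass) sum += m;
        return sum;
    }

    int main() {
        const std::size_t n = 1 << 20;
        std::vector<ParticleAoS> aos(n, ParticleAoS{0.0f, 0.0f, 0.0f, 1.0f});
        ParticlesSoA soa(n);
        for (std::size_t i = 0; i < n; ++i) soa.mass[i] = 1.0f;
        std::printf("AoS: %.0f  SoA: %.0f\n", total_mass(aos), total_mass(soa));
        return 0;
    }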
Preemptive thread block scheduling with online structural runtime prediction for concurrent GPGPU kernels
Sreepathi Pai, R. Govindarajan, M. J. Thazhuthaveetil
{"title":"Preemptive thread block scheduling with online structural runtime prediction for concurrent GPGPU kernels","authors":"Sreepathi Pai, R. Govindarajan, M. J. Thazhuthaveetil","doi":"10.1145/2628071.2628117","DOIUrl":"https://doi.org/10.1145/2628071.2628117","url":null,"abstract":"Recent NVIDIA Graphics Processing Units (GPUs) can execute multiple kernels concurrently. On these GPUs, the thread block scheduler (TBS) currently uses the FIFO policy to schedule thread blocks of concurrent kernels. We show that the FIFO policy leaves performance to chance, resulting in significant loss of performance and fairness. To improve performance and fairness, we propose use of the preemptive Shortest Remaining Time First (SRTF) policy instead. Although SRTF requires an estimate of runtime of GPU kernels, we show that such an estimate of the runtime can be easily obtained using online profiling and exploiting a simple observation on GPU kernels' grid structure. Specifically, we propose a novel Structural Runtime Predictor. Using a simple Staircase model of GPU kernel execution, we show that the runtime of a kernel can be predicted by profiling only the first few thread blocks. We evaluate an online predictor based on this model on benchmarks from ERCBench, and find that it can estimate the actual runtime reasonably well after the execution of only a single thread block. Next, we design a thread block scheduler that is both concurrent kernel-aware and uses this predictor. We implement the Shortest Remaining Time First (SRTF) policy and evaluate it on two-program workloads from ER-CBench. SRTF improves STP by 1.18× and ANTT by 2.25× over FIFO. When compared to MPMax, a state-of-the-art resource allocation policy for concurrent kernels, SRTF improves STP by 1.16× and ANTT by 1.3×. To improve fairness, we also propose SRTF/Adaptive which controls resource usage of concurrently executing kernels to maximize fairness. SRTF/Adaptive improves STP by 1.12×, ANTT by 2.23× and Fairness by 2.95× compared to FIFO. Overall, our implementation of SRTF achieves system throughput to within 12.64% of Shortest Job First (SJF, an oracle optimal scheduling policy), bridging 49% of the gap between FIFO and SJF.","PeriodicalId":263670,"journal":{"name":"2014 23rd International Conference on Parallel Architecture and Compilation (PACT)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133446230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
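The Staircase model behind the Structural Runtime Predictor can be sketched as follows (a simplified C++ illustration based only on the abstract, with hypothetical parameter names, not the authors' implementation): time one of the first thread blocks online, assume the grid drains in waves of roughly equal duration, and extrapolate the kernel's total runtime from the grid size and the number of blocks resident concurrently.

    // Simplified staircase-style runtime estimate for a GPU kernel, as the
    // abstract describes; names and structure are hypothetical illustrations.
    #include <cstdio>

    // One profiled thread block's time stands in for the duration of a "wave"
    // of concurrently resident blocks; the grid then drains wave by wave.
    double estimate_kernel_runtime_ms(double single_block_time_ms, // measured online
                                      int total_blocks,            // blocks in the kernel's grid
                                      int concurrent_blocks)       // blocks resident at once
    {
        int waves = (total_blocks + concurrent_blocks - 1) / concurrent_blocks;
        return waves * single_block_time_ms;
    }

    int main() {
        // Example: the first block took 0.8 ms, the grid has 512 blocks, and
        // 64 blocks fit on the GPU at a time -> 8 waves, roughly 6.4 ms.
        std::printf("estimated runtime: %.2f ms\n",
                    estimate_kernel_runtime_ms(0.8, 512, 64));
        return 0;
    }

An SRTF-style scheduler would refresh such an estimate as more thread blocks complete and preferentially dispatch blocks from the kernel with the smallest estimated remaining time, which is the policy the paper evaluates.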