{"title":"Browser workload characterization for an Ajax-based commercial online service","authors":"Shu Xu, Bo Huang, Junyong Ding, J. Dai","doi":"10.1109/IISWC.2009.5306780","DOIUrl":"https://doi.org/10.1109/IISWC.2009.5306780","url":null,"abstract":"The transition to cloud computing and SaaS is a disruptive trend where users can conveniently access the services through browsers from any client. In addition, with the prevalence of Web 2.0 and AJAX techniques, a browser-based client can have complex application logic and fancy user interface that are comparable to traditional desktop applications. This paper reports the study of workload construction and characterization for browser-based clients, using the Ajax-based web client of Zimbra (a commercial online messaging and collaboration suite). By comparing the various workload behaviors across different Zimbra server datasets, different browsers and different client platforms, it presents the characteristics of a real-life web application, which has significant differences from existing browser benchmarks in the literature. In addition, the platform-independent and browser-independent design of our workload makes it portable across various clients. Finally, this paper also provides valuable insights into the browser internals by analyzing the workload execution, the browser memory footprint and the breakdown of browser sub-modules.","PeriodicalId":387816,"journal":{"name":"2009 IEEE International Symposium on Workload Characterization (IISWC)","volume":"183 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134234794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-speed network modeling for full system simulation","authors":"D. Lugones, Daniel Franco, Dolores Rexachs, J. Moure, E. Luque, Eduardo Argollo, Ayose Falcón, Daniel Ortega, P. Faraboschi","doi":"10.1109/IISWC.2009.5306799","DOIUrl":"https://doi.org/10.1109/IISWC.2009.5306799","url":null,"abstract":"The widespread adoption of cluster computing systems has shifted the modeling focus from synthetic traffic to realistic workloads to better capture the complex interactions between applications and architecture. In this context, a full-system simulation environment also needs to model the networking component, but the simulation duration that is practically affordable is too short to appropriately stress the networking bottlenecks. In this paper, we present a methodology that overcomes this problem and enables the modeling of interconnection networks while ensuring representative results with fast simulation turnaround. We use standard network tools to extract simplified models that are statistically validated and at the same time compatible with a full system simulation environment. We propose three models with different accuracy vs. speed ratios that compute network latency times according to the estimated traffic and measure them on a real-world parallel scientific application.","PeriodicalId":387816,"journal":{"name":"2009 IEEE International Symposium on Workload Characterization (IISWC)","volume":"515 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132755627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the (dis)similarity of transactional memory workloads","authors":"C. Hughes, James Poe, Amer Qouneh, Tao Li","doi":"10.1109/IISWC.2009.5306790","DOIUrl":"https://doi.org/10.1109/IISWC.2009.5306790","url":null,"abstract":"Programming to exploit the resources in a multicore system remains a major obstacle for both computer and software engineers. Transactional memory offers an attractive alternative to traditional concurrent programming but implementations emerged before the programming model, leaving a gap in the design process. In previous research, transactional microbenchmarks have been used to evaluate designs or lock-based multithreaded workloads have been manually converted into their transactional equivalents; others have even created dedicated transactional benchmarks. Yet, throughout all of the investigations, transactional memory researchers have not settled on a way to describe the runtime characteristics that these programs exhibit; nor has there been any attempt to unify the way transactional memory implementations are evaluated. In addition, the similarity (or redundancy) of these workloads is largely unknown. Evaluating transactional memory designs using workloads that exhibit similar characteristics will unnecessarily increase the number of simulations without contributing new insight. On the other hand, arbitrarily choosing a subset of transactional memory workloads for evaluation can miss important features and lead to biased or incorrect conclusions. In this work, we propose a set of architecture-independent transaction-oriented workload characteristics that can accurately capture the behavior of transactional code. We apply principal component analysis and clustering algorithms to analyze the proposed workload characteristics collected from a set of SPLASH-2, STAMP, and PARSEC transactional memory programs. Our results show that using transactional characteristics to cluster the chosen benchmarks can reduce the number of required simulations by almost half. We also show that the methods presented in this paper can be used to identify specific feature subsets. With the increasing number of TM workloads in the future, we believe that the proposed transactional memory workload characterization techniques will help TM architects select a small, diverse set of TM workloads for their design evaluation.","PeriodicalId":387816,"journal":{"name":"2009 IEEE International Symposium on Workload Characterization (IISWC)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129821136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance characterization and optimization of mobile augmented reality on handheld platforms","authors":"S. Srinivasan, Zhen Fang, R. Iyer, Steven Zhang, Michael Espig, D. Newell, Daniel Cermak, Yi Wu, I. Kozintsev, H. Haussecker","doi":"10.1109/IISWC.2009.5306788","DOIUrl":"https://doi.org/10.1109/IISWC.2009.5306788","url":null,"abstract":"The introduction of low power general purpose processors (like the Intel® Atom™ processor) expands the capability of handheld and mobile internet devices (MIDs) to include compelling visual computing applications. One rapidly emerging visual computing usage model is known as mobile augmented reality (MAR). In the MAR usage model, the user is able to point the handheld camera to an object (like a wine bottle) or a set of objects (like an outdoor scene of buildings or monuments) and the device automatically recognizes and displays information regarding the object(s). Achieving this on the handheld requires significant compute processing resulting in a response time in the order of several seconds. In this paper, we analyze a MAR workload and identify the primary hotspot functions that incur a large fraction of the overall response time. We also present a detailed architectural characterization of the hotspot functions in terms of CPI, MPI, etc. We then implement and analyze the benefits of several software optimizations: (a) vectorization, (b) multi-threading, (c) cache conflict avoidance and (d) miscellaneous code optimizations that reduce the number of computations. We show that a 3X performance improvement in execution time can be achieved by implementing these optimizations. Overall, we believe our analysis provides a detailed understanding of the processing for a new domain of visual computing workloads (i.e. MAR) running on low power handheld compute platforms.","PeriodicalId":387816,"journal":{"name":"2009 IEEE International Symposium on Workload Characterization (IISWC)","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117270010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Storage characterization for unstructured data in online services applications","authors":"S. Sankar, Kushagra Vaid","doi":"10.1109/IISWC.2009.5306786","DOIUrl":"https://doi.org/10.1109/IISWC.2009.5306786","url":null,"abstract":"Mega datacenters hosting large scale web services have unique workload attributes that need to be taken into account for optimal service scalability. Provisioning compute and storage resources to provide a seamless user experience is challenging since customer traffic loads vary widely across time and geographies, and the servers hosting these applications have to be rightsized to provide both performance within a single server and across a scale-out cluster. Typical user-facing web services have a three-tiered hierarchy: front-end web servers, middle-tier application logic, and back-end data storage and processing layer. In this paper, we address the challenge of disk subsystem design for back-end servers hosting large amounts of unstructured (also called blob) data. Examples of typical content hosted on such servers include user generated content such as photos, email messages, videos, and social networking updates. Specific server applications analyzed in this paper correspond to the message store of a large scale email application, image tile storage for a large scale geo-mapping application, and user content storage for Web 2.0 type applications. We analyze the storage subsystems for these web services in a live production environment and provide an overview of the disk traffic patterns and access characteristics for each of these applications. We then explore time-series characteristics and derive probabilistic models showing state transitions between locations on the data volumes for these applications. We then explore how these probabilistic models could be extended into a framework for synthetic benchmark generation for such applications. Finally, we discuss how this framework can be used for storage subsystem rightsizing for optimal scalability of such backend storage clusters.","PeriodicalId":387816,"journal":{"name":"2009 IEEE International Symposium on Workload Characterization (IISWC)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131698428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A communication characterisation of Splash-2 and Parsec","authors":"Nick Barrow-Williams, Christian Fensch, S. Moore","doi":"10.1109/IISWC.2009.5306792","DOIUrl":"https://doi.org/10.1109/IISWC.2009.5306792","url":null,"abstract":"Recent benchmark suite releases such as Parsec specifically utilise the tightly coupled cores available in chip-multiprocessors to allow the use of newer, high performance, models of parallelisation. However, these techniques introduce additional irregularity and complexity to data sharing and are entirely dependent on efficient communication performance between processors. This paper thoroughly examines the crucial communication and sharing behaviour of these future applications. The infrastructure used allows both accurate and comprehensive program analysis, employing a full Linux OS running on a simulated 32-core x86 machine. Experiments use full program runs, with communication classified at both core and thread granularities. Migratory, read-only and producer-consumer sharing patterns are observed and their behaviour characterised. The temporal and spatial characteristics of communication are presented for the full collection of Splash-2 and Parsec benchmarks. Our results aim to support the design of future communication systems for CMPs, encompassing coherence protocols, network-on-chip and thread mapping.","PeriodicalId":387816,"journal":{"name":"2009 IEEE International Symposium on Workload Characterization (IISWC)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121238911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rodinia: A benchmark suite for heterogeneous computing","authors":"Shuai Che, Michael Boyer, Jiayuan Meng, D. Tarjan, J. Sheaffer, Sang-Ha Lee, K. Skadron","doi":"10.1109/IISWC.2009.5306797","DOIUrl":"https://doi.org/10.1109/IISWC.2009.5306797","url":null,"abstract":"This paper presents and characterizes Rodinia, a benchmark suite for heterogeneous computing. To help architects study emerging platforms such as GPUs (Graphics Processing Units), Rodinia includes applications and kernels which target multi-core CPU and GPU platforms. The choice of applications is inspired by Berkeley's dwarf taxonomy. Our characterization shows that the Rodinia benchmarks cover a wide range of parallel communication patterns, synchronization techniques and power consumption, and has led to some important architectural insight, such as the growing importance of memory-bandwidth limitations and the consequent importance of data layout.","PeriodicalId":387816,"journal":{"name":"2009 IEEE International Symposium on Workload Characterization (IISWC)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122269023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IISWC 2009 reviewers","authors":"David August, L. Barnes, Pradeep Dubey, L. Eeckhout, P. Faraboschi, J. Held, M. Hind, Sunpyo Hong, Hillery Hunter, D. Kaeli, Hyesoon Kim, Minjang Kim, Yoongu Kim, Nagesh B. Lakshminarayana, Hsien-Hsin Lee, Jaekyu Lee, Charles Levine, M. Levy, J. Martínez, Onur Mutlu, Nacho Navarro, J. Paul, S. Patel, Y. Patt, E. Rotenberg, Ravi Soundararajan","doi":"10.1109/iiswc.2009.5306802","DOIUrl":"https://doi.org/10.1109/iiswc.2009.5306802","url":null,"abstract":"","PeriodicalId":387816,"journal":{"name":"2009 IEEE International Symposium on Workload Characterization (IISWC)","volume":"223 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133240371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}