Poster: Exploring Design Space of a 3D Stacked Vector Cache - Designing a 3D Stacked Vector Cache using Conventional EDA Tools
Ryusuke Egawa, J. Tada, Yusuke Endo, H. Takizawa, Hiroaki Kobayashi
2012 SC Companion: High Performance Computing, Networking Storage and Analysis, p. 1477. doi:10.1109/SC.Companion.2012.271

Abstract: Although 3D integration technologies with through-silicon vias (TSVs) are expected to overcome the memory and power wall problems in future microprocessor design, there are no promising EDA tools for designing 3D integrated VLSIs. In addition, the effects of 3D integration on microprocessor design have not been discussed thoroughly. Given this situation, this paper presents a design approach for 3D stacked cache memories using existing EDA tools, and shows an early performance evaluation of 3D stacked cache memories for vector processors.

Poster: PanDA: Next Generation Workload Management and Analysis System for Big Data
K. De, A. Klimentov, S. Panitkin, M. Titov, A. Vaniachine, T. Wenaus, D. Yu, G. Záruba
2012 SC Companion: High Performance Computing, Networking Storage and Analysis, p. 1523. doi:10.1109/SC.Companion.2012.302

Abstract: In the real world, any big-science project requires a sophisticated Workload Management System (WMS) to deal with huge amounts of highly distributed data, often accessed by large collaborations. The Production and Distributed Analysis System (PanDA) is a high-performance WMS aimed at meeting the production and analysis requirements of a data-driven workload management system capable of operating at the Large Hadron Collider data-processing scale. PanDA provides execution environments for a wide range of experimental applications, automates centralized data production and processing, enables the analysis activity of physics groups, supports the custom workflows of individual physicists, provides a unified view of distributed worldwide resources, presents the status and history of workflows through an integrated monitoring system, and archives and curates all workflows. PanDA, as a WMS already proven at extreme scales, is now being generalized and packaged for wider use by the Big Data community.

{"title":"Light-Weight Data Management Solutions for Visualization and Dissemination of Massive Scientific Datasets - Position Paper","authors":"G. Agrawal, Yunde Su","doi":"10.1109/SC.Companion.2012.157","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.157","url":null,"abstract":"Many of the `big-data' challenges today are arising from increasing computing ability, as data collected from simulations has become extremely valuable for a variety of scientific endeavors. With growing computational capabilities of parallel machines, scientific simulations are being performed at finer spatial and temporal scales, leading to a data explosion. As a specific example, the Global Cloud-Resolving Model (GCRM) currently has a grid-cell size of 4 km, and already produces 1 petabyte of data for a 10 day simulation. Future plans include simulations with a grid-cell size of 1 km, which will increase the data generation 64 folds. Finer granularity of simulation data offers both an opportunity and a challenge. On one hand, it can allow understanding of underlying phenomenon and features in a way that would not be possible with coarser granularity. On the other hand, larger datasets are extremely difficult to store, manage, disseminate, analyze, and visualize. Neither the memory capacity of parallel machines, memory access speeds, nor disk bandwidths are increasing at the same rate as computing power, contributing to the difficulty in storing, managing, and analyzing these datasets. Simulation data is often disseminated widely, through portals like the Earth System Grid (ESG), and downloaded by researchers all over the world. Such dissemination efforts are hampered by dataset size growth, as wide area data transfer bandwidths are growing at a much slower pace. Finally, while visualizing datasets, human perception is inherently limited.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"144 1","pages":"1296-1300"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78588974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building a Climatology of Mountain Gap Wind Jets and Related Coastal Upwelling","authors":"S. Graves, Xiang Li, K. Keiser, Deborah K. Smith","doi":"10.1109/SC.Companion.2012.71","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.71","url":null,"abstract":"Winds accelerating through coastal topology are capable of generating jets that often result in cold-water upwelling events in near-coast locations. In situ measurements are frequently not available in remote locations for many of the mountain gap locations globally, so to provide a record of these events for researchers, as well as military and commercial interests, this NASA-funded project is demonstrating how remotely sensed satellite data derived products, and fused model and observations, for wind and sea surface temperatures can be used to detect both wind jet and upwelling events. An algorithm was developed to automatically detect gap wind and ocean upwelling events at gulf regions of Central America using the Cross-Calibrated, Multi-Platform (CCMP) ocean surface wind product and the Optimally Interpolated Sea Surface Temperature (OISST) product. Hierarchical thresholding and region growing methods are used to extract regions of strong winds and temperature anomalies. A post processing step further links the detected events to generate time series of these events. Though developed for Central America regions, the algorithm is being extended to apply to other coastal regions so that detected event products are globally consistent. Through collaboration with the Global Hydrology Resource Center (GHRC), a NASA Distribute Active Archive Center, this project is analyzing large climate data records to generate a resulting climatology of wind jet and upwelling events at known geographic locations will be available as a resource for other researchers. Likewise, through integration of the project's analysis techniques with the GHRC's data ingest processing, the identification and notification of new or current events will likewise be openly available to research, commercial and military users. This paper provides a report on the preliminary results of applying the team's approach of identifying and capturing events for selected mountain gap jet locations.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"324 1","pages":"495-499"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76301127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Poster: HTCaaS: A Large-Scale High-Throughput Computing by Leveraging Grids, Supercomputers and Cloud
Seungwoo Rho, Seoyoung Kim, Sangwan Kim, Seokkyoo Kim, Jik-Soo Kim, Soonwook Hwang
2012 SC Companion: High Performance Computing, Networking Storage and Analysis, p. 1343. doi:10.1109/SC.Companion.2012.176

Abstract: We present the HTCaaS (High-Throughput Computing as a Service) system, which aims to let researchers easily explore large-scale and complex HTC problems by leveraging supercomputers, Grids, and clouds. HTCaaS hides the heterogeneity and complexity of harnessing different types of computing infrastructure from users, and efficiently submits a large number of jobs at once by effectively managing and exploiting all available computing resources. Our system has been integrated with national supercomputers in Korea, international computational Grids, and Amazon EC2, combining a vast amount of computing resources to support the most challenging scientific problems.

Reducing the De-linearization of Data Placement to Improve Deduplication Performance
Yujuan Tan, Zhichao Yan, D. Feng, E. Sha, Xiongzi Ge
2012 SC Companion: High Performance Computing, Networking Storage and Analysis, pp. 796-800. doi:10.1109/SC.Companion.2012.110

Abstract: Data deduplication is a lossless compression technology that replaces redundant data chunks with pointers to already-stored ones. Due to this intrinsic data-elimination feature, deduplication de-linearizes data placement and forces data chunks that belong to the same data object to be divided into multiple separate parts. Our preliminary study found that this de-linearization of data placement weakens the data spatial locality that some deduplication approaches rely on to improve data read performance, deduplication throughput, and efficiency, and thus significantly affects deduplication performance. In this paper, we first analyze the negative effect of the de-linearization of data placement on deduplication performance with examples and experimental evidence, and then propose an effective approach that reduces the de-linearization of data placement by sacrificing a small amount of compression ratio. Experimental evaluation driven by real-world datasets shows that our approach effectively reduces the de-linearization of data placement and enhances data spatial locality, significantly improving deduplication throughput, deduplication efficiency, and data read performance, at the cost of only a small loss in compression ratio.

{"title":"Software-Defined Networking for Big-Data Science - Architectural Models from Campus to the WAN","authors":"I. Monga, Eric Pouyoul, C. Guok","doi":"10.1109/SC.Companion.2012.341","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.341","url":null,"abstract":"University campuses, Supercomputer centers and R&E networks are challenged to architect, build and support IT infrastructure to deal effectively with the data deluge facing most science disciplines. Hybrid network architecture, multi-domain bandwidth reservations, performance monitoring and GLIF Open Lightpath Exchanges (GOLE) are examples of network architectures that have been proposed, championed and implemented successfully to meet the needs of science. Most recently, Science DMZ, a campus design pattern that bypasses traditional performance hotspots in typical campus network implementation, has been gaining momentum. In this paper and corresponding demonstration, we build upon the SC11 SCinet Research Sandbox demonstrator with Software-Defined networking to explore new architectural approaches. A virtual switch network abstraction is explored, that when combined with software-defined networking concepts provides the science users a simple, adaptable network framework to meet their upcoming application requirements.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"37 1","pages":"1629-1635"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73835104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Poster: Evaluating Asynchrony in Gibraltar RAID's GPU Reed-Solomon Coding Library","authors":"Xin Zhou, A. Skjellum, M. Curry","doi":"10.1109/SC.Companion.2012.285","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.285","url":null,"abstract":"GPU has been utilized for Reed Solomon coding tasks for arbitrary (k+n) RAID system. In this project, we apply the asynchronous design with CUDA in order to run multiple coding tasks simultaneously. Results show significant performance boosts for small block coding tasks by concurrent kernels.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"39 1","pages":"1498-1498"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85062476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Abstract: Preliminary Report for a High Precision Distributed Memory Parallel Eigenvalue Solver","authors":"Toshiyuki Imamura, S. Yamada, M. Machida","doi":"10.1109/SC.Companion.2012.255","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.255","url":null,"abstract":"This study covers the design and implementation of a DD (double-double) extended parallel eigenvalue solver, namely QPEigenK. We extended most of underlying numerical software layers from BLAS, LAPACK, and ScaLAPACK as well as MPI. Preliminary results show that QPEigenK performs on several platforms, and shows good accuracy and parallel efficiency. We can conclude that the DD format is reasonable data format instead of real (16) format from the viewpoint of programming and performance.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"85 1","pages":"1454-1455"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82067821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Crayons: An Azure Cloud Based Parallel System for GIS Overlay Operations","authors":"Dinesh Agarwal","doi":"10.1109/SC.Companion.2012.315","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.315","url":null,"abstract":"Processing of extremely large polygonal (vector-based) spatial datasets has been a long-standing research challenge for scientists in the Geographic Information Systems and Science (GIS) community. Surprisingly, it is not for the lack of individual parallel algorithm; we discovered that the irregular and data intensive nature of the underlying processing is the main reason for the meager amount of work by way of system design and implementation. Furthermore, of all the systems reported in the literature, very few deal with the complexities of vector-based datasets and none, including commercial systems, on the cloud platform. We have designed and implemented an open-architecture-based system named Crayons for Windows Azure cloud platform using state-of-the-art techniques. We have implemented three different architectures of Crayons with different load balancing schemes. Crayons scales well for sufficiently large data sets, achieving end-to-end absolute speedup of over 28-fold employing 100 Azure processors. For smaller and more irregular workload, it still yields over 10-fold speedup.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"312 1","pages":"1542-1543"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79634756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}