{"title":"Enhancing File Transfer Scheduling and Server Utilization in Data Distribution Infrastructures","authors":"Daniel Higuero, Juan M. Tirado, Florin Isaila, J. Carretero","doi":"10.1109/MASCOTS.2012.55","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.55","url":null,"abstract":"This paper presents a methodology for efficiently solving the file transfer scheduling problem in a distributed environment. Our solution is based on the relaxation of an objective-based time-indexed formulation of a linear programming problem. The main contributions of this paper are the following. First, we introduce a novel approach to the relaxation of the time-indexed formulation of the transfer scheduling problem in multi-server and multi-user environments. Our solution consists of reducing the complexity of the optimization by transforming it into an approximation problem, whose proximity to the optimal solution can be controlled depending on practical and computational needs. Second, we present a distributed deployment of our methodology, which leverages the inherent parallelism of the divide-and-conquer approach in order to speed up the solving process. Third, we demonstrate that our methodology is able to considerably reduce the schedule length and idle time in a computationally tractable way.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126256502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
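A toy illustration of the time-indexed view the abstract describes (not the paper's actual LP): variables x[j][t] say how many units of file j are sent in time slot t, and a server can move `capacity` units per slot. A greedy earliest-slot heuristic, sketched here under our own naming, yields a feasible schedule whose length upper-bounds what an exact time-indexed optimization could achieve.

```python
# Toy time-indexed transfer schedule (a sketch, not the paper's formulation).
def schedule(files, capacity):
    """files: dict name -> size (units). Returns (x, makespan)."""
    x = {}        # x[(j, t)] = units of file j sent in slot t
    free = []     # residual server capacity per slot, grown on demand
    for name, size in sorted(files.items(), key=lambda kv: -kv[1]):
        remaining, t = size, 0
        while remaining > 0:
            if t == len(free):
                free.append(capacity)
            send = min(remaining, free[t])
            if send:
                x[(name, t)] = send
                free[t] -= send
                remaining -= send
            t += 1
    makespan = 1 + max(t for (_, t) in x)
    return x, makespan
```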
{"title":"A Measurement Study of Network Coding in Peer-to-Peer Video-on-Demand Systems","authors":"Saikat Sarkar, Mea Wang","doi":"10.1109/MASCOTS.2012.54","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.54","url":null,"abstract":"In recent years, Peer-to-Peer (P2P) multimedia streaming has become an alternative to cable/satellite TV services. Many P2P streaming applications further provide users with DVD-like operations: play, pause, chapter selection, fast-forward, and rewind. Such real-time interactive multimedia streaming is commonly referred to as P2P Video-on-Demand (VoD). To further improve the streaming quality, recent research employs network coding as the key enabling technology. However, the practicality and implementation challenges of network coding have received very little attention. In this paper, we present a practical implementation of network coding in a P2P VoD system, VoD+NC, based on which we conduct a measurement study on the actual performance gain provided by network coding. In the meantime, we identify design pitfalls when incorporating network coding into a P2P VoD system. Our study shows that, unlike P2P live streaming, directly applying network coding to a P2P VoD system does not necessarily lead to an immediate improvement in playback quality. With the proper configuration, network coding not only brings the same benefits as it does in P2P live streaming, but also better accommodates the asymmetric interests among peers and simplifies the neighbourhood management.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114832814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
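A minimal sketch of the primitive behind VoD+NC, random linear network coding, here over GF(2) for brevity (the real system codes within video segments and uses its own parameters; all names below are ours). Each coded block is an XOR of a random subset of the original blocks, tagged with its coefficient vector; a peer decodes once it has collected enough linearly independent combinations, via Gaussian elimination.

```python
import random

def encode(blocks):
    """Return (coefficient vector over GF(2), XOR-combined payload)."""
    coeffs = [random.randint(0, 1) for _ in blocks]
    payload = 0
    for c, b in zip(coeffs, blocks):
        if c:
            payload ^= b
    return coeffs, payload

def decode(coded, n):
    """Gauss-Jordan elimination over GF(2) on (coeffs, payload) rows.
    Returns the n original blocks, or None if rank is still too low."""
    rows = [list(c) + [p] for c, p in coded]
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None   # not enough independent combinations yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]
```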
{"title":"Evolutionary Trends in a Supercomputing Tertiary Storage Environment","authors":"J. Frank, E. L. Miller, I. Adams, Daniel C. Rosenthal","doi":"10.1109/MASCOTS.2012.53","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.53","url":null,"abstract":"Tracking archival usage and data migration in a long term supercomputing system is critical to understanding not only how users' needs and habits have changed over time, but also how the archive itself evolves in response to these external factors. Yet this type of study has not previously been performed. To address this need, we conducted an in-depth comparison of user initiated file activity on the mass storage system (MSS) at the National Center for Atmospheric Research (NCAR) during two periods, one in the early 1990s, and another nearly twenty years later. In addition to confirming earlier findings, our analysis turned up three surprising results. First, the read:write ratio went from 2:1 in the earlier trace to 1:2 in the later trace, a reduction of a factor of four in reads relative to writes. Second, only 30% of the current archive was accessed during the three year period of the study, in stark contrast to the 80% seen in the 1992 trace analysis. Third, access latency to the first byte of data actually got slower despite much faster computers and storage devices. These findings indicate that archival behavior has shifted towards a write-heavy workload, and that future archives can be more optimized for write activity than previously believed. Furthermore, it may be worth considering the value of data being archived when it is stored, since later retrieval is increasingly less likely.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114849813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
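The two headline metrics of the NCAR study can be computed from any access trace. A sketch over a hypothetical mini-trace of (operation, file_id) records, with our own field names:

```python
from collections import Counter

def trace_stats(trace, archive_size):
    """Return (read:write ratio, fraction of archive touched)."""
    ops = Counter(op for op, _ in trace)
    touched = len({f for _, f in trace})   # distinct files accessed
    return ops["read"] / ops["write"], touched / archive_size
```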
{"title":"Cooperative Storage-Level De-duplication for I/O Reduction in Virtualized Data Centers","authors":"Min Li, Shravan Gaonkar, A. Butt, Deepak Kenchammana, K. Voruganti","doi":"10.1109/MASCOTS.2012.33","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.33","url":null,"abstract":"Data centers are increasingly being re-designed for workload consolidation in order to reap the benefits of better resource utilization, power savings, and physical space savings. Among the forces driving savings are server and storage virtualization technologies. As more consolidated workloads are concentrated on physical machines -- e.g., the virtual density is already very high in virtual desktop environments, and will be driven to unprecedented levels with the fast growing high core counts of physical servers -- the shared storage layer must respond with virtualization innovations of its own such as de-duplication and thin provisioning. A key insight of this paper is that there is a greater synergy between the two layers of storage and server virtualization to exploit block sharing information than was previously thought possible. We reveal this by developing a systematic framework to explore the interactions between storage and virtualization servers. We also quantitatively evaluate the I/O bandwidth and latency reduction that is possible between virtual machine hosts and storage servers using real-world trace-driven simulation. Moreover, we present a proof of concept NFS implementation that incorporates our techniques to quantify their I/O latency benefits.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114986033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
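Content-addressed block sharing is the primitive the paper's storage/server cooperation builds on: identical blocks across virtual machine images hash to one fingerprint, so a shared block need only be stored and fetched once. A minimal sketch with assumed names and block size:

```python
import hashlib

def dedup_index(volumes, block_size=4096):
    """volumes: dict name -> bytes. Returns (fingerprint index, dup count)."""
    index = {}      # fingerprint -> first (volume, offset) holding the block
    saved = 0       # duplicate blocks that need not be stored/fetched again
    for vol, data in volumes.items():
        for off in range(0, len(data), block_size):
            fp = hashlib.sha256(data[off:off + block_size]).hexdigest()
            if fp in index:
                saved += 1
            else:
                index[fp] = (vol, off)
    return index, saved
```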
{"title":"Multi-armed Bandit Congestion Control in Multi-hop Infrastructure Wireless Mesh Networks","authors":"A. A. Islam, S. Alam, V. Raghunathan, S. Bagchi","doi":"10.1109/MASCOTS.2012.14","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.14","url":null,"abstract":"Congestion control in multi-hop infrastructure wireless mesh networks is both an important and a unique problem. It is unique because it has two prominent causes of failed transmissions which are difficult to tease apart: the lossy nature of the wireless medium and the high degree of congestion around gateways in the network. The concurrent presence of these two causes limits the applicability of existing congestion control mechanisms proposed for wireless networks. Prior mechanisms mainly focus on the former cause, ignoring the latter one. We therefore address this issue by designing an end-to-end congestion control mechanism for infrastructure wireless mesh networks in this paper. We formulate the congestion control problem and map it to the restless multi-armed bandit problem, a well-known decision problem in the literature. Then, we propose three myopic policies to achieve a near-optimal solution for the mapped problem, since no optimal solution is known for this problem. We perform comparative evaluation through ns-2 simulation and a real testbed experiment with a wireline TCP variant and a wireless TCP protocol. The evaluation reveals that our proposed mechanism can achieve up to 52% increased network throughput and 34% decreased average energy consumption per transmitted bit in comparison to the other end-to-end congestion control variants.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126237074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
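A myopic bandit policy, in the flavor the abstract names, picks at each step the arm with the best current reward estimate. A toy version under our own framing (arms as candidate sending rates, reward as goodput; the paper's rewards are stochastic and state-dependent, and its restless formulation is richer than this sketch):

```python
def myopic_play(rewards, rounds):
    """rewards: list of per-arm reward functions of time. Returns pull counts."""
    est = [None] * len(rewards)    # running mean reward per arm
    pulls = [0] * len(rewards)
    for t in range(rounds):
        untried = [a for a, e in enumerate(est) if e is None]
        # Myopic choice: best current estimate (after trying each arm once).
        arm = untried[0] if untried else max(range(len(est)), key=lambda a: est[a])
        r = rewards[arm](t)
        pulls[arm] += 1
        est[arm] = r if est[arm] is None else est[arm] + (r - est[arm]) / pulls[arm]
    return pulls
```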
{"title":"Analysis of Prediction and Replacement Algorithms Applied to Real Workload for Storage Devices","authors":"I. S. Sette, Bruno Cartaxo, Thun Pin T. F. Chiu, A. Silva-Filho, R. Assad, José Dirceu G. Ramos, H. Coutinho","doi":"10.1109/MASCOTS.2012.67","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.67","url":null,"abstract":"This work evaluates cache algorithms for block devices in terms of hit rate for different replacement and prediction algorithms, and also focuses on choosing suitable replacement and prediction strategies to be implemented in a storage array caching solution employing solid-state drives. As a case study, a real workload was continuously collected from a web proxy server network. These replacement and prediction algorithms were evaluated in detail and compared using SCSI command traces collected from the Linux kernel. Comparison results for the replacement algorithms LRU, CLOCK, LRFU and LRU-WAR applied to the real workload indicate that LRU obtained good results in the majority of the cases. The GHB and ReadAhead prefetching algorithms were also integrated, and improvements of about 5.5% were achieved when ReadAhead was used.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125115867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
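The hit-rate methodology the comparison rests on can be reproduced for LRU in a few lines: replay a block trace through a fixed-size cache and count hits (the trace and capacity below are made up for illustration):

```python
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Replay a block-access trace through an LRU cache; return hit rate."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict least recently used
            cache[block] = True
    return hits / len(trace)
```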
{"title":"Assuring Demanded Read Performance of Data Deduplication Storage with Backup Datasets","authors":"Youngjin Nam, Dongchul Park, D. Du","doi":"10.1109/MASCOTS.2012.32","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.32","url":null,"abstract":"Data deduplication has been widely adopted in contemporary backup storage systems. It not only saves storage space considerably, but also shortens the data backup time significantly. Since the major goal of the original data deduplication lies in saving storage space, its design has been focused primarily on improving write performance by removing as much duplicate data as possible from incoming data streams. Although fast recovery from a system crash relies mainly on read performance provided by deduplication storage, little investigation into read performance improvement has been made. In general, as the amount of deduplicated data increases, write performance improves accordingly, whereas associated read performance becomes worse. In this paper, we propose a novel deduplication scheme that assures demanded read performance of each data stream while achieving its write performance at a reasonable level, ultimately being able to guarantee a target system recovery time. For this, we first propose an indicator called cache aware Chunk Fragmentation Level (CFL) that estimates degraded read performance on the fly by taking into account both incoming chunk information and read cache effects. We also show a strong correlation between this CFL and read performance in the backup datasets. In order to guarantee demanded read performance expressed in terms of a CFL value, we propose a read performance enhancement scheme called selective duplication that is activated whenever the current CFL becomes worse than the demanded one. The key idea is to judiciously write non-unique (shared) chunks into storage together with unique chunks unless the shared chunks exhibit good enough spatial locality. We quantify the spatial locality by using a selective duplication threshold value. Our experiments with the actual backup datasets demonstrate that the proposed scheme achieves demanded read performance in most cases at the reasonable cost of write performance.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130694094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
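A heavily simplified fragmentation indicator in the spirit of the paper's CFL: the ratio of the optimal number of container reads (the stream's chunks packed contiguously) to the containers actually visited while replaying the stream through a small read cache. The container size, cache policy, and names below are our simplifications, not the paper's exact definition:

```python
def cfl(chunk_containers, chunks_per_container, cache_size):
    """chunk_containers: container id holding each chunk of a stream,
    in read order. Returns a fragmentation level; 1.0 = unfragmented."""
    optimal = -(-len(chunk_containers) // chunks_per_container)  # ceil division
    cached, actual = [], 0
    for c in chunk_containers:
        if c not in cached:
            actual += 1                 # container miss: one real read
            cached.append(c)
            if len(cached) > cache_size:
                cached.pop(0)           # evict oldest cached container
        else:
            cached.remove(c)
            cached.append(c)            # refresh on hit (LRU-like)
    return optimal / actual
```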
{"title":"Energy-Aware Replica Selection for Data-Intensive Services in Cloud","authors":"Bo Li, S. Song, Ivona Bezáková, K. Cameron","doi":"10.1109/MASCOTS.2012.66","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.66","url":null,"abstract":"With the increasing energy cost in data centers, an energy-efficient approach to providing data-intensive services in the cloud is in high demand. This paper addresses the energy cost reduction problem of data centers by formulating an energy-aware replica selection problem in order to guide the distribution of workload among data centers. The current popular centralized replica selection approaches address this problem but lack scalability and are vulnerable to a crash of the central coordinator. Also, they do not take total data center energy cost as the primary optimization target. We propose a simple decentralized replica selection system implemented with two distributed optimization algorithms (a consensus-based distributed projected subgradient method and a Lagrangian dual decomposition method) to work with clients as a decentralized coordinator. We also compare our energy-aware replica selection approach with replica selection using a round-robin algorithm. A prototype of the decentralized replica selection system is designed and developed to collect energy consumption information of data centers. The results show that the total energy cost can be effectively reduced by using our decentralized replica selection system compared with a round-robin method. It also has low computation and communication overhead and can easily be adapted to real-world cloud environments.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131934121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
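The baseline comparison in the abstract reduces to a simple cost model: round-robin spreads requests evenly over replicas regardless of energy cost, while an energy-aware policy prefers the cheapest data center. A toy greedy sketch with static per-request costs of our own invention (the paper itself uses distributed subgradient and dual decomposition methods, not this greedy rule):

```python
def assign(requests, costs, policy):
    """costs[d] = energy cost per request served at data center d.
    Returns total energy cost under the given selection policy."""
    n = len(costs)
    total = 0.0
    for i in range(requests):
        if policy == "round-robin":
            d = i % n                                  # spread evenly
        else:
            d = min(range(n), key=costs.__getitem__)   # energy-aware: cheapest
        total += costs[d]
    return total
```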
{"title":"Energy-efficient Resource Management for QoS-guaranteed Computing Clusters","authors":"Kaiqi Xiong","doi":"10.1109/MASCOTS.2012.70","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.70","url":null,"abstract":"A cluster computing system in data centers not only improves service availability and performance but also increases power consumption. It is a challenge to increase the performance of a cluster computing system and reduce its power consumption simultaneously. MapReduce has recently emerged for data-intensive parallel computing. It is a programming model for processing large data sets. Implementations of MapReduce typically run on large-scale cluster computing systems consisting of thousands of commodity machines, commonly called MapReduce clusters, which results in high power consumption, a major concern for service providers such as Amazon and Yahoo. In this research, we consider a collection of cluster computing resources owned by a service provider to host an enterprise application for business customers. We investigate the problem of resource allocation for power management in MapReduce clusters. Specifically, we propose resource allocation approaches that minimize the mean end-to-end delay of customer jobs or services under constraints on the energy consumption and availability of MapReduce clusters, and that minimize the energy consumption of MapReduce clusters under constraints on their availability and on the mean end-to-end delay of customer jobs or services, which plays an essential role in the delivery of quality of service (QoS) for customer services. Numerical experiments demonstrate that the proposed approaches are applicable and efficient for solving these resource allocation problems for power management in MapReduce clusters.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127681433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
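A back-of-envelope version of the second problem the abstract states (minimize energy subject to a delay bound), under two assumptions of ours: energy is proportional to service capacity, and the cluster behaves like an M/M/1 queue, whose mean sojourn time is T = 1/(mu - lambda). The delay constraint then pins down the minimum capacity directly:

```python
def min_capacity(arrival_rate, delay_target):
    """Smallest service rate mu meeting a mean-delay target under an M/M/1
    approximation: T = 1 / (mu - lambda)  =>  mu = lambda + 1 / T."""
    if delay_target <= 0:
        raise ValueError("delay target must be positive")
    return arrival_rate + 1.0 / delay_target
```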
{"title":"SMART-IO: SysteM-AwaRe Two-Level Data Organization for Efficient Scientific Analytics","authors":"Yuan Tian, S. Klasky, Weikuan Yu, H. Abbasi, Bin Wang, N. Podhorszki, R. Grout, M. Wolf","doi":"10.1109/MASCOTS.2012.30","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.30","url":null,"abstract":"Current I/O techniques have pushed write performance close to the system peak, but they usually overlook the read side of the problem. With the mounting needs of scientific discovery, it is important to provide good read performance for many common access patterns. Such demand requires an organization scheme that can effectively utilize the underlying storage system. However, the mismatch between conventional data layout on disk and common scientific access patterns leads to significant performance degradation when a subset of data is accessed. To this end, we design a system-aware Optimized Chunking model, which aims to find an optimized organization that can strike a good balance between data transfer efficiency and processing overhead. To enable such a model for scientific applications, we propose SMART-IO, a two-level data organization framework that can organize the blocks of multidimensional data efficiently. This scheme can adapt data layouts based on data characteristics and underlying storage systems, and enable efficient scientific analytics. Our experimental results demonstrate that SMART-IO can significantly improve the read performance for challenging access patterns, and speed up data analytics. For the mission-critical combustion simulation code S3D, SMART-IO achieves up to 72 times speedup for planar reads of a 3-D variable compared to the logically contiguous data layout.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115383748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
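The layout mismatch the abstract describes is easy to quantify with a toy seek-count model (our simplification, not SMART-IO's actual cost model): count the contiguous disk segments touched when reading one plane of an n x n x n row-major array, with and without cubic chunking. Reading a plane normal to the fastest-varying axis of a contiguous layout touches n^2 length-1 segments, while a chunked layout touches only one segment per intersected chunk.

```python
def planar_segments(n, axis, chunk=None):
    """Contiguous segments read for one plane normal to `axis`
    (0 = slowest-varying, 2 = fastest-varying) of an n^3 row-major array.
    With cubic chunks of edge `chunk` (dividing n), each intersected
    chunk counts as one segment."""
    if chunk:
        return (n // chunk) ** 2        # chunks intersected by the plane
    return (1, n, n * n)[axis]          # row-major contiguity per axis
```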