{"title":"Design and implementation of control sequence generator for SDN-enhanced MPI","authors":"Baatarsuren Munkhdorj, Keichi Takahashi, Khureltulga Dashdavaa, Yasuhiro Watashiba, Y. Kido, S. Date, S. Shimojo","doi":"10.1145/2832099.2832103","DOIUrl":"https://doi.org/10.1145/2832099.2832103","url":null,"abstract":"MPI (Message Passing Interface) offers a suite of APIs for inter-process communication among parallel processes. We have worked on accelerating MPI collective communication such as MPI_Bcast and MPI_Allreduce, taking advantage of the network programmability brought by Software Defined Networking (SDN). The basic idea is to allow an SDN controller to dynamically control the packet flows generated by MPI collective communication based on the communication pattern and the underlying network conditions. Although our research has succeeded in accelerating an MPI collective communication in terms of execution time, the switching of network control functionality for MPI collective communication along MPI program execution has not yet been considered. This paper presents a mechanism that provides the control sequence for the SDN controller to control packet flows based on the communication plan for the entire MPI application. The control sequence contains a chronologically ordered list of the MPI collectives executed in the MPI application and the process-related information of each entry in the list. To verify whether the SDN-enhanced MPI collectives can be used in combination with the proposed mechanism, the envisioned environment was prototyped. The prototype showed that the SDN-enhanced MPI collectives could be used in combination.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128519033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hysteresis-based optimization of data transfer throughput","authors":"M. S. Q. Z. Nine, Kemal Guner, T. Kosar","doi":"10.1145/2832099.2832104","DOIUrl":"https://doi.org/10.1145/2832099.2832104","url":null,"abstract":"The achievable throughput for a data transfer can be determined by a variety of factors such as network bandwidth, round trip time, background traffic, dataset size, and end-system configuration. For the best-effort optimization of the transfer throughput, three application-layer transfer parameters -- pipelining, parallelism and concurrency -- have been actively used in the literature. However, it is highly challenging to identify the best combination of these parameter settings for a specific data transfer request. In this paper, we analyze historical data consisting of 70 million file transfers; apply data mining techniques to extract the hidden relations among the parameters and the optimal throughput; and propose a novel approach based on hysteresis to predict the optimal parameter settings.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127423505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Managing scientific data with named data networking","authors":"Chengyu Fan, Susmit Shannigrahi, S. DiBenedetto, C. Olschanowsky, C. Papadopoulos, H. Newman","doi":"10.1145/2832099.2832100","DOIUrl":"https://doi.org/10.1145/2832099.2832100","url":null,"abstract":"Many scientific domains, such as climate science and High Energy Physics (HEP), have data management requirements that are not well supported by the IP network architecture. Named Data Networking (NDN) is a new network architecture whose service model is better aligned with the needs of data-oriented applications. NDN provides features such as best-location retrieval, caching, load sharing, and transparent failover that would otherwise be painstakingly (re-)implemented by each application using point-to-point semantics in an IP network. We present the first scientific data management application designed and implemented on top of NDN. We use this application to manage climate and HEP data over a dedicated, high-performance testbed. Our application has two main components: a UI for dataset discovery queries and a federation of synchronized name catalogs. We show how NDN primitives can be used to implement common data management operations such as publishing, search, efficient retrieval, and publication access control.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133957234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-domain SDN for dynamic layer-2 path service","authors":"S. Tepsuporn, Fatma Alali, M. Veeraraghavan, Xiang Ji, Brian Cashman, Andrew J. Ragusa, Luke Fowler, C. Guok, T. Lehman, Xi Yang","doi":"10.1145/2832099.2832101","DOIUrl":"https://doi.org/10.1145/2832099.2832101","url":null,"abstract":"This paper describes our experience in deploying a multi-domain Software-Defined Network (SDN) that supports dynamic Layer-2 (L2) path service, and offers insights gained from this experience. SDN controllers, capable of handling requests for advance reservation and provisioning of rate-guaranteed L2 paths, were deployed in each domain. The experience demonstrated that this architecture can support global-scale multi-domain dynamic L2 path service. However, to reach this scale, better tools are required for diagnostics of end-to-end L2 connectivity, and better error-reporting functionality is needed from the SDN controllers. As a use case for rate-guaranteed L2 path service, we experimented with high-speed transfers of large datasets. We found that a combination of Circuit TCP (CTCP), in which the sending rate is held fixed, and a rate shaper based on a token bucket filter at the sending host is best for achieving near-zero-loss, high-throughput transfers across L2 paths. Detailed studies were conducted to understand the impact of the rate-shaper and CTCP parameters and to find the best settings.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"316 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132187140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate causal consistency for partially replicated geo-replicated cloud storage","authors":"A. Kshemkalyani, T. Hsu","doi":"10.1145/2832099.2832102","DOIUrl":"https://doi.org/10.1145/2832099.2832102","url":null,"abstract":"In geo-replicated systems and the cloud, data replication provides fault tolerance and low latency. Causal consistency in such systems is an interesting consistency model. Most existing works assume the data is fully replicated because this greatly simplifies the design of the algorithms to implement causal consistency. Recently, we proposed causal consistency under partial replication because it reduces the number of messages used under a wide range of workloads. One drawback of partial replication is that its meta-data tends to be relatively large when the message size is small. In this paper, we propose approximate causal consistency whereby we can reduce the meta-data at the cost of some violations of causal consistency. The number of violations can be made arbitrarily small by controlling a tunable parameter that we call credits.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127984304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network-aware virtual machine consolidation for large data centers","authors":"Dharmesh Kakadia, N. Kopri, Vasudeva Varma","doi":"10.1145/2534695.2534702","DOIUrl":"https://doi.org/10.1145/2534695.2534702","url":null,"abstract":"Resource management in modern data centers has become a challenging task due to the tremendous growth of data centers. In large virtual data centers, the performance of applications is highly dependent on the communication bandwidth available among virtual machines. Traditional algorithms either do not consider the network I/O details of the applications or are computationally intensive. We address the problem of identifying virtual machine clusters based on network traffic and placing them intelligently in order to improve application performance and optimize network usage in large data centers. We propose a greedy consolidation algorithm that ensures the number of migrations is small and the placement decisions are fast, which makes it practical for large data centers. We evaluated our approach in simulation on real-world traces from private and academic data centers, and compared it with existing algorithms on parameters such as scheduling time, performance improvement, and number of migrations. We observed savings of ~70% in interconnect bandwidth and an overall improvement of ~60% in application performance. Moreover, these improvements were achieved with a fraction of the scheduling time and number of migrations.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132400998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"End-to-end data movement using MPI-IO over routed terabits infrastructures","authors":"Geoffroy R. Vallée, S. Atchley, Youngjae Kim, G. Shipman","doi":"10.1145/2534695.2534705","DOIUrl":"https://doi.org/10.1145/2534695.2534705","url":null,"abstract":"Scientific discovery is nowadays driven by large-scale simulations running on massively parallel high-performance computing (HPC) systems. These applications each generate a large amount of data, which then needs to be post-processed, for example for data mining or visualization. Unfortunately, the computing platform used for post-processing might be different from the one on which the data is initially generated, introducing the challenge of moving large amounts of data between computing platforms. This is especially challenging when these two platforms are geographically separated, since the data needs to be moved between computing facilities. This is even more critical when scientists tightly couple their domain-specific applications with a post-processing application. The paper presents a solution for the data transfer between MPI applications using a dedicated wide area network (WAN) terabit infrastructure. The proposed solution is based on parallel access to data files and the Message Passing Interface (MPI) over the Common Communication Infrastructure (CCI) for the data transfer over a routed infrastructure. In the context of this research, the Energy Sciences Network (ESnet) of the U.S. Department of Energy (DOE) is targeted for the transfer of data between DOE national laboratories.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131576135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In-network, push-based network resource monitoring: scalable, responsive network management","authors":"Taylor L. Groves, D. Arnold, Yihua He","doi":"10.1145/2534695.2534704","DOIUrl":"https://doi.org/10.1145/2534695.2534704","url":null,"abstract":"We present preliminary work from our experiences with distributed, push-based monitoring of networks at Yahoo!. Network switches have grown beyond mere ASICs into machines which support unmodified Linux kernels and familiar user interfaces. These advances have enabled a paradigm shift in network monitoring. In lieu of traditional approaches where network diagnostics were delivered via SNMP, we utilize Sysdb of Arista's EOS to implement a push-based approach to network monitoring. This leaves the individual switches in charge of determining what monitoring data to send and when to send it. With this approach, on-switch collection, dissemination, and analysis of interfaces and protocols become possible. This push-based approach shortens the feedback loop of network diagnostics and enables network-aware applications, middleware and resource managers to have access to the freshest available data. Our work utilizes the OpenTSDB monitoring framework to provide a scalable back-end for accessing and storing real-time statistics delivered by on-switch collection agents. OpenTSDB is built on top of Hadoop/HBase, which handles the underlying access and storage for the monitoring system. We wrote two collection agents as prototypes to explore the framework and demonstrate the benefits of push-based network monitoring.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128573280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient wide area data transfer protocols for 100 Gbps networks and beyond","authors":"E. Kissel, D. M. Swany, B. Tierney, Eric Pouyoul","doi":"10.1145/2534695.2534699","DOIUrl":"https://doi.org/10.1145/2534695.2534699","url":null,"abstract":"Due to a number of recent technology developments, now is the right time to re-examine the use of TCP for very large data transfers. These developments include the deployment of 100 Gigabit per second (Gbps) network backbones, hosts that can easily manage data transfers of 40 Gbps and higher, the Science DMZ model, the availability of virtual circuit technology, and wide-area Remote Direct Memory Access (RDMA) protocols. In this paper we show that RDMA works well over wide-area virtual circuits, and uses much less CPU than TCP or UDP. We also characterize the limitations of RDMA in the presence of other traffic, including competing RDMA flows. We conclude that RDMA for Science DMZ to Science DMZ transfers of massive data is a viable and desirable option for high-performance data transfer.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123942179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Characterizing the impact of end-system affinities on the end-to-end performance of high-speed flows","authors":"Nathan Hanford, V. Ahuja, Mehmet Balman, M. Farrens, D. Ghosal, Eric Pouyoul, B. Tierney","doi":"10.1145/2534695.2534697","DOIUrl":"https://doi.org/10.1145/2534695.2534697","url":null,"abstract":"Multi-core end-systems use Receive Side Scaling (RSS) to parallelize protocol processing. RSS uses a hash function on the standard flow descriptors and an indirection table to assign incoming packets to receive queues which are pinned to specific cores. This ensures flow affinity in that the interrupt processing of all packets belonging to a specific flow is processed by the same core. A key limitation of standard RSS is that it does not consider the application process that consumes the incoming data in determining the flow affinity. In this paper, we carry out a detailed experimental analysis of the performance impact of the application affinity in a 40 Gbps testbed network with a dual hexa-core end-system. We show, contrary to conventional wisdom, that when the application process and the flow are affinitized to the same core, the performance (measured in terms of end-to-end TCP throughput) is significantly lower than the line rate. Near line rate performance is observed when the flow and the application process are affinitized to different cores belonging to the same socket. Furthermore, affinitizing the application and the flow to cores on different sockets results in significantly lower throughput than the line rate. These results arise due to the memory bottleneck, which is demonstrated using preliminary correlational data on the cache hit rate in the core that services the application process.","PeriodicalId":108576,"journal":{"name":"Network-aware Data Management","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115925305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}