{"title":"Genesis: a language for generating synthetic training programs for machine learning","authors":"A. Chiu, Joseph Garvey, T. Abdelrahman","doi":"10.1145/2742854.2742883","DOIUrl":"https://doi.org/10.1145/2742854.2742883","url":null,"abstract":"We describe Genesis, a language for the generation of synthetic programs for use in machine learning-based performance auto-tuning. The language allows users to annotate a template program to customize its code using statistical distributions and to generate program instances based on those distributions. This effectively allows users to generate training programs whose characteristics or features vary in a statistically controlled fashion. We describe the language constructs, a prototype preprocessor for the language, and three case studies that show the ability of Genesis to express a range of training programs in different domains. We evaluate the preprocessor's performance and the statistical quality of the samples it generates. We believe that Genesis is a useful tool for generating large and diverse sets of programs, a necessary component when training machine learning models for auto-tuning.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"416 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122792638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AMTCP: an adaptive multi-path transmission control protocol","authors":"Long Li, Nongda Hu, Ke Liu, Binzhang Fu, Mingyu Chen, Lixin Zhang","doi":"10.1145/2742854.2742871","DOIUrl":"https://doi.org/10.1145/2742854.2742871","url":null,"abstract":"Enabling multiple paths in datacenter networks is a common practice to improve the performance and robustness. Multi-path TCP (MPTCP) explores multiple paths by splitting a single flow into multiple subflows. The number of the subflows in MPTCP is determined before a connection is established, and it usually remains unchanged during the lifetime of that connection. While MPTCP improves both bandwidth efficiency and network reliability, more subflows incur additional overhead, especially for small (so-called mice) subflows. Additionally, it is difficult to choose the appropriate number of the subflows for each TCP connection to achieve good performance without incurring significant overhead. To address this problem, we propose an adaptive multi-path transmission control protocol, namely the AMTCP, which dynamically adjusts the number of the subflows according to application workloads. Specifically, AMTCP divides the time into small intervals and measures the throughput of each subflow over the latest interval, then adjusts the number of the subflows dynamically with the goal of reducing resource and scheduling overheads for mice flows and achieving a higher throughput for elephant flows. Our evaluations show that AMTCP increases the throughput by over 30% compared to conventional TCP. Meanwhile, AMTCP decreases the average number of the subflows by more than 37.5% while achieving a similar throughput compared to MPTCP.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122928581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An evaluation and analysis of graph processing frameworks on five key issues","authors":"Yun Gao, W. Zhou, Jizhong Han, Dan Meng, Zhang Zhang, Zhiyong Xu","doi":"10.1145/2742854.2742884","DOIUrl":"https://doi.org/10.1145/2742854.2742884","url":null,"abstract":"With the continuously emerging applications in fields like social media analysis, mining massive graphs has drawn increasing attention from industry and academia. To aid the development of distributed graph algorithms, various programming frameworks have been proposed. To better understand their performance differences under specific scenarios, we analyzed and compared a set of seven representative frameworks under five design aspects, including distribution policy, on-disk data organization, programming model, synchronization policy and message model. Our experiments reveal some interesting phenomena. For example, we observed that the vertex-cut method outperforms the edge-cut method on neighbor-based algorithms but leads to inefficiency for non-neighbor-based algorithms. Furthermore, we observed that using asynchronous update can reduce the total workload by 20% to 30%, but the processing time may still double due to fine-grained lock conflicts. Overall, we analyzed the pros and cons of each option for the five key issues. We believe our findings will help end-users choose a suitable framework, and designers improve current ones.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117275214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A practical framework for real-time diffusion analysis in social media","authors":"Miki Enoki, Issei Yoshida, M. Oguchi","doi":"10.1145/2742854.2742899","DOIUrl":"https://doi.org/10.1145/2742854.2742899","url":null,"abstract":"In a microblogging service such as Twitter, timely knowledge about what kinds of information are diffusing in social media is quite important for companies. It is also effective to identify the influential users who are retweeted frequently by many users. We are now developing an information diffusion analysis system that enables real-time analysis of streaming social data. However, streaming data is usually divided into segments called windows. The window size is decided by the amount of data or a length of time. This means that we have to use fragmented diffusion data for our diffusion analysis. We propose a customized time-window model by effectively estimating diffusion extinction, which enables an early decision to remove stale data from the in-memory data store. We evaluate our implementation in terms of both the efficiency of query processing and the effectiveness of our time-window model.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115136191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A significance-driven programming framework for energy-constrained approximate computing","authors":"V. Vassiliadis, Charalampos Chalios, K. Parasyris, C. Antonopoulos, S. Lalis, Nikolaos Bellas, H. Vandierendonck, Dimitrios S. Nikolopoulos","doi":"10.1145/2742854.2742857","DOIUrl":"https://doi.org/10.1145/2742854.2742857","url":null,"abstract":"Approximate execution is a viable technique for energy-constrained environments, provided that applications have the mechanisms to produce outputs of the highest possible quality within the given energy budget. We introduce a framework for energy-constrained execution with controlled and graceful quality loss. A simple programming model allows users to express the relative importance of computations for the quality of the end result, as well as minimum quality requirements. The significance-aware runtime system uses an application-specific analytical energy model to identify the degree of concurrency and approximation that maximizes quality while meeting user-specified energy constraints. Evaluation on a dual-socket 8-core server shows that the proposed framework predicts the optimal configuration with high accuracy, enabling energy-constrained executions that result in significantly higher quality compared to loop perforation, a compiler approximation technique.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116833893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cooperative repair based on tree structure for multiple failures in distributed storage systems with regenerating codes","authors":"Xiaoqiang Pei, Yijie Wang, Xingkong Ma, Yongquan Fu, Fangliang Xu","doi":"10.1145/2742854.2742869","DOIUrl":"https://doi.org/10.1145/2742854.2742869","url":null,"abstract":"Regenerating codes have been proposed to achieve an optimal trade-off curve between the amount of storage space and the network traffic for repair. However, existing repair schemes based on regenerating codes are inadequate to meet the requirements of small network traffic cost and high efficiency when repairing multiple failures. In this paper, we propose a cooperative repair scheme based on tree structure for multiple failures with regenerating codes, called CTREE. For generality, we propose a two-layer repair framework to support repairs for both single and multiple failures. For high repair efficiency, a parallel tree-structured data transmission technique is proposed to organize the data transmissions between the providers and newcomers. For small network traffic cost, a core-based data exchange technique is proposed to organize the data exchanges between the coordinator and the other newcomers. To evaluate the performance of CTREE, we conduct experiments on both 30 physical and 200 virtual servers. Numerical analysis and extensive experiments confirm that CTREE can support both single and multiple failure repairs, significantly reduce the network traffic cost, and improve the repair efficiency compared with the state-of-the-art approaches under various parameter settings.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131183291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HOSA: hybrid optical switch architecture for data center networks","authors":"Muhammad Imran, M. Collier, P. Landais, K. Katrinis","doi":"10.1145/2742854.2742877","DOIUrl":"https://doi.org/10.1145/2742854.2742877","url":null,"abstract":"Optical interconnect is a fundamental requisite to realize Internet-scale data centers due to capabilities and benefits of optical devices. Optical interconnects are energy efficient and offer massive bandwidth support. State-of-the-art interconnects can be divided into three types based on the optical technology used: 1) micro-electromechanical system (MEMS) optical cross connects (OXCs), 2) arrayed waveguide grating routers (AWGRs) and 3) semiconductor optical amplifiers (SOAs). MEMS switches are based on mature technology, have low insertion loss and cross-talk, and are data rate independent. They are also the most scalable and the cheapest class of optical switches. However, the reconfiguration time of these switches is on the order of tens of milliseconds. An AWGR switch is a passive device and works in conjunction with tunable wavelength converters (TWCs) or tunable lasers (TLs) while an SOA works as a gate element that manipulates light and also compensates for losses that occur during transmission of optical signals. AWGR and SOA switches have switching time in the range of nanoseconds but they are expensive as compared to MEMS. In this paper, we propose a novel all optical core interconnection scheme that utilizes the potential of both slow and fast optical switches. The core idea is to route traffic through either the slow or the fast optical switch so that minimum end-to-end latency is achieved. Our architecture employs a single stage topology which allows our design to be both incrementally scaled up (in capacity) and scaled out (in the number of racks) without requiring major re-cabling and network reconfiguration. We evaluate performance of the system using simulation and investigate a trade-off between cost and power consumption by comparing it with other well known interconnects. Our technique demonstrates a considerable improvement in power consumption and achieves low latency with high throughput.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126844965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Achieving high throughput and low delay in mobile data networks by accurately predicting queue lengths","authors":"Ke Liu, Jack Y. B. Lee","doi":"10.1145/2742854.2742875","DOIUrl":"https://doi.org/10.1145/2742854.2742875","url":null,"abstract":"Knowledge of the queue length for a radio link in a mobile data network has a significant effect on the performance of the communication protocol TCP. If the queue length can be accurately estimated and regulated to a target value, then low end-to-end delay and high bandwidth utilization can be achieved. One method for estimating and regulating the queue length is the queue-length-based congestion control (QCC) algorithm. However, this algorithm estimates the queue length over one RTT interval prior to transmission, and the actual queue length after that time can differ significantly, because the bandwidth can vary substantially between the neighboring propagation delays, which could result in a false positive in the queue length adaption, thereby affecting the QoS performance. To address this problem, we propose PQ-TCP, a method that predicts the queue length directly by predicting the bandwidth variations over the ensuing period of time equal to the propagation delay and using post-bandwidth analysis to minimize the prediction error. Trace-driven simulations are used to show that the QoS performance of PQ-TCP is superior to that of current QCC algorithms. PQ-TCP achieves the lowest RTT while maintaining nearly 90% bandwidth utilization for a small target queue length of 5 packets.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129495958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated power gating methodology for dataflow-based reconfigurable systems","authors":"Tiziana Fanni, Carlo Sau, L. Raffo, F. Palumbo","doi":"10.1145/2742854.2747285","DOIUrl":"https://doi.org/10.1145/2742854.2747285","url":null,"abstract":"Modern embedded systems designers are required to implement efficient multi-functional applications over portable platforms under strong energy and resource constraints. Automatic tools may help them tackle such a complex scenario: developing complex reconfigurable systems while reducing time-to-market. At the same time, automated methodologies can help them manage power consumption. Dataflow models of computation, thanks to their modularity, have turned out to be extremely useful for these purposes. In this paper, we demonstrate how they can be used to automatically achieve power management from the earliest stage of the design flow. In particular, we focus on the automation of power gating. The methodology has been evaluated on an image processing use case targeting an ASIC 90 nm CMOS technology.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130236779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SparkBench: a comprehensive benchmarking suite for in memory data analytic platform Spark","authors":"Min Li, Jian Tan, Yandong Wang, Li Zhang, V. Salapura","doi":"10.1145/2742854.2747283","DOIUrl":"https://doi.org/10.1145/2742854.2747283","url":null,"abstract":"Spark has been increasingly adopted by industries in recent years for big data analysis by providing a fault-tolerant, scalable and easy-to-use in-memory abstraction. Moreover, the community has been actively developing a rich ecosystem around Spark, making it even more attractive. However, there is not yet a Spark-specific benchmark in the literature to guide the development and cluster deployment of Spark to better fit the resource demands of user applications. In this paper, we present SparkBench, a Spark-specific benchmarking suite, which includes a comprehensive set of applications. SparkBench covers four main categories of applications, including machine learning, graph computation, SQL query and streaming applications. We also characterize the resource consumption, data flow and timing information of each application and evaluate the performance impact of a key configuration parameter to guide the design and optimization of the Spark data analytics platform.","PeriodicalId":417279,"journal":{"name":"Proceedings of the 12th ACM International Conference on Computing Frontiers","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130826400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}