{"title":"A Study of FPGA Virtualization and Accelerator Scheduling","authors":"Qian Zhao, M. Iida, T. Sueyoshi","doi":"10.1145/3129457.3129503","DOIUrl":"https://doi.org/10.1145/3129457.3129503","url":null,"abstract":"Deploying field-programmable gate arrays (FPGAs) in the cloud to accelerate the processing of explosively growing server workloads is becoming a clear trend today. However, reducing the cost of accelerator design and deployment is still difficult with conventional development methods and tools. In our previous work, we proposed the hCODE platform to simplify the design, sharing, and deployment of FPGA accelerators; it adopted a shell-and-IP design pattern and provided supporting tools to improve the reusability and portability of accelerator designs. In this paper, building on that work, we propose new design methods and tools for FPGA virtualization and scheduling that allow IPs to be implemented at cluster scale at low cost. With the proposed platform, users can easily deploy multiple accelerators on one FPGA to improve the utilization of on-chip resources and communication bandwidth.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127025199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Customized Architecture Technology for High Performance Computing","authors":"Jingfei Jiang","doi":"10.1145/3129457.3129500","DOIUrl":"https://doi.org/10.1145/3129457.3129500","url":null,"abstract":"Customized architecture is one of the technical roads to exascale high-performance computing. We give an overview of FPGA customized architecture. Research experience with deep learning algorithm accelerators for data analysis, footprint and cipher algorithm accelerators for information processing, and matrix processing algorithm accelerators for scientific computing will be discussed.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132878478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Programming FPGAs Using OpenCL from Performance Model to Application Study","authors":"Yun Liang","doi":"10.1145/3129457.3129502","DOIUrl":"https://doi.org/10.1145/3129457.3129502","url":null,"abstract":"Recent adoption of OpenCL programming model by FPGA vendors has realized the function portability of OpenCL workloads on FPGA. However, the poor performance portability prevents its wide adoption. To harness the power of FPGAs using OpenCL programming model, it is advantageous to design an analytical performance model to estimate the performance of OpenCL workloads on FPGAs and provide insights into the performance bottlenecks of OpenCL model on FPGA architecture. In the first part of the talk, we present FlexCL, an analytical performance model for OpenCL workloads on flexible FPGAs. FlexCL estimates the overall performance by tightly coupling the on chip global memory and on-chip computation models based on the communication mode. Then, we present an application study of mapping stencil applications onto FPGAs using OpenCL programming model.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129824689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed SAR Image Change Detection with OpenCL-Enabled Spark","authors":"Huming Zhu, J. Kou, Linyan Qiu, Yuqi Guo, Mingwei Niu, Maoguo Gong, L. Jiao","doi":"10.1145/3129457.3129495","DOIUrl":"https://doi.org/10.1145/3129457.3129495","url":null,"abstract":"Distributed processing frameworks have been widely used in the remote-sensing field. Spark, a popular distributed computing framework, has been used to process big remote sensing data. However, it is inefficient because the application is not only data intensive but also computation intensive. For example, in Synthetic Aperture Radar (SAR) image change detection, clustering analysis can consume a lot of computing time and memory when dealing with big remote sensing data. Coprocessors (GPU, MIC, etc.) have high compute power and are able to handle computation-intensive tasks. In this paper, we propose an OpenCL-enabled Spark framework to accelerate the Kernel Fuzzy C-Mean (KFCM) algorithm for SAR image change detection. The computation-intensive operations of KFCM are transferred to the coprocessors of the cluster through the proposed OpenCL-enabled Spark framework. The experimental results on real SAR images indicate that the implementation on OpenCL-enabled Spark is efficient and scalable.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121325273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TCS: FaaS (FPGA as a service)","authors":"Jianlin Gao","doi":"10.1145/3129457.3129499","DOIUrl":"https://doi.org/10.1145/3129457.3129499","url":null,"abstract":"This presentation first points out the dilemma of the traditional FPGA industry, then argues that flexible and easy-to-use cloud services are a feasible way to solve the difficulties of FPGA. Tencent's architecture tries to solve the puzzle of FPGA cloud service auto-generation using the idea of API as a service. To achieve this goal, Tencent released an HDK, an SDK, and the Tencent Computing Service (TCS) platform to help developers automatically convert their APIs to cloud services.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114270906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DoCE: Direct Extension of On-Chip Interconnects over Converged Ethernet for Rack-Scale Memory Sharing","authors":"Yisong Chang, Ran Zhao, Lei Yu, Ke Zhang","doi":"10.1145/3129457.3129504","DOIUrl":"https://doi.org/10.1145/3129457.3129504","url":null,"abstract":"Novel rack-level interconnects are urgently required to support frequent inter-server communications in emerging large-scale distributed in-memory applications. In this paper, we introduce DoCE, a memory semantic fabric via Direct extension of on-chip interconnect (DEOI) over Converged Ethernet. Based on the architectural support for fine-grained remote memory sharing, DoCE provides a 9.6x speedup for distributed implementation of PageRank algorithm on our dual-node ARM SoC-FPGA prototype versus a conventional TCP/IP based solution. To the best of our knowledge, DoCE is the first implementation and prototype for memory semantic fabric via existing Ethernet infrastructure in ARM ecosystem.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126977394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Slow or Down?: Seem to Be the Same for Cloud Users","authors":"Laiping Zhao, Xiaobo Zhou","doi":"10.1145/3129457.3129496","DOIUrl":"https://doi.org/10.1145/3129457.3129496","url":null,"abstract":"Recent years have seen rapid growth in the cloud computing market. A massive number of enterprise applications, such as social networking, e-commerce, video streaming, email, web search, MapReduce, and Spark, are moving to cloud systems. These applications often require tens or hundreds of tasks or micro-services to complete, and need to deal with billions of visits per day while handling unprecedented volumes of data. At the same time, these applications need to deliver quick and predictable response times to their users. However, performance predictability has always been one of the biggest challenges in cloud computing. Despite many optimizations and improvements in both hardware and software, the distribution of latencies for Google's back-end services shows that while the majority of requests take around 50-60 ms, a significant fraction of requests take longer than 100 ms, with the largest difference being almost 600 times [10]. This great variance impacts the quality of experience (QoE) for users and directly leads to revenue losses as well as increases in operational costs. Google's study shows that if the response time increases from 0.4 second to 0.9 second, traffic and ad revenues drop by 20% [1]. Amazon also reports that every 100 ms increase in response time reduces sales by 1% [4]. According to Nielsen [14], (i) 0.1 second is about the limit for having the user feel that the system is reacting instantaneously. (ii) 1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. (iii) 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish.\nIn this sense, \"slow response\" and \"service unavailable\" seem to be the same for cloud users. Currently, major cloud providers like Amazon, Microsoft, and Google merely state an uptime availability guarantee in their Service Level Agreements (SLAs), but never provide a guarantee on QoE (e.g., response time). Since traditional availability is defined based on the failure/repair behaviors of cloud services, it clearly cannot satisfy users' requirements for quick response times. The reason is that the complex and diverse uncertainties in cloud systems make performance predictability very difficult. In general, these uncertainties have two main characteristics: • Diversity: Uncertainties in cloud systems come from many diverse sources, including the hardware layer (e.g., failures, system resource competition, network resource competition) and the software layer (e.g., scheduling algorithm, software bugs, unexpected workload, loss of data) [9]. • Transmissibility: The uncertainties may not only affect a single service, but also degrade the performance of a chain of services or other co-located applications. For example, the loss of a piece of intermediate data would require the re-generation of data from its parent ta","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129859808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building the Reconfigurable Cloud Ecosystem","authors":"P. Chow","doi":"10.1145/3129457.3129501","DOIUrl":"https://doi.org/10.1145/3129457.3129501","url":null,"abstract":"Microsoft has clearly made the case for using FPGAs at scale in the cloud and Intel is committed to leveraging the benefits of hardware acceleration with their acquisition of Altera. However, we still cannot use FPGAs with the same ease we have with software-based systems, let alone do it easily at scale in the cloud. High-level synthesis is necessary for making FPGAs accessible, but it is not sufficient. Making FPGAs easy to use for computation requires more than developing accessible tools for creating hardware targeted for FPGAs. The software computing world has a lot of taken-for-granted, sometimes invisible and good open source infrastructure that is missing for using FPGAs as computing devices. The problem is compounded when we want to use FPGAs at the scale of the cloud. I will present the need for some common infrastructure and abstraction layers to support the use of FPGAs for computing at scale, and describe relevant work at the University of Toronto that can contribute towards the development of an open source framework for the use and deployment of FPGAs at scale.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130948150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking the SDN Abstraction","authors":"Chengchen Hu","doi":"10.1145/3129457.3129498","DOIUrl":"https://doi.org/10.1145/3129457.3129498","url":null,"abstract":"Software Defined Networking (SDN) greatly simplifies network management and introduces unprecedented flexibility by decoupling control functions from the network data plane. However, such a decoupling also opens a box of various open questions, which are not well addressed, e.g., scalability issues and security concerns. This talk firstly describes the background of SDN and the abstraction that SDN is possessing now, and secondly presents scalability/security problems and our on-going research progress. In addition, the promising directions will also be discussed in the talk.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114684097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anomaly Detection in Clouds: Challenges and Practice","authors":"Kejiang Ye","doi":"10.1145/3129457.3129497","DOIUrl":"https://doi.org/10.1145/3129457.3129497","url":null,"abstract":"Cloud computing is an important infrastructure for many enterprises. After 10 years of development, cloud computing has achieved great success and has greatly changed the economy, society, science, and industry. In particular, with the rapid development of mobile Internet and big data technology, almost all online services and data services are built on top of cloud computing, such as the online banking services provided by banks, the electronic services provided by the news media, the government cloud information systems provided by government departments, and the mobile services provided by communications companies. Besides, tens of thousands of start-ups rely on the provision of cloud computing services. Therefore, ensuring cloud reliability is essential. However, the reality is that current cloud systems are not reliable enough. On February 28th 2017, Amazon Web Services, the popular storage and hosting platform used by a huge range of companies, experienced an S3 service interruption for 4 hours in the Northern Virginia (US-EAST-1) Region, which then quickly spread to other online service providers that rely on the S3 service [2]. This failure caused a huge economic loss, because cloud computing service providers typically set a Service Level Agreement (SLA) with customers. For example, when customers require 99.99% availability, the service must meet the requirement 99.99% of the time, 365 days per year. If the service is unavailable for more than 0.01% of the time, compensation is required. In fact, with the continuous development and maturity of cloud computing, a large number of traditional business systems have been deployed on cloud platforms.\nCloud computing integrates existing hardware resources through virtualization technology to create a shared resource pool that enables applications to obtain computing, storage, and network resources on demand, effectively enhancing the scalability and resource utilization of traditional IT infrastructures and significantly reducing the operating cost of traditional business systems. However, with the growing number of applications running on the cloud, the scale of cloud data centers has been expanding and current cloud computing systems have become very complex, mainly reflected in: 1) Large scale. A typical data center involves more than 100,000 servers and 10,000 switches; more nodes usually mean a higher probability of failure; 2) Complex application structure. Web search, e-commerce, and other typical cloud programs have complex interactive behavior. For example, an Amazon page request involves interaction with hundreds of components [7], and an error in any one component will lead to anomalies in the whole application; 3) Shared resource pattern. One of the basic features of cloud computing is resource sharing: a typical server in a Google Cloud data center hosts 5 to 18 applications simultaneously, and each server runs about 10.69 applications on average [5]. Resource competition will interfer","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124535373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}