{"title":"Experimentation as a Tool for the Performance Evaluation of Big Data Systems","authors":"A. Apon","doi":"10.1145/2694730.2694734","journal":{"name":"Proceedings of the 1st Workshop on Performance Analysis of Big Data Systems"},"publicationDate":"2015-02-01","publicationTypes":"Journal Article","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1145/2694730.2694734"}
Experimentation as a Tool for the Performance Evaluation of Big Data Systems
The complex big data systems of today are difficult, if not impossible, to model analytically. The challenges of these distributed and parallel data processing systems include heterogeneous network communication, a mix of storage, memory, and computing devices, and common failures of communication and devices. Particular challenges with big data systems include the variety and volume of data that place previously unseen stresses on distributed computing systems. Experimentation using production-quality hardware and software and realistic data is required to understand system tradeoffs. At the same time, experimental evaluation has challenges, including access to hardware resources at scale, robust workload characterization, data characterization, configuration management of software and systems, and sometimes insidious optimization issues around the mix of software stacks or hardware/software resource allocation. In this talk we present a number of the research challenges when experimentation is used as a tool for the performance evaluation of big data systems, some approaches to solutions, and open questions for this area.