A Benchmark Suite for Unstructured Data Processing
C. Smullen, S. Tarapore, S. Gurumurthi
Fourth International Workshop on Storage Network Architecture and Parallel I/Os (SNAPI 2007), 2007-09-24
DOI: 10.1109/SNAPI.2007.8 (https://doi.org/10.1109/SNAPI.2007.8)
Citations: 11
Abstract
A large fraction of the data that will be stored and accessed in future systems is expected to be unstructured, in the form of images, audio files, etc. It is therefore important to design future I/O subsystems that provide efficient storage of, and access to, these vast and continuously growing repositories of unstructured data. To facilitate system design and evaluation, we first need benchmarks that capture the processing and I/O access characteristics of applications that operate on unstructured data. In this paper, we present an unstructured data processing benchmark suite that we have developed. We provide detailed descriptions of the workloads in the benchmark suite and discuss the larger space of application characteristics that each of them captures.