Comparative performance analysis of a Big Data NORA problem on a variety of architectures

P. Kogge, D. Bayliss
2013 International Conference on Collaboration Technologies and Systems (CTS), published 2013-05-20. DOI: 10.1109/CTS.2013.6567199 (https://doi.org/10.1109/CTS.2013.6567199)
Citations: 8

Abstract

Non-Obvious Relationship Analysis (NORA) is one of the most stressing classes of Big Data analytics problems. This paper proposes a reference NORA problem that is representative of real problems and can rationally scale to very large sizes. It then develops a highly concurrent implementation that can run on large systems. Each step of this implementation is sized in terms of how much of each of four resources (CPU, memory, disk, and network) it might use. From this sizing, a parameterized model that projects both execution time and resource utilization is used to identify the “tall poles” in performance. The parameters are then modified to represent several different target systems, from a large cluster typical of today to variations of an advanced architecture in which processing has been moved into memory. Finally, a “thought experiment” uses this model to discover the parameters of a system that would provide both a near-100X speedup and a balanced design in which no resource is badly over- or underutilized.
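
The kind of parameterized model the abstract describes can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the step names, the demand and rate numbers, the `project` function, and the simplifying rule that a step's time is set by its slowest resource (that is, perfect overlap of the four resources within a step).

```python
from dataclasses import dataclass

RESOURCES = ("cpu", "memory", "disk", "network")


@dataclass
class Step:
    """One step of the implementation, sized per resource (hypothetical units)."""
    name: str
    demand: dict  # resource name -> amount of work (operations or bytes)


def project(steps, rates):
    """Project total time, per-resource utilization, and the tall pole per step.

    Assumption: within a step all resources overlap perfectly, so the step
    takes as long as its most heavily loaded resource.
    """
    busy = {r: 0.0 for r in RESOURCES}
    poles = {}
    total = 0.0
    for step in steps:
        # Time each resource would need if it were the only constraint.
        times = {r: step.demand.get(r, 0.0) / rates[r] for r in RESOURCES}
        pole = max(times, key=times.get)  # the "tall pole" for this step
        poles[step.name] = pole
        total += times[pole]
        for r in RESOURCES:
            busy[r] += times[r]
    utilization = {r: busy[r] / total for r in RESOURCES}
    return total, utilization, poles


# Made-up sizing for two steps and made-up system rates (units per second).
steps = [
    Step("scan", {"cpu": 2e9, "memory": 8e9, "disk": 4e9, "network": 0.0}),
    Step("shuffle", {"cpu": 1e9, "memory": 4e9, "disk": 0.0, "network": 2e9}),
]
rates = {"cpu": 1e9, "memory": 1e10, "disk": 1e8, "network": 1e9}

total, utilization, poles = project(steps, rates)
print(f"projected time: {total:.1f} s")
for r in RESOURCES:
    print(f"{r:8s} utilization: {utilization[r]:.2f}")
print("tall poles:", poles)
```

Changing the entries of `rates` plays the role of retargeting the model to a different system: raising the rate of whichever resource is the tall pole shortens the projected time until another resource takes over, which is one way to search for a balanced design.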