Performance and memory-access characterization of data mining applications

J. P. Bradford, J. Fortes
{"title":"Performance and memory-access characterization of data mining applications","authors":"J. P. Bradford, J. Fortes","doi":"10.1109/WWC.1998.809358","DOIUrl":null,"url":null,"abstract":"Characterizes the performance and memory-access behavior of a decision tree induction program, a previously unstudied application used in data mining and knowledge discovery in databases. Performance is studied via RSIM, an execution-driven simulator, for three uniprocessor models that exploit instruction-level parallelism to varying degrees. Several properties of the program are noted. Out-of-order dispatch and multiple issue provide a significant performance advantage: 50%-250% improvement in inter-processor communication (IPC) for out-of-order dispatch vs. in-order dispatch, and 5%-120% improvement in IPC for four-way issue vs. single issue. Multiple issue provides a greater performance improvement for larger L2 cache sizes, when the program is limited by CPU performance; out-of-order dispatch provides a greater performance improvement for smaller L2 cache sizes. The program has a very small instruction footprint: for an 8-kB L1 instruction cache, the instruction miss rate is below 0.1%. A small (8 kB) L1 data cache is sufficient to capture most of the locality of the data references, resulting in L1 miss rates between 10%-20%. Increasing the size of the L2 data cache does not significantly improve performance until a significant fraction (over 1/4) of the data set fits into the L2 cache. Lastly, a procedure is developed for scaling the cache sizes when using scaled-down data sets, allowing the results for smaller data sets to be used to predict the performance of full-sized data sets.","PeriodicalId":190931,"journal":{"name":"Workload Characterization: Methodology and Case Studies. Based on the First Workshop on Workload Characterization","volume":"87 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Workload Characterization: Methodology and Case Studies. Based on the First Workshop on Workload Characterization","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WWC.1998.809358","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 29

Abstract

This paper characterizes the performance and memory-access behavior of a decision-tree induction program, a previously unstudied application used in data mining and knowledge discovery in databases. Performance is studied with RSIM, an execution-driven simulator, for three uniprocessor models that exploit instruction-level parallelism to varying degrees. Several properties of the program are noted. Out-of-order dispatch and multiple issue provide a significant performance advantage: a 50%-250% improvement in instructions per cycle (IPC) for out-of-order dispatch vs. in-order dispatch, and a 5%-120% improvement in IPC for four-way issue vs. single issue. Multiple issue provides a greater performance improvement for larger L2 cache sizes, when the program is limited by CPU performance; out-of-order dispatch provides a greater improvement for smaller L2 cache sizes. The program has a very small instruction footprint: for an 8-kB L1 instruction cache, the instruction miss rate is below 0.1%. A small (8-kB) L1 data cache is sufficient to capture most of the locality of the data references, yielding L1 miss rates between 10% and 20%. Increasing the size of the L2 data cache does not significantly improve performance until a significant fraction (over 1/4) of the data set fits in the L2 cache. Lastly, a procedure is developed for scaling the cache sizes when using scaled-down data sets, allowing results for smaller data sets to be used to predict the performance of full-sized data sets.
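The abstract mentions a procedure for scaling cache sizes when working with scaled-down data sets but does not spell it out. The sketch below is a minimal illustration, assuming the procedure keeps the ratio of L2 cache size to data-set size constant so that the "over 1/4 of the data set fits in L2" behavior is preserved at smaller scale; the function name `scaled_l2_size` and its parameters are illustrative, not taken from the paper.

```python
# Illustrative sketch (not the paper's exact procedure): preserve the
# L2-cache-size : data-set-size ratio when simulating a scaled-down data set,
# so miss-rate behavior observed at small scale predicts the full-size run.

def scaled_l2_size(full_l2_bytes: int, full_dataset_bytes: int,
                   scaled_dataset_bytes: int) -> int:
    """Return an L2 size for the scaled-down run that keeps the
    cache-size to data-set-size ratio of the full configuration."""
    ratio = full_l2_bytes / full_dataset_bytes
    return int(ratio * scaled_dataset_bytes)

# Example: a 4 MB L2 with a 64 MB data set corresponds to a 512 kB L2
# when simulating an 8 MB scaled-down data set.
if __name__ == "__main__":
    print(scaled_l2_size(4 * 2**20, 64 * 2**20, 8 * 2**20))  # 524288 bytes
```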