Benchmarking data analysis and machine learning applications on the Intel KNL many-core processor

C. Byun, J. Kepner, W. Arcand, David Bestor, Bill Bergeron, V. Gadepally, Michael Houle, M. Hubbell, Michael Jones, Anna Klein, P. Michaleas, Lauren Milechin, J. Mullen, Andrew Prout, Antonio Rosa, S. Samsi, Charles Yee, A. Reuther
Published in: 2017 IEEE High Performance Extreme Computing Conference (HPEC), 2017-07-12
DOI: 10.1109/HPEC.2017.8091067 (https://doi.org/10.1109/HPEC.2017.8091067)
Citations: 15

Abstract

Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users run data analysis applications such as MATLAB and Octave. More recently, machine learning applications, such as the UC Berkeley Caffe deep learning framework, have become increasingly important to LLSC users. Thus, the performance of these applications on KNL systems is of high interest to LLSC users and to the broader data analysis and machine learning communities. Our data analysis benchmarks of these applications on the Intel KNL processor indicate that single-core double-precision generalized matrix multiply (DGEMM) performance on KNL systems has improved by ∼3.5× compared to prior Intel Xeon technologies. Our data analysis applications also achieved ∼60% of the theoretical peak performance. In addition, a performance comparison of a machine learning application, Caffe, between two Intel CPUs, the Xeon E5 v3 and the Xeon Phi 7210, demonstrated a 2.7× improvement on a KNL node.
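The abstract reports DGEMM results as a fraction of theoretical peak. The arithmetic behind such a figure can be sketched as follows. This is a minimal illustration, not the authors' benchmark code: the helper names are hypothetical, and the hardware parameters in the usage section (64 cores, 1.3 GHz, 32 double-precision FLOPs per cycle per core) are commonly cited figures for the Xeon Phi 7210 that do not appear in the abstract and should be treated as assumptions.

```python
import time


def dgemm_flops(m, n, k):
    """FLOP count for C = A @ B with A (m x k) and B (k x n).

    Uses the usual convention of one multiply and one add per
    inner-product term, i.e. 2*m*n*k operations.
    """
    return 2 * m * n * k


def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOP/s: cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


def achieved_gflops(flops, seconds):
    """Measured throughput in GFLOP/s for a run that took `seconds`."""
    return flops / seconds / 1e9


def fraction_of_peak(achieved, peak):
    """Ratio of measured throughput to theoretical peak (0.6 means 60%)."""
    return achieved / peak


def time_call(fn, repeats=3):
    """Best-of-N wall-clock time for a zero-argument callable.

    `fn` would wrap the matrix multiply under test (for example a call
    into an optimized BLAS); it is left abstract here.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Assumed hardware parameters, NOT taken from the abstract.
    peak = peak_gflops(cores=64, clock_ghz=1.3, flops_per_cycle=32)
    print(f"assumed theoretical peak: {peak:.1f} GFLOP/s")

    # Throughput that would correspond to the ~60% figure under these assumptions.
    print(f"60% of assumed peak: {0.6 * peak:.1f} GFLOP/s")
```

To use the sketch with a real measurement, pass a callable that performs the multiply to `time_call`, then feed `dgemm_flops(m, n, k)` and the measured time to `achieved_gflops`. Sustained clock rates under heavy vector load can differ from the nominal clock, so a peak computed this way is an upper bound under the stated assumptions.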