Authors: Xiaoman Liu
Journal: arXiv - CS - Performance
Publication date: 2024-07-03
DOI: arxiv-2407.03385
URL: https://doi.org/arxiv-2407.03385
Citations: 0
NCPP: Nova CPU Performance Predictor on a Novel Dataset
CPU performance prediction, which involves forecasting the performance scores
of a CPU based on its hardware characteristics during its operation, is a
critical technology for computational system design and resource management in
the big data era. However, this research field currently faces two significant
challenges. First, collecting real-world data is challenging due to the wide
variety of CPU products on the market and the highly specialized nature of
relevant hardware characteristics. The field also lacks a
standard dataset with unified hardware characteristics, wide data coverage, and
comprehensive benchmarks. Second, existing methods based on hardware simulation
models or machine learning exhibit notable shortcomings, such as lengthy
simulation test cycles and low prediction accuracy. To bridge these gaps, we
first collect, preprocess, and standardize historical data from 4th
Generation Intel Xeon Scalable processors across multiple benchmark suites to
create a new dataset, named PerfCastDB. We then design a deep-learning-based
model, the Nova CPU Performance Predictor (NCPP), as the baseline for
this new dataset. The NCPP network is built on a group attention
mechanism. It quantifies the implicit relationships between
hardware characteristics within and across groups and comprehensively models
the impact of various hardware characteristics on CPU performance prediction.
We conduct comparative experiments using the proposed PerfCastDB dataset.
Compared to existing approaches, NCPP achieves superior evaluation results,
demonstrating its effectiveness. Furthermore, we have open-sourced part of the
dataset and the NCPP network code to facilitate subsequent research. The
resources can be accessed at https://github.com/xiaoman-liu/NCPP.
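
The abstract does not spell out how the group attention mechanism is built, so the following is only a minimal sketch of the general idea of attending within groups and then across groups. It is not the NCPP implementation. The group names ("core", "memory"), the toy feature vectors, the identity query/key/value projections, and the mean pooling step are all assumptions made for illustration; a trained model would use learned projections.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Assumption: no learned weights, so queries, keys, and values are
    the input vectors themselves.
    """
    d = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)])
    return out


def mean_pool(vectors):
    """Average a list of equal-length vectors into one summary vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def group_attention(groups):
    """Two-stage attention over grouped feature embeddings.

    groups maps a (hypothetical) group name to a list of feature
    embedding vectors that all share the same dimension.
    """
    # Stage 1: features attend to each other within their own group,
    # then each group is pooled into a single summary vector.
    summaries = [mean_pool(self_attention(vecs)) for vecs in groups.values()]
    # Stage 2: group summaries attend to each other across groups.
    mixed = self_attention(summaries)
    return dict(zip(groups.keys(), mixed))


# Toy input: two hypothetical hardware-feature groups with 2-d embeddings.
groups = {
    "core": [[1.0, 0.0], [0.8, 0.2]],
    "memory": [[0.0, 1.0]],
}
result = group_attention(groups)
```

Here result maps each group name to a vector that mixes information from its own features and from the other groups. In a predictor, such vectors would typically be concatenated and passed to a regression head that outputs the performance score.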