Re-evaluating compute performance in SBC clusters: HPL benchmarking across generations
Z. Krpić, I. Lukić, M. Habijan, L. Loina
Future Generation Computer Systems, vol. 176, Article 108137. Published 2025-09-24. DOI: 10.1016/j.future.2025.108137
https://www.sciencedirect.com/science/article/pii/S0167739X25004315
Single Board Computer Clusters (SBCCs) are increasingly used as accessible, low-power platforms for parallel and distributed computing, particularly in edge and fog environments. Yet their performance remains underexplored through reproducible, tuned evaluations. This paper presents a benchmarking methodology based on the High Performance Linpack (HPL) benchmark, selected for its use of dense linear algebra kernels common in scientific and machine learning workloads. The evaluation includes HPL parameter tuning, compiler configuration, and comparison of ATLAS vs. OpenBLAS.
We apply the methodology to SBCs spanning a decade of development: Raspberry Pi 1B, 3B, 4B, and 5, Cubieboard 2, Odroid U3, and Odroid-MC1. Results show that software-level tuning without overclocking or hardware modification can yield performance improvements of up to 2.3× over prior reports. A 146× increase in HPL performance between the Pi 1B and Pi 5 illustrates the evolution in computational capability within a stable form factor. OpenBLAS outperforms ATLAS on newer platforms, while ATLAS retains marginal advantages on older boards.
The findings provide a reproducible baseline for SBCC performance evaluation and support their relevance for benchmarking, education, and energy-efficient high-performance workloads in scenarios where conventional clusters are impractical due to cost, size, or power.
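The abstract mentions HPL parameter tuning but does not give the values used. As a rough illustration only, the sketch below applies a widely used rule of thumb for choosing starting values: size the N × N double-precision matrix to fill a fraction of total cluster memory, round N down to a multiple of the block size NB, and pick a process grid P × Q that is as close to square as possible with P ≤ Q. The function names, the 0.8 memory fraction, and the example inputs are assumptions made for this sketch, not the authors' procedure or settings.

```python
import math


def hpl_problem_size(total_mem_bytes: int, nb: int, mem_fraction: float = 0.8) -> int:
    """Return the largest N that is a multiple of nb and whose N x N matrix of
    8-byte doubles fits in mem_fraction of total memory.

    mem_fraction = 0.8 is a common rule of thumb (an assumption here); the
    remainder is left for the OS and MPI buffers.
    """
    usable_doubles = int(total_mem_bytes * mem_fraction) // 8
    n_max = math.isqrt(usable_doubles)   # floor(sqrt(number of doubles))
    return (n_max // nb) * nb            # round down to a multiple of NB


def process_grid(num_procs: int) -> tuple[int, int]:
    """Return the most nearly square (P, Q) with P * Q == num_procs and P <= Q."""
    p = math.isqrt(num_procs)
    while num_procs % p != 0:            # walk down to the nearest divisor
        p -= 1
    return p, num_procs // p


if __name__ == "__main__":
    # Hypothetical cluster: 4 nodes x 4 GiB RAM, 4 MPI ranks per node.
    total_mem = 4 * 4 * 1024**3
    n = hpl_problem_size(total_mem, nb=128)
    p, q = process_grid(16)
    print(f"N={n} NB=128 P={p} Q={q}")
```

These values are only a starting point for an HPL.dat sweep. In practice NB and the grid shape are tuned empirically for each board and BLAS library.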
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.