Performance Evaluation of the Radiation-Tolerant NVIDIA Tegra K1 System-on-Chip

Derrek C. Landauer, Tyler M. Lovelly
Published in: 2023 IEEE Space Computing Conference (SCC)
Publication date: 2023-07-01
DOI: 10.1109/SCC57168.2023.00014 (https://doi.org/10.1109/SCC57168.2023.00014)
Citations: 2

Abstract

Radiation-hardened (rad-hard) processors are designed to be reliable in extreme radiation environments, but they typically have lower performance than commercial-off-the-shelf (COTS) processors. For space missions that require more computational performance than rad-hard processors can provide, alternative solutions such as COTS-based systems-on-chips (SoCs) may be considered. One such SoC, the NVIDIA Tegra K1 (TK1), has achieved adequate radiation tolerance for some classes of space missions. Several vendors have developed radiation-tolerant single-board computer solutions targeted primarily for low Earth orbit (LEO) space missions that can utilize COTS-based hardware due to shorter planned lifetimes with lower radiation requirements. With an increased interest in space-based computing using advanced SoCs such as the TK1, a need exists for an improved understanding of its computational capabilities. This research study characterizes the performance of each computational element of the TK1, including the ARM Cortex-A15 MPCore CPU, the NVIDIA Kepler GK20A GPU, and their constituent computational units. Hardware measurements are generated using the SpaceBench benchmarking library on a TK1 development board. Software optimizations are studied for improved parallel performance using OpenMP for CPU multithreading, ARM NEON for single-instruction multiple-data (SIMD) operations, Compute Unified Device Architecture (CUDA) for GPU parallelization, and optimized Basic Linear Algebra Subprograms (BLAS) software libraries. By characterizing the computational performance of the TK1 and demonstrating how to optimize software effectively for each computational unit within the architecture, future designers can better understand how to successfully port their applications to COTS-based SoCs to enable improved capabilities in space systems. 
Experimental outcomes show that both the CPU and GPU achieved high levels of parallel efficiency with the optimizations employed and that the GPU outperformed the CPU for nearly every benchmark, with single-precision floating-point (SPFP) operations achieving the highest performance.
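
The abstract reports parallel efficiency without defining it. A minimal sketch of the conventional definition follows, assuming speedup is serial time divided by parallel time and efficiency is speedup divided by the number of processing units. The function names and the timings in the usage section are hypothetical illustrations; they are not taken from the paper or from SpaceBench.

```python
def speedup(serial_time_s: float, parallel_time_s: float) -> float:
    """Return the speedup S = T_serial / T_parallel."""
    if serial_time_s <= 0 or parallel_time_s <= 0:
        raise ValueError("execution times must be positive")
    return serial_time_s / parallel_time_s


def parallel_efficiency(serial_time_s: float,
                        parallel_time_s: float,
                        num_units: int) -> float:
    """Return the efficiency E = S / N, where N is the number of units used.

    E = 1.0 means ideal linear scaling; lower values indicate overhead
    from synchronization, memory contention, or serial code sections.
    """
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    return speedup(serial_time_s, parallel_time_s) / num_units


if __name__ == "__main__":
    # Hypothetical timings for one benchmark kernel (assumed values,
    # not measurements from the paper).
    t_serial = 8.0    # seconds on a single thread
    t_parallel = 2.5  # seconds with all threads enabled
    threads = 4       # assumed thread count for this illustration

    s = speedup(t_serial, t_parallel)
    e = parallel_efficiency(t_serial, t_parallel, threads)
    print(f"speedup = {s:.2f}x, efficiency = {e:.0%}")
```

The same two functions apply to a GPU comparison if `num_units` is taken as the number of parallel execution units in use, although how those units are counted is a modeling choice rather than something the abstract specifies.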