GPU-accelerated cloud computing services and performance evaluation
Authors: Zakery Collins, Gennaro De Luca, Yinong Chen
Journal: Simulation Modelling Practice and Theory, volume 144, Article 103181 (impact factor 3.5; JCR Q2, Computer Science, Interdisciplinary Applications)
Publication date: 2025-07-11
DOI: 10.1016/j.simpat.2025.103181
URL: https://www.sciencedirect.com/science/article/pii/S1569190X25001169
Citations: 0
Abstract
This paper explores the feasibility of replacing traditional CPU-based cloud computing with Graphics Processing Unit (GPU)-accelerated services. Using NVIDIA's CUDA GPU-accelerated C/C++ and Python libraries, we benchmark the performance of GPU computing against multithreaded CPU computing across several domains, including machine learning and large-scale image processing. A novel contribution of this work is an intelligent autoscaling system that maximizes single-GPU resource utilization before scaling to additional GPUs, improving efficiency in cloud-based deployments. Our simulation experiments demonstrate significant performance gains for GPU-accelerated computing and highlight the impact of optimized resource allocation in cloud environments. For example, in a machine learning experiment using a dataset with 8,790 entries, execution on a GeForce 3060 Ti GPU is 3.42 times faster than on a 16-thread CPU. Compared with the same 16-thread CPU, a Tesla K80 GPU is 4.17 times faster. Furthermore, we provide an analysis of GPU performance optimization strategies, including memory management, concurrency techniques, and workload distribution methodologies, offering insights into the long-term scalability and cost-effectiveness of GPU-accelerated cloud infrastructure.
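The abstract does not spell out how the autoscaling system works, so the following is only a minimal sketch of the general idea it names: fill the GPUs that are already provisioned before adding another. The class names (`Gpu`, `FillFirstAutoscaler`), the first-fit placement rule, the integer "work unit" capacity model, and the `max_gpus` limit are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """One simulated GPU with a capacity in abstract work units (hypothetical model)."""
    capacity: int
    load: int = 0

    def free(self) -> int:
        """Remaining headroom on this GPU."""
        return self.capacity - self.load


@dataclass
class FillFirstAutoscaler:
    """Packs jobs onto provisioned GPUs; adds a GPU only when no existing one fits the job."""
    gpu_capacity: int = 100
    max_gpus: int = 4
    gpus: list = field(default_factory=list)

    def place(self, demand: int) -> int:
        """Assign a job with the given demand and return the index of the GPU that took it."""
        if demand > self.gpu_capacity:
            raise ValueError("job does not fit on a single GPU")
        # First fit: reuse the earliest provisioned GPU with enough headroom.
        for index, gpu in enumerate(self.gpus):
            if gpu.free() >= demand:
                gpu.load += demand
                return index
        # Scale out only when every provisioned GPU is too full for this job.
        if len(self.gpus) >= self.max_gpus:
            raise RuntimeError("GPU pool exhausted")
        self.gpus.append(Gpu(capacity=self.gpu_capacity, load=demand))
        return len(self.gpus) - 1

    def utilization(self) -> list:
        """Fraction of capacity in use on each provisioned GPU."""
        return [gpu.load / gpu.capacity for gpu in self.gpus]


# Usage: three jobs saturate GPU 0 before a fourth job triggers a second GPU.
scaler = FillFirstAutoscaler(gpu_capacity=100, max_gpus=2)
placements = [scaler.place(demand) for demand in (40, 30, 30, 20)]
print(placements)            # [0, 0, 0, 1]
print(scaler.utilization())  # [1.0, 0.2]
```

In this toy model the first GPU reaches full utilization before a second one is provisioned. A production autoscaler would measure real signals such as memory use and kernel occupancy rather than a single scalar demand.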
About the journal:
The journal Simulation Modelling Practice and Theory provides a forum for original, high-quality papers dealing with any aspect of systems simulation and modelling.
The journal aims at being a reference and a powerful tool to all those professionally active and/or interested in the methods and applications of simulation. Submitted papers will be peer reviewed and must significantly contribute to modelling and simulation in general or use modelling and simulation in application areas.
Paper submission is solicited on:
• theoretical aspects of modelling and simulation including formal modelling, model-checking, random number generators, sensitivity analysis, variance reduction techniques, experimental design, meta-modelling, methods and algorithms for validation and verification, selection and comparison procedures etc.;
• methodology and application of modelling and simulation in any area, including computer systems, networks, real-time and embedded systems, mobile and intelligent agents, manufacturing and transportation systems, management, engineering, biomedical engineering, economics, ecology and environment, education, transaction handling, etc.;
• simulation languages and environments, including those specific to distributed computing, grid computing, high performance computers or computer networks, etc.;
• distributed and real-time simulation, simulation interoperability;
• tools for high performance computing simulation, including dedicated architectures and parallel computing.