Enhancing computational efficiency in 3-D seismic modelling with half-precision floating-point numbers based on the curvilinear grid finite-difference method
{"title":"Enhancing computational efficiency in 3-D seismic modelling with half-precision floating-point numbers based on the curvilinear grid finite-difference method","authors":"Jialiang Wan, Wenqiang Wang, Zhenguo Zhang","doi":"10.1093/gji/ggae235","DOIUrl":null,"url":null,"abstract":"Summary Large-scale and high-resolution seismic modelling are very significant to simulating seismic waves, evaluating earthquake hazards, and advancing exploration seismology. However, achieving high-resolution seismic modelling requires substantial computing and storage resources, resulting in a considerable computational cost. To enhance computational efficiency and performance, recent heterogeneous computing platforms, such as Nvidia Graphics Processing Units (GPUs), natively support half-precision floating-point numbers (FP16). FP16 operations can privide faster calculation speed, lower storage requirements and greater performance enhancement over single-precision floating-point numbers (FP32), thus providing significant benefits for seismic modelling. Nevertheless, the inherent limitation of fewer 16-bit representations in FP16 may lead to severe numerical overflow, underflow, and floating-point errors during computation. In this study, to ensure stable wave equation solutions and minimize the floating-point errors, we employ a scaling strategy to adjust the computation of FP16 arithmetic operations. For optimal GPU floating-point performance, we implement a 2-way single instruction multiple data (SIMD) within the floating-point units (FPUs) of CUDA cores. Moreover, we implement an earthquake simulation solver for FP16 operations based on curvilinear grid finite-difference method (CGFDM) and perform several earthquake simulations. Comparing the results of wavefield data with the standard CGFDM using FP32, the errors introduced by FP16 are minimal, demonstrating excellent consistency with the FP32 results. Performance analysis indicates that FP16 seismic modelling exhibits a remarkable improvement in computational efficiency, achieving a speedup of approximately 1.75 and reducing memory usage by half compared to the FP32 version.","PeriodicalId":12519,"journal":{"name":"Geophysical Journal International","volume":"20 1","pages":""},"PeriodicalIF":2.8000,"publicationDate":"2024-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Geophysical Journal International","FirstCategoryId":"89","ListUrlMain":"https://doi.org/10.1093/gji/ggae235","RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"GEOCHEMISTRY & GEOPHYSICS","Score":null,"Total":0}
引用次数: 0
Abstract
Summary Large-scale and high-resolution seismic modelling are very significant to simulating seismic waves, evaluating earthquake hazards, and advancing exploration seismology. However, achieving high-resolution seismic modelling requires substantial computing and storage resources, resulting in a considerable computational cost. To enhance computational efficiency and performance, recent heterogeneous computing platforms, such as Nvidia Graphics Processing Units (GPUs), natively support half-precision floating-point numbers (FP16). FP16 operations can privide faster calculation speed, lower storage requirements and greater performance enhancement over single-precision floating-point numbers (FP32), thus providing significant benefits for seismic modelling. Nevertheless, the inherent limitation of fewer 16-bit representations in FP16 may lead to severe numerical overflow, underflow, and floating-point errors during computation. In this study, to ensure stable wave equation solutions and minimize the floating-point errors, we employ a scaling strategy to adjust the computation of FP16 arithmetic operations. For optimal GPU floating-point performance, we implement a 2-way single instruction multiple data (SIMD) within the floating-point units (FPUs) of CUDA cores. Moreover, we implement an earthquake simulation solver for FP16 operations based on curvilinear grid finite-difference method (CGFDM) and perform several earthquake simulations. Comparing the results of wavefield data with the standard CGFDM using FP32, the errors introduced by FP16 are minimal, demonstrating excellent consistency with the FP32 results. Performance analysis indicates that FP16 seismic modelling exhibits a remarkable improvement in computational efficiency, achieving a speedup of approximately 1.75 and reducing memory usage by half compared to the FP32 version.
期刊介绍:
Geophysical Journal International publishes top quality research papers, express letters, invited review papers and book reviews on all aspects of theoretical, computational, applied and observational geophysics.