Optimizing inference of segmentation on high-resolution images in MLExchange.

IF 2.7 · Region 3 (Computer Science) · Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Supercomputing · Pub Date: 2025-01-01 · Epub Date: 2025-06-20 · DOI: 10.1007/s11227-025-07413-5

Shizhao Lu, Tanny Chavez, Wiebke Koepp, Guanhua Hao, Petrus H Zwart, Alexander Hexemer

Journal of Supercomputing, 81(9), 1058 (2025). Journal Article. https://doi.org/10.1007/s11227-025-07413-5. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12181103/pdf/

Citations: 0

Abstract

MLExchange is a machine learning (ML) operations platform providing web user interfaces (UIs) for data visualization and analysis pipelines at synchrotron facilities. Among these UIs is the segmentation app, which helps synchrotron users apply ML algorithms to automatically segment high-resolution scientific images with minimal manual annotation effort. In this work, we share code optimizations that significantly speed up the segmentation inference workflow for large data sets. By optimizing the sequence of CPU-GPU data transfers and introducing CPU parallelization to key operations, we improve the per-device, per-image-frame computational efficiency and observe close to a $3\times$ speedup over the run time of the original segmentation inference workflow when utilizing a single GPU. Further adaptations enabling multi-GPU inference yield more than a $40\times$ speedup with 100 GPUs compared to the optimized single-GPU inference workflow. This acceleration of the segmentation inference workflow will give MLExchange users easy access to segmentation results with little wait time.
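The two speedup figures quoted in the abstract can be put in context with a little arithmetic. The sketch below is illustrative only: the helper names `parallel_efficiency` and `combined_speedup` are hypothetical, the inputs are the rounded values from the abstract ("close to 3" and "more than 40"), and it assumes the two speedups compose multiplicatively because the multi-GPU figure is measured against the already optimized single-GPU workflow.

```python
def parallel_efficiency(speedup: float, n_devices: int) -> float:
    """Fraction of ideal linear scaling achieved on n_devices."""
    if n_devices <= 0:
        raise ValueError("n_devices must be positive")
    return speedup / n_devices


def combined_speedup(*stage_speedups: float) -> float:
    """Multiply successive speedups, each measured against the previous stage."""
    result = 1.0
    for s in stage_speedups:
        result *= s
    return result


# Rounded figures taken from the abstract; not exact measurements.
single_gpu_speedup = 3.0   # "close to 3x" over the original workflow
multi_gpu_speedup = 40.0   # "more than 40x" over the optimized single-GPU run
n_gpus = 100

efficiency = parallel_efficiency(multi_gpu_speedup, n_gpus)
overall = combined_speedup(single_gpu_speedup, multi_gpu_speedup)

print(f"parallel efficiency on {n_gpus} GPUs: about {efficiency:.0%}")
print(f"rough overall speedup vs. original workflow: about {overall:.0f}x")
```

Under these assumptions, 100 GPUs deliver roughly 40% of ideal linear scaling, and the end-to-end gain over the original single-GPU workflow is on the order of $3 \times 40 = 120\times$. Both numbers are rough estimates derived from the rounded values above, not results reported by the authors.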

Source journal

Journal of Supercomputing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

6.30

Self-citation rate

12.10%

Articles published

734

Review time

13 months

About the journal: The Journal of Supercomputing publishes papers on the technology, architecture and systems, algorithms, languages and programs, performance measures and methods, and applications of all aspects of supercomputing. Tutorial and survey papers are intended for workers and students in the fields associated with and employing advanced computer systems. The journal also publishes letters to the editor, especially in areas relating to policy, succinct statements of paradoxes, intuitively puzzling results, partial results and real needs. Published theoretical and practical papers are advanced, in-depth treatments describing new developments and new ideas. Each includes an introduction summarizing prior, directly pertinent work that helps the reader appreciate the advances being described.