PZS-Net: Incorporating of Frame Sequence and Multi-Scale Priors for Prostate Zonal Segmentation in Transrectal Ultrasound

Impact factor: 6.8 · JCR Q1 (Automation & Control Systems)

Advanced intelligent systems (Weinheim an der Bergstrasse, Germany) · Pub Date: 2024-10-29 · DOI: 10.1002/aisy.202400302

Jianguo Ju, Qian Zhang, Pengfei Xu, Tiange Liu, Cheng Li, Ziyu Guan


Citations: 0

Abstract

Transrectal ultrasound (TRUS) videos offer valuable histopathologic information about the prostate. Accurate prostate zonal segmentation in TRUS videos is vital for diagnosing prostate cancer and guiding surgery. However, TRUS videos are recorded manually by urologists, so they share no standardized coordinate system, which limits direct prostate zonal segmentation in these videos. To overcome this limitation, a novel Prostate Zonal Segmentation Network (PZS-Net) is proposed. It is based on U-Net and learns critical cross-frame information and multi-scale features from sequential frames. First, a sequential frame cross-attention (SFCA) module is designed to capture long-range information from sequential frames and thereby enhance the feature representation of the current frame. The SFCA module is embedded at each skip connection layer to extract crucial cross-frame information. Then, a multi-scale fusion (MSF) module that uses three parallel branches with different atrous convolutions is designed. The MSF module is placed at the bottleneck layer to dynamically fuse multi-scale context information from high-level features. Extensive experiments on TRUS image datasets show that PZS-Net achieves higher accuracy in both the transitional zone (Dice coefficient [Dice]: 68.90% ± 1.73%, mean intersection over union [mIoU]: 59.19% ± 2.09%, 95% Hausdorff distance [HD95]: 5.02 ± 0.83 mm) and the peripheral zone (Dice: 63.99% ± 3.16%, mIoU: 54.60% ± 3.35%, HD95: 5.28 ± 1.12 mm). Comprehensive ablation studies demonstrate the effectiveness and competitiveness of its key components.
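
The Dice and IoU figures reported above are overlap measures between a predicted mask and a reference mask. As a minimal sketch of how such per-zone scores can be computed (this is not the authors' evaluation code), the Python below assumes a hypothetical label encoding of 0 for background, 1 for the transitional zone, and 2 for the peripheral zone. The function name `dice_and_iou` and the convention for empty masks are likewise assumptions made for illustration.

```python
from typing import Sequence, Tuple


def dice_and_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    label: int,
) -> Tuple[float, float]:
    """Return (Dice, IoU) for one label in two equally sized 2D label masks."""
    intersection = pred_area = target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred, in_target = p == label, t == label
            intersection += in_pred and in_target  # pixel carries the label in both masks
            pred_area += in_pred
            target_area += in_target
    union = pred_area + target_area - intersection
    if union == 0:
        # Assumed convention: the label is absent from both masks, count as a perfect match.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 3x4 masks using the hypothetical encoding: 0 = background, 1 = TZ, 2 = PZ.
pred = [
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 2],
]
target = [
    [0, 1, 1, 0],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
]

for name, label in (("TZ", 1), ("PZ", 2)):
    dice, iou = dice_and_iou(pred, target, label)
    print(f"{name}: Dice={dice:.3f} IoU={iou:.3f}")
# prints "TZ: Dice=0.857 IoU=0.750" then "PZ: Dice=0.857 IoU=0.750"
```

In this toy case a single pixel is mislabeled, which costs both zones equally: each has an intersection of 3 pixels against areas of 3 and 4. How per-image scores are averaged into a reported mean and standard deviation is a separate choice that this sketch does not cover.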
