An adaptive recalibrative contextual squeeze-and-excitation self-attention V-Net for kidney tumor segmentation in RCC imaging
Authors: C. Pabitha, S. Benila, B. Vanathi
Journal: The European Physical Journal Plus, volume 140 (8), published 2025-08-14
DOI: 10.1140/epjp/s13360-025-06686-2
URL: https://link.springer.com/article/10.1140/epjp/s13360-025-06686-2
Impact factor: 2.9; JCR: Q2 (PHYSICS, MULTIDISCIPLINARY)
Citation count: 0
Abstract
Accurate and efficient kidney tumor segmentation in renal cell carcinoma (RCC) imaging is essential for early diagnosis and surgical intervention. However, existing models struggle with class imbalance, small tumor detection, boundary irregularities, and imaging variations across CT protocols, which limits their clinical applicability and generalization. To address these challenges, we propose a segmentation framework called Adaptive Recalibrative Contextual Squeeze-and-Excitation Self-Attention V-Net (ARCSAV-Net), which adds five components to the traditional V-Net architecture to segment kidney tumors in RCC images more effectively. First, Adaptive Recalibrative Contextual Squeeze-and-Excitation (AR-CSE) blocks improve feature prioritization by using radiomic biomarkers such as entropy and vascular features, mitigating the effects of class imbalance and tumor heterogeneity. Second, the Self-Attention V-Net mechanism sharpens boundary definition by suppressing redundant features and focusing on low-contrast and small tumors, which improves segmentation accuracy. Third, Task-Switching Self-Supervision (TSSS) reinforces feature learning by alternating between the primary segmentation task and secondary tasks such as rotation and intensity prediction, which mitigates overfitting and improves model robustness. Fourth, Context-Based Confidence Estimation (CBCT) strengthens uncertain predictions to keep segmentation consistent across varying imaging protocols. Lastly, Bayesian Hyperparameter Optimization (ML-TPE) tunes model parameters with low computational overhead while preserving generalization.
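To make the channel-recalibration idea behind the AR-CSE blocks concrete, here is a minimal NumPy sketch of a generic squeeze-and-excitation step on a 3D feature map. It is not the authors' AR-CSE block: the abstract does not specify how radiomic biomarkers enter the gating, so that part is omitted. The function name `squeeze_excite_3d`, the reduction ratio, and the random weights are all illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def squeeze_excite_3d(features, w1, b1, w2, b2):
    """Generic squeeze-and-excitation on a (C, D, H, W) feature map.

    Hypothetical helper for illustration only; the paper's AR-CSE block
    adds radiomic context that is not described in the abstract.
    """
    # Squeeze: global average pooling over the three spatial axes -> (C,)
    squeezed = features.mean(axis=(1, 2, 3))
    # Excitation: bottleneck MLP (ReLU, then sigmoid) -> one gate per channel
    hidden = np.maximum(0.0, w1 @ squeezed + b1)
    gates = sigmoid(w2 @ hidden + b2)
    # Recalibrate: scale every channel by its gate in (0, 1)
    return features * gates[:, None, None, None], gates


# Toy example with assumed sizes: 8 channels, reduction ratio 4
rng = np.random.default_rng(0)
channels, reduction = 8, 4
features = rng.standard_normal((channels, 4, 6, 6))
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
b1 = np.zeros(channels // reduction)
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1
b2 = np.zeros(channels)

recalibrated, gates = squeeze_excite_3d(features, w1, b1, w2, b2)
print(recalibrated.shape, gates.round(3))
```

In a trained network the weights would be learned, so informative channels receive gates near 1 and redundant ones are suppressed; here the random weights only demonstrate the shapes and data flow.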
Experimental results on the KiTS19 and KiTS21 datasets show that ARCSAV-Net achieves a Dice Similarity Coefficient (DSC) of 0.985, a Volumetric Overlap Error (VOE) of 0.16, and a Mean Surface Distance (MSD) of 0.6 mm, significantly outperforming existing methods in both accuracy and inference speed.
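For readers unfamiliar with the overlap metrics reported above, the following sketch computes DSC and VOE for a pair of binary masks using their standard definitions (DSC = 2|A∩B| / (|A| + |B|), VOE = 1 − |A∩B| / |A∪B|). The abstract does not state how the authors aggregate or scale these metrics, so the function names, the empty-mask conventions, and the toy masks are assumptions for illustration.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


def volumetric_overlap_error(pred, target):
    """Volumetric Overlap Error, i.e. 1 minus the Jaccard index."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 0.0  # assumed convention: two empty masks have no error
    return 1.0 - intersection / union


# Toy 3D masks: the ground truth has 8 voxels, the prediction has 12,
# and all 8 ground-truth voxels are covered by the prediction.
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:3] = True
pred = np.zeros_like(target)
pred[1:3, 1:3, 1:4] = True

dsc = dice_coefficient(pred, target)
voe = volumetric_overlap_error(pred, target)
print(f"DSC = {dsc:.3f}, VOE = {voe:.3f}")
```

For a single mask pair the two metrics are tied together by VOE = 1 − DSC / (2 − DSC), so either one can be derived from the other; Mean Surface Distance is a boundary-based metric and needs a separate surface extraction step that is not shown here.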
About the journal:
The aims of this peer-reviewed online journal are to distribute and archive all relevant material required to document, assess, validate and reconstruct in detail the body of knowledge in the physical and related sciences.
The scope of EPJ Plus encompasses a broad landscape of fields and disciplines in the physical and related sciences - such as covered by the topical EPJ journals and with the explicit addition of geophysics, astrophysics, general relativity and cosmology, mathematical and quantum physics, classical and fluid mechanics, accelerator and medical physics, as well as physics techniques applied to any other topics, including energy, environment and cultural heritage.