Melanoma Detection through Combining Reinforcement Learning, Generative Adversarial Network, and Bayesian Optimization

IF 4.9; Zone 2 (Medicine); JCR Q1, Engineering, Biomedical

Biomedical Signal Processing and Control. Pub Date: 2025-10-09. DOI: 10.1016/j.bspc.2025.108668

Jing Yang, Yajie Wan, Su Diao, Osama Alfarraj, Fahad Alblehai, Amr Tolba, Zaffar Ahmed Shaikh, Lip Yee Por, Roohallah Alizadehsani, Yudong Zhang

Volume 112, Article 108668. Full text: https://www.sciencedirect.com/science/article/pii/S1746809425011796

Citations: 0

Abstract

Melanoma, a highly aggressive form of skin cancer, is primarily driven by DNA alterations often linked to environmental factors such as ultraviolet radiation. Addressing the need for improved early detection, this study tackles key limitations of current methods, which frequently employ convolutional neural networks (CNNs) but struggle with feature selection, class imbalance, hyperparameter tuning, and generalizability. Our strategy leverages dilated convolution (DC) layers trained using reinforcement learning (RL). Unlike other RL-based approaches, which handle these challenges in isolation, our method introduces a multi-stage architecture that integrates RL for feature selection and class balancing. Shapley additive explanations (SHAP) guide feature identification, while augmented rewards for underrepresented classes help mitigate data imbalance. Bayesian optimization and Hyperband (BOHB) is used for hyperparameter tuning in a unified training process; it combines the predictive strength of Bayesian optimization with the efficiency of Hyperband, accelerating model tuning. The architecture also includes an online generative adversarial network (GAN) module for dynamic data augmentation that responds to the evolving output of the RL agent. A novel regularization technique stabilizes GAN training and prevents mode collapse. Existing RL methods also face the challenge of balancing exploration and exploitation. In our model, the scope loss function (SLF), integrated with RL, balances the two, thereby ensuring accuracy and generalizability. Collectively, the model jointly tackles four persistent challenges in earlier RL-based approaches: poor exploration–exploitation balance, unstable reward dynamics, static data augmentation, and manual hyperparameter tuning. The model achieved F-measures of 94.3%, 93.7%, and 91.5% on ISIC-2020, HAM10000, and PH2, respectively. This advancement significantly improves early melanoma detection and supports more accurate treatment decisions, contributing valuably to the ongoing effort to combat this lethal cancer.
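The abstract mentions augmented rewards for underrepresented classes but does not give the reward formula. As an illustration only, the following minimal sketch assumes inverse-frequency scaling: a correct or incorrect prediction on a rare class earns a proportionally larger positive or negative reward. The function names `reward_scales` and `reward`, the scaling rule, and the toy labels are all assumptions made here, not the authors' method.

```python
from collections import Counter


def reward_scales(labels):
    """Map each class to max_count / class_count.

    The most common class gets scale 1.0; rarer classes get larger scales.
    (Assumed scaling rule, not taken from the paper.)
    """
    counts = Counter(labels)
    largest = max(counts.values())
    return {cls: largest / n for cls, n in counts.items()}


def reward(true_label, predicted_label, scales):
    """Return +scale for a correct prediction and -scale for a wrong one,
    using the scale of the true class."""
    scale = scales[true_label]
    return scale if predicted_label == true_label else -scale


# Toy labels, made up for illustration: 1 = melanoma, 0 = benign.
labels = [0] * 9 + [1] * 1
scales = reward_scales(labels)

print(scales)                 # per-class reward scale
print(reward(1, 0, scales))   # missed melanoma: large penalty
print(reward(0, 0, scales))   # correct benign: small reward
```

With this rule, an agent that ignores the minority class accumulates large penalties, which is one plausible way a reward signal could counteract class imbalance.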

Source journal

Biomedical Signal Processing and Control (Engineering & Technology; Engineering: Biomedical)

CiteScore: 9.80

Self-citation rate: 13.70%

Articles published: 822

Review time: 4 months

About the journal: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of both engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.