Stochastic Differential Equation Approach as Uncertainty-Aware Feature Recalibration Module in Image Classification

Impact factor: 3 · CAS Zone 4 (Computer Science) · JCR Q2 (Engineering, Electrical & Electronic)
Romen Samuel Wabina, Prut Saowaprut, Junwei Yang, Christine Wagas Pitos
Journal: International Journal of Imaging Systems and Technology, vol. 35, no. 4
Published: 2025-06-06
DOI: 10.1002/ima.70131
URL: https://onlinelibrary.wiley.com/doi/10.1002/ima.70131

Abstract

Despite significant advances in image classification, deep learning models still struggle to discern fine details in images and produce overconfident, imbalanced predictions for certain classes. These models typically employ feature recalibration techniques but do not account for the underlying uncertainty in their predictions, particularly in complex sequential tasks like image classification. These uncertainties can significantly affect the reliability of subsequent analyses and may compromise accuracy across various applications. To address these limitations, we introduce the Stochastic Differential Equation Recalibration Module (SDERM), a novel approach designed to dynamically adjust the channel-wise feature responses in convolutional neural networks. It integrates a stochastic differential equation (SDE) framework into a feature recalibration module to capture the inherent uncertainties in the data and in the model's predictions. To the best of our knowledge, our study is the first to explore the integration of SDE-based feature recalibration modules in image classification. We build SDERM from two interconnected networks: a drift network and a diffusion network. The drift network is the deterministic component. It approximates the predictive function of the model and systematically steers the recalibration of the predictions without considering randomness. Concurrently, the diffusion network uses a Wiener process to capture the inherent uncertainties within the data and the network's predictions. We tested the classification accuracy of SDERM in ResNet50, ResNet101, and ResNet152 against other recalibration modules, including Squeeze-Excitation (SE), Convolutional Block Attention Module (CBAM), Gather and Excite (GE), and Position-Aware Recalibration Module (PARM), as well as the original Bottleneck architecture.
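The drift and diffusion description above can be read as an Euler-Maruyama discretization of an SDE, dz = f(z) dt + g(z) dW, applied to per-channel descriptors. The following is a minimal NumPy sketch under that assumption and is not the authors' implementation: the class name, the layer sizes, the random (untrained) weights, the softplus noise scale, the sigmoid gate, and the step count and step size are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class SDERecalibrationSketch:
    """Hypothetical channel-wise recalibration driven by Euler-Maruyama steps."""

    def __init__(self, channels, hidden, steps=4, dt=0.25, seed=0):
        self.rng = np.random.default_rng(seed)
        self.steps, self.dt = steps, dt
        # Drift network f: a small two-layer MLP (assumed form, untrained weights).
        self.w1 = self.rng.normal(0.0, 0.1, (channels, hidden))
        self.w2 = self.rng.normal(0.0, 0.1, (hidden, channels))
        # Diffusion network g: one linear layer followed by softplus (assumed form).
        self.v = self.rng.normal(0.0, 0.1, (channels, channels))

    def drift(self, z):
        # Deterministic trend of the channel descriptor.
        return np.maximum(z @ self.w1, 0.0) @ self.w2

    def diffusion(self, z):
        # Softplus keeps the noise scale non-negative.
        return np.log1p(np.exp(z @ self.v))

    def __call__(self, x, stochastic=True):
        # x has shape (N, C, H, W); global average pooling gives z of shape (N, C).
        z = x.mean(axis=(2, 3))
        for _ in range(self.steps):
            dz = self.drift(z) * self.dt
            if stochastic:
                # Wiener increment: dW ~ Normal(0, dt), i.e. standard deviation sqrt(dt).
                dw = self.rng.normal(0.0, np.sqrt(self.dt), size=z.shape)
                dz = dz + self.diffusion(z) * dw
            z = z + dz
        gate = sigmoid(z)  # per-channel weights in (0, 1)
        return x * gate[:, :, None, None], gate


x = np.random.default_rng(1).normal(size=(2, 8, 4, 4))
module = SDERecalibrationSketch(channels=8, hidden=4)
y, gate = module(x)
print(y.shape, gate.shape)
```

Under these assumptions, calling the module several times on the same input with `stochastic=True` yields different gates, and the spread of those gates could serve as a rough per-channel uncertainty signal. With `stochastic=False` only the drift term acts and the output is deterministic.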
Public image classification datasets were used, including CIFAR-10, CIFAR-100, SVHN, Fashion-MNIST, and HAM10000, and classification performance was evaluated using the F1 score. The proposed ResNetSDE architecture achieved state-of-the-art F1 scores on four of the five benchmark datasets. On Fashion-MNIST, ResNetSDE attained an F1 score of 0.937 (CI: 0.932–0.941), outperforming all baseline recalibration methods by margins of 0.9%–1.3%. For CIFAR-10 and CIFAR-100, ResNetSDE achieved 0.886 (CI: 0.879–0.892) and 0.962 (CI: 0.958–0.965), respectively, surpassing ResNet-GE and ResNet-CBAM by 3.5% and 1.3%, respectively. ResNetSDE led on SVHN with an F1 of 0.956 (CI: 0.953–0.958), a significant improvement over ResNet-CBAM's 0.948 (CI: 0.945–0.951). While ResNet-CBAM led on the class-imbalanced HAM10000 (0.770, CI: 0.758–0.782), ResNetSDE remained competitive (0.768, CI: 0.749–0.786). Its consistently strong results, evidenced by narrow confidence intervals, support its efficacy as a feature recalibration framework. Our experiments demonstrate that SDERM can outperform existing feature recalibration modules in image classification. Integrating SDERM into ResNet lets the architecture adapt to the stochasticity of each dataset at various depths, which matters in image classification tasks where uncertainty plays a fundamental role.
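The comparisons above rest on F1 point estimates and their confidence intervals. As a rough reading aid, the sketch below checks whether two reported intervals overlap and how wide they are, using only the SVHN and HAM10000 figures quoted in the abstract. The helper names are hypothetical, the confidence level is not stated in the abstract, and interval overlap is only a conservative heuristic, not a formal significance test.

```python
from typing import NamedTuple


class F1Result(NamedTuple):
    f1: float
    lo: float  # lower bound of the reported confidence interval
    hi: float  # upper bound of the reported confidence interval


def intervals_overlap(a: F1Result, b: F1Result) -> bool:
    # Two closed intervals overlap when each one starts before the other ends.
    return a.lo <= b.hi and b.lo <= a.hi


def ci_width(r: F1Result) -> float:
    return r.hi - r.lo


# Figures copied from the abstract.
svhn_sde = F1Result(0.956, 0.953, 0.958)
svhn_cbam = F1Result(0.948, 0.945, 0.951)
ham_sde = F1Result(0.768, 0.749, 0.786)
ham_cbam = F1Result(0.770, 0.758, 0.782)

print("SVHN intervals overlap:", intervals_overlap(svhn_sde, svhn_cbam))
print("HAM10000 intervals overlap:", intervals_overlap(ham_sde, ham_cbam))
print("HAM10000 CI widths:", round(ci_width(ham_sde), 3), round(ci_width(ham_cbam), 3))
```

On these numbers the SVHN intervals are disjoint, while the HAM10000 intervals overlap, which is consistent with describing ResNetSDE as competitive rather than leading on that dataset.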


Source journal
International Journal of Imaging Systems and Technology
Category: Engineering & Technology, Imaging Science & Photographic Technology
CiteScore: 6.90
Self-citation rate: 6.10%
Articles published: 138
Review time: 3 months
About the journal: The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals. IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging. The journal is also open to imaging studies of the human body and on animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, replication studies as well as negative results are also considered. The scope of the journal includes, but is not limited to, the following in the context of biomedical research: Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS etc.; Neuromodulation and brain stimulation techniques such as TMS and tDCS; Software and hardware for imaging, especially related to human and animal health; Image segmentation in normal and clinical populations; Pattern analysis and classification using machine learning techniques; Computational modeling and analysis; Brain connectivity and connectomics; Systems-level characterization of brain function; Neural networks and neurorobotics; Computer vision, based on human/animal physiology; Brain-computer interface (BCI) technology; Big data, databasing and data mining.