Applied Intelligence: Latest Articles

Learning promotion policies with attention-based deep Q-networks
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-10-03 DOI: 10.1007/s10489-025-06914-3
Yingnan Xu, Xuchun Wu, Zhenjun Li, Congli Liu, Yansheng Zhang
Abstract: In financial services, personalized promotion strategies are critical for sustaining customer engagement and driving asset growth. We present FAT-DQN, a deep reinforcement learning framework for offline environments that models sequential decision-making as a Markov Decision Process (MDP), where promotional actions influence future changes in customer assets under management (AUM). FAT-DQN extends the standard Deep Q-Network (DQN) architecture with a multi-head self-attention mechanism over promotion–reward histories augmented by learnable temporal encodings, and applies Feature-wise Linear Modulation (FiLM) to incorporate customer-segment embeddings. To improve robustness, we employ per-customer reward normalization and evaluate policies with both ranking-based metrics and counterfactual off-policy estimators. Empirical results on real promotion logs show that FAT-DQN consistently outperforms baseline methods, yielding a higher mean NDCG@3 (0.7744) compared to Batch-Constrained deep Q-learning (BCQ, 0.7325) and DQN (0.6852). It further improves alignment between predicted and realized outcomes, achieving a Spearman correlation of 0.2584, compared to 0.1619 for BCQ and 0.1522 for DQN. Counterfactual evaluations further show that FAT-DQN delivers consistently strong off-policy estimates, confirming its robustness across evaluation settings. These findings demonstrate that attention-based architectures with modulation offer a more effective and interpretable alternative to standard reinforcement learning approaches for personalized promotion planning in financial services.
Volume 55, Issue 15
Citations: 0
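
The FAT-DQN abstract above reports mean NDCG@3 as its ranking metric. The following is a minimal sketch of how NDCG@k is commonly computed for one customer, assuming linear gain, a log2 position discount, and non-negative relevance values. The abstract does not state the exact gain definition used in the paper, and the function names `dcg_at_k` and `ndcg_at_k` are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(scores, relevances, k=3):
    """NDCG@k for one customer.

    scores:     predicted value per candidate action (e.g. Q-values; an assumption)
    relevances: realized, non-negative outcome per candidate action
    """
    # Rank candidate actions by predicted score, best first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    # The ideal ranking sorts by the realized outcome itself.
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    print(ndcg_at_k([0.9, 0.1, 0.5], [3, 2, 1], k=3))
```

A mean NDCG@3 like the one quoted in the abstract would then be the average of this per-customer value over the evaluation set.
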
Multimodal prompt learning with selective feature fusion: towards robust cross-modal alignment
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-10-03 DOI: 10.1007/s10489-025-06919-y
Jiabao Han, Yahui Wang, Wei Zhong, Ying Zhang, Xichao Yuan
Abstract: Vision–language models (VLMs) have shown impressive transferability but still struggle with robustness and generalization when applied to downstream tasks with limited supervision. To address these challenges, we propose a Selective Feature Fusion (SFF) framework that adaptively suppresses noisy visual regions and reinforces task-relevant cross-modal cues through lightweight, learnable gating. Our approach integrates text-guided visual masking and image-aware textual calibration into a unified pipeline, enabling more discriminative and semantically aligned multimodal representations. Comprehensive evaluations across nine widely used benchmarks demonstrate that our method consistently surpasses strong prompt-learning baselines under both few-shot and base-to-novel generalization settings. In particular, under the 8-shot scenario, our approach achieves the best overall accuracy, maintaining a clear margin over representative methods such as CoCoOp and MaPLe. These results highlight not only the robustness of our design but also its effectiveness in capturing cross-modal semantics under data-limited conditions. Further analyses, including ablation studies and qualitative visualizations, confirm that the proposed gating and calibration modules are complementary and play indispensable roles in improving performance. Taken together, this work provides a simple yet powerful strategy for enhancing the adaptability and generalization of VLMs in real-world scenarios.
Volume 55, Issue 15
Citations: 0
DIMCAR: dynamic intent modeling and context-aware recommendations in sparse data environment towards next basket prediction
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-10-03 DOI: 10.1007/s10489-025-06796-5
John Kingsley Arthur, Conghua Zhou, Xiang-Jun Shen, Ronky Wrancis Amber-Doh, Eric Appiah Mantey, Jeremiah Osei-Kwakye
Abstract: In the fast-changing world of e-commerce, the success of recommender systems is crucial for boosting user engagement and increasing sales. Conventional models often struggle with evolving user preferences and data sparsity, hindering accurate predictions. Existing graph-based regularization mechanisms and deep learning approaches address these challenges but remain sensitive to noise and computational complexity, limiting their effectiveness in large-scale, real-time settings. To overcome these limitations, we propose a novel multi-layered Next Basket Recommender System called the dynamic intent modelling and context-aware recommendation (DIMCAR) model. First, we resolve the data sparsity problem by constructing a novel optimized Graph Sparse Regularization framework for Non-negative Matrix Factorization (OGSR-NMF) that integrates a time-varying graph structure, a novel hybrid sparsity norm, and a modified Proximal Alternating Linearized Minimization (mPALM). Additionally, we dynamically model user intents and context using attention mechanisms and Gated Recurrent Units (GRUs). Finally, we integrate a novel Adaptive Reptile Basket Optimization Algorithm into a Deep Convolutional Neural Network, enhancing the model's adaptability to changing user behaviours in real time. Theoretical analysis and experiments on four benchmark datasets demonstrate that DIMCAR outperforms existing models in recommendation accuracy and user satisfaction.
Volume 55, Issue 15
Citations: 0
EGPT-SPE: story point effort estimation using improved GPT-2 by removing inefficient attention heads
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-10-02 DOI: 10.1007/s10489-025-06824-4
Amna Shahid Cheemaa, Muhammad Azhar, Fahim Arif, Qazi Mazhar ul haq, Muhammad Sohail, Asma Iqbal
Abstract: Estimating story points from user requirements is crucial in the Software Development Life Cycle (SDLC) as it impacts resource allocation and timelines; inaccuracies can lead to missed deadlines and increased costs, harming a company's reputation. While various techniques have emerged to automate this process, conventional machine learning methods often fail to understand the context of user requirements, and deep learning approaches face high computational costs. To address these issues, the Efficient GPT for Story Point Estimation (EGPT-SPE) algorithm optimizes the Multi-Head Attention module by removing inefficient heads, enhancing accuracy and reducing costs. Experiments on the Choetkiertikul dataset (23,313 issues across 16 open-source projects) and the TAWOS dataset (458,232 issues across 39 open-source projects from 12 public JIRA repositories) demonstrated a 5 to 15 percent accuracy improvement in both within-project and cross-project estimations, validating the algorithm's effectiveness in agile story point estimation.
Volume 55, Issue 15
Citations: 0
RXNet: cross-modality person re-identification based on a dual-branch network
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-10-01 DOI: 10.1007/s10489-025-06501-6
Weiyang Zhang, Jiong Guo, Qiang Liu, Maoyang Zou, Honggang Chen, Jing Peng
Abstract: The goal of text-based person re-identification (TI-ReID) is to match individuals by integrating information from both images and text. TI-ReID encounters significant challenges because of the clear differences in features between images and textual descriptions. Contemporary techniques commonly merge general and specific characteristics to obtain more detailed feature representations. However, these techniques depend on additional models for estimating or segmenting human poses to determine local characteristics, making them challenging to apply in practice. To solve this problem, we propose a dual-path network based on RegNet and XLNet for TI-ReID (RXNet). In the image segment, RegNet is employed to acquire multitiered semantic image attributes and dynamically assimilate distinct local features through visual focus. In the text segment, XLNet is utilized to extract significant semantic attributes from the text via a two-way encoding system based on an autoregressive model. Furthermore, to increase the efficacy of our model, we develop both residual triplet attention and dual attention to align features across different modalities. Additionally, we replace cross-entropy ID loss with smoothing ID loss to prevent overfitting while improving the efficiency of the model. Experimental results on the CUHK-PEDES dataset show that the proposed method achieves a rank-1/mAP accuracy of 85.49%/73.40%, outperforming the current state-of-the-art methods by a large margin.
Volume 55, Issue 15
Citations: 0
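
The RXNet abstract above mentions replacing the cross-entropy ID loss with a smoothing ID loss. The sketch below shows standard label-smoothed cross-entropy over identity classes, which is one common reading of that phrase. The abstract does not give the paper's exact formulation, so the smoothing scheme, the default `epsilon=0.1`, and the name `smoothed_id_loss` are all assumptions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def smoothed_id_loss(logits, target, epsilon=0.1):
    """Label-smoothed cross-entropy for a single sample.

    The target distribution puts (1 - epsilon) on the true identity and
    spreads epsilon uniformly over all K identities (assumed scheme).
    """
    k = len(logits)
    log_probs = log_softmax(logits)
    loss = 0.0
    for i, lp in enumerate(log_probs):
        q = epsilon / k + ((1.0 - epsilon) if i == target else 0.0)
        loss -= q * lp
    return loss


if __name__ == "__main__":
    logits = [2.0, 0.0, 0.0]
    print(smoothed_id_loss(logits, target=0, epsilon=0.0))  # plain cross-entropy
    print(smoothed_id_loss(logits, target=0, epsilon=0.1))  # smoothed variant
```

With `epsilon=0.0` the function reduces to ordinary cross-entropy; a positive `epsilon` penalizes over-confident predictions, which is the usual motivation for using smoothing against overfitting.
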
Deep learning techniques for point cloud tasks: a review
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-09-30 DOI: 10.1007/s10489-025-06854-y
Xiaona Song, Haozhe Zhang, Lijun Wang, Jinxing Niu, Ying Zhu, Junjie Nian, Ruixue Cheng
Abstract: As a significant means of representing 3D scenes, point clouds are extensively utilized in various fields such as computer vision, autonomous driving, robotic interaction, and urban modeling. Deep learning has achieved remarkable success in the realm of two-dimensional images, and its application to three-dimensional point clouds is also progressively gaining traction. However, the irregular and unstructured nature of point cloud data presents numerous challenges when applying deep learning algorithms to these 3D representations. To foster future research endeavors, this paper concentrates on three fundamental tasks associated with point clouds: classification, object detection, and semantic segmentation. It systematically reviews the current state of development regarding deep learning algorithms pertinent to these tasks. By organizing and analyzing existing literature alongside experimental results derived from publicly available datasets, this paper compares the strengths of different methodologies while also highlighting their limitations. Ultimately, it summarizes the technical challenges encountered in advancing deep learning algorithms for point clouds and outlines potential avenues for progress within this domain.
Volume 55, Issue 15
Citations: 0
Balancing act: engagement detection in online learning through master-assistant models with an enhanced hierarchical attention mechanism
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-09-30 DOI: 10.1007/s10489-025-06893-5
Tingting Han, Ruqian Liu, Shuwei Dou, Wei Wang, Xiaoming Ding, Wenxia Zhang, Jihao Lang, Wenxuan Li, Jixing Han
Abstract: The rapid expansion of online learning calls for the establishment of effective approaches to monitor and boost student engagement, which constitutes a key element influencing learning outcomes. The class imbalances within engagement datasets pose substantial challenges to precise detection and classification. Existing methods for detecting student engagement in online learning adopt weighted loss to address the issue of class imbalance in public datasets. However, due to the challenge of selecting appropriate weights and the risk of overfitting, the effectiveness of this approach often relies on extensive experiments for manual adjustments. To tackle this problem, we propose a Master-Assistant model that addresses the performance degradation caused by class imbalance and ensures effective detection of student engagement. The Assistant model performs coarse-grained classification according to different assistant strategies in order to assist the Master model with fine-grained classification. Furthermore, we extract multiple engagement-related handcrafted features and assign them different weights via an enhanced hierarchical attention mechanism. Finally, an accuracy of 70.69% and an F1-score of 68% are achieved on the Dataset for Affective States in E-Environments (DAiSEE), setting new state-of-the-art (SOTA) scores. Additionally, experiments on three other imbalanced datasets also validate the robustness of the Master-Assistant model in solving the class imbalance problem.
Volume 55, Issue 15
Citations: 0
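
The abstract above contrasts the Master-Assistant model with the weighted-loss baseline used by existing methods. For context, here is a minimal sketch of that baseline idea, assuming inverse-frequency ("balanced") class weights and a weighted mean of per-sample cross-entropy. It illustrates the approach the abstract says is hard to tune; it is not the paper's method, and both function names are illustrative.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (num_classes * class_count) (assumed heuristic)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class) over a batch.

    probs:  one list of class probabilities per sample, indexed by class id
    labels: true class id per sample
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


if __name__ == "__main__":
    labels = [0, 0, 0, 1]  # class 1 is the minority
    weights = inverse_frequency_weights(labels)
    probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
    print(weights)
    print(weighted_cross_entropy(probs, labels, weights))
```

The weights here come from a fixed heuristic; the abstract's point is that choosing such weights well usually takes manual experimentation.
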
Detection method for improving shape perception of small object defects on metal surfaces
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-09-29 DOI: 10.1007/s10489-025-06873-9
Xingfei Zhu, Christophe Montagne, Qimeng Wang, Lingxiang Hu, Jinghu Yu, Hedi Tabia, Qianqian Hu
Abstract: Defects on metal surfaces are often complex, with diverse shapes, small sizes, and irregular patterns, which leads to frequent missed and false detections during inspection and poses significant challenges to automated detection systems. Existing advanced object detectors, when applied directly to small defect detection on metal surfaces, fail to achieve satisfactory results. To mitigate these issues, we propose MetalYOLO, a detection method that enhances the shape perception of small object defects on metal surfaces. First, a novel location-aware attention mechanism is combined with deformable convolutions to form a new feature selection module that sharpens the focus on key defect features, optimizes the generation of offsets, and improves the model's ability to adapt to objects with complex shapes. Second, the standard up-sampling module is replaced with a dynamic sampling module that adjusts the sampling pattern to the input feature distribution, improving computational efficiency and retaining complex or small-scale object features, thereby improving detection accuracy. Finally, a new detail-enhanced detection head further improves the network's ability to capture fine-grained details by introducing a detail-enhanced attention-sharing module that uses contextual information to selectively suppress irrelevant features, thereby reducing information redundancy. The proposed model is compared with baseline models on the ILS-MB and NEU-DET datasets, and the experimental results show significant improvements in false detection and missed detection rates with only a slight loss in inference speed. Meanwhile, the mAP reached 80.4% and 79.0%, respectively, which is 1.7% and 3.2% higher than the baseline algorithm.
Volume 55, Issue 15
Citations: 0
TransMambaCC: Integrating Transformer and Pyramid Mamba Network for RGB-T Crowd Counting
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-09-29 DOI: 10.1007/s10489-025-06912-5
Yangjian Chen, Huailin Zhao, Liangjun Huang, Yubo Yang, Wencan Kang, Jianwei Zhang
Abstract: RGB-T crowd counting is a challenging task that integrates RGB and thermal images to address the limitations of RGB-only approaches in scenes with poor illumination or occlusion. While transformer-based models have shown remarkable success in capturing long-range dependencies, their high computational demands limit their practical applicability. To address this issue, a novel hybrid model named TransMambaCC, which fuses the analytical strength of the transformer with the computational efficiency of Mamba, is proposed. This integration not only improves crowd analysis performance but also significantly reduces the computational overhead of the model. Additionally, a Pyramid Mamba module is designed to address the head-scale variations observed in congested scenes. Extensive experiments conducted on the RGBT-CC dataset demonstrate the superiority of TransMambaCC over existing approaches in terms of both accuracy and efficiency. Furthermore, the model exhibits strong generalization capabilities, as evidenced by its performance on the ShanghaiTechRGBD dataset. The code is available at https://github.com/yjchen3250/TransMambaCC.
Volume 55, Issue 15
Citations: 0
Enhancing vitiligo stage diagnosis through a reliable multimodal model with uncertainty calibration
IF 3.5 | Zone 2 | Computer Science
Applied Intelligence Pub Date : 2025-09-29 DOI: 10.1007/s10489-025-06839-x
Zhiming Li, Shuying Jiang, Fan Xiang, Chunying Li, Shuli Li, Tianwen Gao, Kaiqiao He, Jianru Chen, Junpeng Zhang, Junran Zhang
Abstract: Vitiligo is a common dermatological disease featuring hypopigmentation. Accurate staging of vitiligo is crucial for enhancing treatment efficacy. However, traditional diagnostic methods, which rely on physicians' subjective judgments, are time-consuming, labor-intensive, and prone to misdiagnosis. Recently, AI-powered multimodal dermatological classification models have demonstrated significant potential in this area, but the credibility of these models at the decision-making stage requires further refinement. This study proposes a multimodal disease staging diagnostic model with uncertainty calibration to analyze multimodal samples from three stages of vitiligo. The model extracts feature information from various modalities and transforms it into a Dirichlet distribution to assess sample uncertainty. Then, Dempster-Shafer theory is used to fuse evidence from different modalities, generating a final diagnostic result and an uncertainty score. Additionally, an uncertainty-based loss function is designed. By applying an uncertainty threshold, the model can detect high-uncertainty samples that require additional judgment, effectively reducing the risk of misdiagnosis and missed diagnosis. Experimental results show that this model outperforms existing methods in terms of accuracy, precision, recall, and F1-score. Anomaly detection and noise-resistance experiments verify the model's robustness in handling unknown and noisy data. This model offers a new approach for AI-assisted vitiligo diagnosis, which can assist doctors in making more accurate diagnostic decisions and contribute to improving treatment efficiency.
Volume 55, Issue 15
Citations: 0
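
The vitiligo abstract above describes mapping per-modality features to a Dirichlet distribution, fusing modalities with Dempster-Shafer theory, and flagging high-uncertainty samples with a threshold. The sketch below follows the subjective-logic form of that pipeline that is commonly used in evidential multi-view classification. The abstract does not specify the paper's exact combination rule, so the formulas, the function names, and the threshold value of 0.3 are all assumptions.

```python
def opinion_from_evidence(evidence):
    """Turn non-negative per-class evidence into belief masses and an uncertainty mass.

    Dirichlet parameters are alpha_k = e_k + 1 with strength S = sum(alpha);
    belief b_k = e_k / S and uncertainty u = K / S, so sum(b) + u = 1.
    """
    k = len(evidence)
    strength = sum(evidence) + k
    return [e / strength for e in evidence], k / strength


def combine(op1, op2):
    """Reduced Dempster rule for two opinions over the same set of classes."""
    (b1, u1), (b2, u2) = op1, op2
    n = len(b1)
    # Conflict: mass the two modalities assign to different classes.
    conflict = sum(b1[i] * b2[j] for i in range(n) for j in range(n) if i != j)
    scale = 1.0 - conflict
    belief = [(b1[c] * b2[c] + b1[c] * u2 + b2[c] * u1) / scale for c in range(n)]
    return belief, (u1 * u2) / scale


def needs_review(uncertainty, threshold=0.3):
    """Flag a sample for additional judgment (threshold value is hypothetical)."""
    return uncertainty > threshold


if __name__ == "__main__":
    modality_a = opinion_from_evidence([9.0, 0.0, 1.0])  # three stages
    modality_b = opinion_from_evidence([4.0, 1.0, 0.0])
    belief, u = combine(modality_a, modality_b)
    print(belief, u, needs_review(u))
```

In this formulation, two modalities that agree produce a fused uncertainty lower than either input, while a modality with little evidence contributes mostly uncertainty mass and so has limited influence on the fused belief.
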