{"title":"A comprehensive review of AI methods in upper extremity/limb bone fracture detection","authors":"Zahra Moradi Pour, Stefano Berretti","doi":"10.1007/s10462-025-11296-6","DOIUrl":"10.1007/s10462-025-11296-6","url":null,"abstract":"<div><p>Accurate detection of bone fractures is crucial for patient care, however, the traditional manual review of medical images like X-rays, Computed Tomography (CT) scans, Magnetic Resonance Imaging (MRIs), and ultrasounds is time-consuming and labor-intensive. The shortage of clinicians, limited access to expert radiologists, and heavy workloads increase the risk of errors, which can slow down patients recovery. Artificial Intelligence (AI) models like Faster R-CNN have shown significant diagnostic accuracy (ACC) and sensitivity (SEN), often outperforming on-call radiologists in detecting complex fracture types. For example, Faster R-CNN has achieved SEN exceeding 90% in distal radius fracture detection. However, despite these advancements, AI-driven fracture detection systems still face several challenges, including the need for extensive annotated datasets, variability in imaging quality across clinical settings, potential biases in model training, and concerns regarding the interpretability and reliability of AI-generated predictions. This review provides a comprehensive analysis of recent advancements and limitations in AI-based fracture detection, offering quantitative insights into model performance. By examining these aspects, the study highlights the importance of integrating AI systems into clinical workflows, while addressing existing barriers to their widespread adoption. This analysis underscores AI’s potential to enhance diagnostic efficiency, reduce human error, and improve patient outcomes.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11296-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145164653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the frontiers of LLMs in psychological applications: a comprehensive review","authors":"Luoma Ke, Song Tong, Peng Cheng, Kaiping Peng","doi":"10.1007/s10462-025-11297-5","DOIUrl":"10.1007/s10462-025-11297-5","url":null,"abstract":"<div><p>This review explores the frontiers of large language models (LLMs) in psychological applications. Psychology has undergone several theoretical changes, and the current use of artificial intelligence (AI) and machine learning, particularly LLMs, promises to open up new research directions. We aim to provide a detailed exploration of how LLMs are transforming psychological research. We discuss the impact of LLMs across various branches of psychology—including cognitive and behavioral, clinical and counseling, educational and developmental, and social and cultural psychology—highlighting their ability to model patterns, cognition, and behavior similar to those observed in humans. Furthermore, we explore the ability of such models to generate coherent, contextually relevant text, offering innovative tools for literature reviews, hypothesis generation, experimental designs, experimental subjects, and data analysis in psychology. We emphasize the importance of addressing technical and ethical challenges, including data privacy, the ethics of using LLMs in psychological research, and the need for a deeper understanding of these models’ limitations. Researchers should use LLMs responsibly in psychological studies, adhering to ethical standards and considering the potential consequences of deploying these technologies in sensitive areas. Overall, this review provides a comprehensive overview of the current state of LLMs in psychology, exploring the potential benefits and challenges. We hope it can serve as a call to action for researchers to responsibly leverage LLMs’ advantages while addressing the associated risks.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11297-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145164654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear projection fused graph-based semi-supervised learning on multi-view data","authors":"Jingjun Bi, Fadi Dornaika, Jinan Charafeddine","doi":"10.1007/s10462-025-11313-8","DOIUrl":"10.1007/s10462-025-11313-8","url":null,"abstract":"<div><p>In recent years, the surge in data-driven applications across various domains has spurred heightened interest in semi-supervised learning applied to graphs. This surge is attributed to the ubiquitous presence of graph data structures in real-world contexts, such as social networks’ interpersonal relationships, recommender systems’ user behavior graphs, and bioinformatics’ molecular interaction networks. However, for certain data types like images, not only is there a dearth of explicit graph structure, but also the existence of multiple view description methods complicates matters further. The intricacies of multi-view data pose challenges in directly applying traditional semi-supervised learning techniques to graphs. Consequently, researchers have begun exploring the fusion of semi-supervised learning with deep learning to leverage its wealth of information and enhance model efficacy. Effectively amalgamating graph structures with multi-view data remains a challenging problem necessitating further research. This paper introduces the Linear projection Fused Graph-based Semi-supervised Classification (LFGSC) method tailored for multi-view data, building upon the Graph Convolutional Network (GCN) architecture. Firstly, for each view, we leverage a semi-supervised approach that provides the concurrent estimation of the corresponding graph and the flexible linear data representations in a low-dimensional feature space. Subsequently, an adaptive and unified graph is generated, followed by the utilization of a fully connected network to fuse the projected features further and reduce dimensionality. Finally, the fused features and graph are inputted into a GCN to conduct semi-supervised classification. During training, the model incorporates cross-entropy loss, manifold regularization loss, graph auto-encoder loss, and supervised contrastive loss. Leveraging linear transformation significantly diminishes the input feature dimensions for GCN, thereby achieving high accuracy while substantially reducing computational overhead. Furthermore, experimental results conducted on various bench-marked multi-view image datasets demonstrate the superiority of LFGSC over existing semi-supervised learning methods for multi-view scenarios. (Source code: https://github.com/BiJingjun/LFGSC.)</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11313-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145164655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QDeepColonNet: a quantum-based deep learning network for colorectal cancer classification using attention-driven DenseNet and shuffled dynamic local feature extraction network","authors":"Armaano Ajay, R. Karthik, Akshaj Singh Bisht, Abhay Karan Singh","doi":"10.1007/s10462-025-11295-7","DOIUrl":"10.1007/s10462-025-11295-7","url":null,"abstract":"<div><p>Colorectal Cancer (CRC) is one of the most common and severe types of cancer globally, affecting millions of people each year. It primarily develops from benign polyps in the colon or rectum, which can turn malignant if not detected and treated early, leading to serious health risks. Current diagnostic methods for CRC detection are primarily manual and require significant time, resources and expertise. This creates a pressing need for automated solutions that are both efficient and highly accurate. This research proposes a hybrid Deep Learning (DL) and Quantum Machine Learning (QML)-based system for CRC classification, designed to address these challenges using a dual-track approach. The proposed QDeepColonNet leverages DL for robust feature extraction, combining DenseNet with an Enhanced Feature Learnable Group Attention (EFLGA) block to capture both high and mid-level features. Additionally, it integrates the Shuffled Dynamic Local Feature Extraction Network (SDLFEN) with a Lightweight Multi-Kernel Convolution (LMKC) block to capture short-range dependencies. The concatenated feature maps from both tracks are further refined by Efficient Channel Attention (ECA), enhancing cross-channel interactions without increasing complexity. Finally, the refined features are classified using a QML-based classifier, which effectively handles intricate data and captures complex feature relationships. To the best of our understanding, this is the first study to incorporate a QML-based hybrid classification network CRC detection. The performance of the proposed QDeepColonNet surpassed several state-of-the-art DL models and achieved a classification accuracy of 98.92% when tested on the EBHI dataset.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11295-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145163116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient and autonomous detection of olive leaf diseases using AI-enhanced MetaFormer","authors":"Ishak Pacal, Serhat Kilicarslan, Burhanettin Ozdemir, Muhammet Deveci, Seifedine Kadry","doi":"10.1007/s10462-025-11131-y","DOIUrl":"10.1007/s10462-025-11131-y","url":null,"abstract":"<div><p>Agriculture forms the cornerstone of global food security, with olives playing a pivotal role not only as a food source but also in cosmetics, medicine, and other industries. However, diseases affecting olive trees pose significant threats to agricultural productivity and economic stability, underscoring the need for innovative detection solutions. A promising solution to these challenges is the development of deep learning-based computer-aided diagnostic applications, which have shown remarkable success in various fields, especially in recent years. This study presents a novel deep-learning approach for olive leaf disease detection, introducing a MetaFormer-based architecture that combines the power of transformer-based components, specifically separable self-attention, with the efficiency of a lightweight design. The proposed model was evaluated using two distinct datasets, Dataset-1 and Dataset-2, where it achieved impressive accuracy rates of 99.31% and 96.91%, respectively. When compared to other cutting-edge models such as Swin-Base, MaxViT-Base, DeiT3-Base, CAFormer-s18, CAFormer-m36, ResNet50, and MobileNetv3, the Proposed Model outperformed them in terms of accuracy, precision, recall, and F1-score. These advancements were made possible through the incorporation of separable self-attention, which allows for capturing both local and global dependencies in olive leaf images, and a streamlined architecture that reduces computational complexity without sacrificing performance. Furthermore, Grad-CAM visualizations highlighted the interpretability of the model, confirming its ability to focus on disease-relevant regions of the images. This study offers a significant contribution to the field of agricultural disease detection, particularly in olive farming, and sets the stage for future work in adapting the model for other crops and real-time applications in agriculture.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11131-y.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145162902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decoding synthetic news: an interpretable multimodal framework for the classification of news articles in a novel news corpus","authors":"Michael Schlee, Gillian Kant, Christoph Ehrling, Benjamin Säfken, Thomas Kneib","doi":"10.1007/s10462-025-11188-9","DOIUrl":"10.1007/s10462-025-11188-9","url":null,"abstract":"<div><p>Recent advancements in Artificial Intelligence (AI), notably the development of Large Language Models (LLMs) and text-to-image diffusion models, have facilitated the creation of realistic textual content and images. Specifically, platforms like ChatGPT and Midjourney have simplified the creation of high-quality text and visuals with minimal expertise and cost. The increasing sophistication of Generative AI presents challenges in ensuring the integrity of news, media, and information quality, making it increasingly difficult to distinguish between real and artificially generated textual and visual content. Our work addressed this problem in two ways. First, by means of ChatGPT and Midjourney, we created a comprehensive novel multimodal news corpus named <i>SyN24News</i> based on the <i>N24News</i> corpus, on which we evaluated our model. Second, we developed a novel explainable synthetic news detector for discriminating between real and synthetic news articles. We leveraged a Neural Additive Model (NAM)-like network structure that ensures effect separation by handling input data in separate subnetworks. Complex structures and patterns are extracted by deep features from unstructured data, i.e., images and texts, using fine-tuned VGG and DistilBERT subnetworks. We ensured further explainability by individually processing carefully chosen handcrafted text and image features in simple Multilayer Perceptrons (MLPs), allowing for graphical interpretation of corresponding structured effects. Our findings indicate that textual information are the main drivers in the decision-making finding process. Structured textual effects, particularly Flesch-Kincaid reading ease and sentiment, have a much higher influence on the classification outcome than visual features such as dissimilarity and homogeneity.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11188-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145162903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing deep learning with improved Harris Hawks optimization for Alzheimer’s disease detection","authors":"Qian Zhang, Jinhua Sheng, Qiao Zhang, Ze Yang, Yu Xin, Binbing Wang, Rong Zhang, for the Alzheimer’s Disease Neuroimaging Initiative","doi":"10.1007/s10462-025-11304-9","DOIUrl":"10.1007/s10462-025-11304-9","url":null,"abstract":"<div><p>As the global population ages, Alzheimer’s disease (AD) poses a significant worldwide challenge as a leading cause of dementia, with a slow early progression that eventually leads to nerve cell death and currently lacks effective treatment. However, early diagnosis can slow its progression through pharmaceutical intervention, making accurate early diagnosis using computer-aided diagnosis (CAD) systems crucial. This study aims to enhance the accuracy of early AD diagnosis by developing an improved optimization approach for deep learning-based CAD systems. To achieve this, this paper proposes an improved Harris Hawks optimization algorithm (HHO), named CAHHO, which incorporates crisscross search and adaptive β-Hill climbing mechanisms, thereby enhancing population diversity and search space coverage during the exploration phase, while adaptively adjusting the step size during the exploitation phase to improve local search precision. Comparative experiments with classical algorithms, HHO variants, and advanced optimization methods validate the superiority of the proposed CAHHO. Specifically, this study employs the deep learning model residual network with 18 layers (ResNet18) as the base model for AD diagnosis and uses CAHHO to optimize key hyperparameters, including the number of channels and learning rate. Experiments on the AD neuroimaging initiative dataset demonstrate that the ResNet18-CAHHO model outperforms existing methods in classifying AD, mild cognitive impairment (MCI), and normal control (NC) subjects. Specifically, it achieves accuracies of 0.93077, 0.80102, and 0.80513 in the diagnosis of AD versus NC, MCI versus NC, and AD versus MCI, respectively. Furthermore, Gradient-Weighted Class Activation Mapping (Grad-CAM) visualizations reveal critical brain regions associated with AD, providing valuable diagnostic support for clinicians and holding significant promise for early intervention.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11304-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145162904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advances in liver, liver lesion, hepatic vasculature, and biliary segmentation: a comprehensive review of traditional and deep learning approaches","authors":"Khyati Sethia, Petr Strakos, Milan Jaros, Jan Kubicek, Jan Roman, Marek Penhaker, Lubomir Riha","doi":"10.1007/s10462-025-11310-x","DOIUrl":"10.1007/s10462-025-11310-x","url":null,"abstract":"<div><h3>Background and motivation</h3><p>Liver segmentation plays a critical role in medical imaging, aiding in diagnosis, treatment planning, and surgical interventions for liver diseases. Precise segmentation of liver structures, including vessels, tumors, and other substructures, is essential for effective patient management. Traditional manual methods are time-consuming and prone to variability, prompting the development of automated techniques. This review aims to evaluate the evolution of liver segmentation methodologies, focusing on recent advancements in deep learning and hybrid approaches.</p><h3>Materials and methods</h3><p>This review follows the PRISMA guidelines for systematic analysis, including a detailed database search across PubMed, Web of Science, Scopus, and IEEE Xplore. The search focused on segmentation techniques for various liver structures using deep learning, traditional methods, and hybrid models. A total of 7819 studies were initially identified, with 190 selected for detailed analysis based on inclusion criteria like Dice Similarity Coefficient (DSC) metrics and clinical applicability.</p><h3>Results</h3><p>The analysis identified deep learning models, such as U-Net variants and Swin Transformer-based architectures, as leading methods for liver parenchyma and tumor segmentation, achieving DSC values up to 98.9% on benchmark datasets. For vessel segmentation, methods like DeepLabV3+ and the feature-based approaches demonstrated robustness across different datasets. Despite progress, challenges remain in segmenting structures like biliary ducts and hematomas due to limited annotated data and imaging variability.</p><h3>Discussion</h3><p>While deep learning has significantly improved segmentation accuracy, challenges such as class imbalance and variability across imaging modalities persist. Hybrid approaches that combine traditional image processing with advanced neural networks show potential for further improvement. Future research should focus on enhancing generalizability through multi-modal data integration and exploring semi-supervised learning methods to overcome data scarcity.</p><h3>Conclusion</h3><p>This comprehensive review highlights the advancements and ongoing challenges in liver segmentation, emphasizing the need for continuous innovation. By addressing current limitations, future methodologies can improve accuracy, efficiency, and clinical relevance, ultimately enhancing patient outcomes in hepatology.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11310-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145161863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Physics-informed machine learning for advancing computational medical imaging: integrating data-driven approaches with fundamental physical principles","authors":"Mohsen Ahmadi, Debojit Biswas, Maohua Lin, Frank D. Vrionis, Javad Hashemi, Yufei Tang","doi":"10.1007/s10462-025-11303-w","DOIUrl":"10.1007/s10462-025-11303-w","url":null,"abstract":"<div><p>Medical imaging is a cornerstone of modern healthcare, enabling precise diagnosis, treatment planning, and disease monitoring. Traditional machine learning (ML) approaches have significantly improved medical image analysis, yet they face challenges such as data scarcity, lack of interpretability, and variability in imaging protocols. Physics-Informed Machine Learning (PIML) offers a transformative solution by integrating fundamental physical laws, usually in partial differential equations and boundary conditions, into data-driven ML models. PIML constrains the solution space, enhances interpretability, and reduces the dependency on large, annotated datasets. This review provides an overview of the principles, methodologies, and applications of PIML in medical imaging, with a focus on imaging modalities such as MRI, CT, and ultrasound. We discuss the taxonomy of PIML approaches based on observational, inductive, and learning biases, showing their roles in enhancing model accuracy and generalization. Additionally, we explore the impact of PIML on image reconstruction, segmentation, enhancement, and anomaly detection, demonstrating its effectiveness in addressing noise, resolution, and diagnostic accuracy challenges. Despite its advantages, PIML faces challenges in the accurate representation of complex physiological processes, computational efficiency, and the integration of physics-based priors across diverse applications. This review points out future research directions including the development of hybrid models that combine PIML with deep learning techniques and large foundation models, improved benchmark datasets, and scalable algorithms for real-time applications. The findings of this review highlight PIML as a pivotal approach for advancing medical imaging, bridging the gap between theoretical models and practical implementation in clinical settings.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11303-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145161821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An overview of learning-based dexterous grasping: recent advances and future directions","authors":"Xu Song, Yongyao Li, Yunfan Zhang, Yufei Liu, Lei Jiang","doi":"10.1007/s10462-025-11262-2","DOIUrl":"10.1007/s10462-025-11262-2","url":null,"abstract":"<div><p>Recently, the practical implications of dexterous grasping technology have become a key point of research in robotics and artificial intelligence. At its core, this technology aims to empower robots to achieve human-level grasping capabilities. To help researchers quickly acquire the latest advancements, we have conducted a comprehensive review of the recent research developments, focusing on learning-based approaches, from two perspectives: Grasp Generation (GG) and Grasp Execution (GE). Specifically, GG refers to generating appropriate grasping poses for the target object. GE refers to executing grasp poses by motion planning and motion control. Afterwards, we introduce recent benchmark datasets and evaluation metrics. Based on these extensive benchmarks, we offer a comparative analysis of the state-of-the-art solutions. Lastly, we highlight several research directions that need to be further addressed, which will greatly facilitate the practical deployment of dexterous grasping technology in industrial manufacturing, household services, medical rehabilitation, <i>etc</i>. We believe it is a crucial area of research for future progress in robotic manipulation.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11262-2.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145161864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}