{"title":"Joint Detection of Rhythmic and Morphological Abnormalities in Electrocardiographic Images: A Multitask Learning Approach","authors":"Pharvesh Salman Choudhary;L.N. Sharma;Samarendra Dandapat","doi":"10.1109/TAI.2025.3530383","DOIUrl":"https://doi.org/10.1109/TAI.2025.3530383","url":null,"abstract":"The electrocardiogram (ECG) is the most widely used diagnostic tool for the characterization of heart function. Although automated methods of ECG interpretation can improve clinical care, most methods are designed for signal-based data. In this work, we consider images of paper-based representations of multichannel ECG to develop intelligent methods for its analysis. Cardiovascular abnormalities are manifested in ECG through either morphological alterations, rhythmic variations, or a combination of both. To effectively classify these cardiac abnormalities, we formulate a multitask learning framework comprising two primary tasks relating to the classification of morphological and rhythmic abnormalities and an auxiliary task on delineating regions pertaining to the primary tasks. We employ a dynamic task weighting approach based on homoscedastic uncertainty to balance the task-specific losses in the multitask framework. We evaluate our method on two databases: an internal database containing clinical ECG images obtained from multiple medical centres in Assam, India, and a second comprising ECG images extracted from a publicly available 12-lead ECG dataset. Experimental evaluation shows that our proposed deep architecture outperforms single-task learning counterparts and achieves promising performance for both morphological ailments and rhythm classification tasks. Results also demonstrate superior performance compared to other image-based state-of-the-art methods. 
Moreover, analysis of the post-hoc interpretation in the form of saliency maps verifies the model's performance and provides clinically meaningful inferences to its predictions.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1894-1905"},"PeriodicalIF":0.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Next-Generation Computer Vision in Veterinary Medicine: A Study on Canine Ophthalmology","authors":"Matija Burić;Marina Ivašić-Kos","doi":"10.1109/TAI.2025.3530380","DOIUrl":"https://doi.org/10.1109/TAI.2025.3530380","url":null,"abstract":"Given the achievements of state-of-the-art computer vision methods in recent years, the aim of this research was to examine the extent to which their application can help in detecting symptoms of eye diseases in dogs and diagnosing ophthalmological conditions, in order to provide owners with preliminary information about their pets' disease and to speed up diagnosis by veterinarians. In the research, clinical data on canine eye diseases including at least one of the 4 symptoms of the disease were collected and a set was formed to train the segmentation model, which was expanded with synthesized data generated using the LoRA Stable Diffusion model verified by an ophthalmologist. An extended segmentation model based on U-Net architecture with ResNet34 backbone was fine-tuned on the prepared set and compared to zero-training GPT-4o and Grounding SAM. The results show that the fine-tuned U-Net model gives the best segmentation results for eye disease symptoms, 97% on the pixel accuracy metric, and significantly outperforms the other tested methods. The segmentation masks are used as part of the prompts for GPT-4 and GPT-4o to generate diagnoses of diseases having the specified symptoms. The generated diagnostic results were evaluated using text evaluation metrics, and the most accurate diagnosis, with a BERT score of 84%, is achieved using GPT-4o in combination with the U-Net segmentation mask. 
The article proposes a pipeline that gives the best results and solutions to be considered for other diagnostic procedures in ophthalmology and veterinary medicine.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1884-1893"},"PeriodicalIF":0.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emergency Scheduling of Aerial Vehicles via Graph Neural Neighborhood Search","authors":"Tong Guo;Yi Mei;Wenbo Du;Yisheng Lv;Yumeng Li;Tao Song","doi":"10.1109/TAI.2025.3528381","DOIUrl":"https://doi.org/10.1109/TAI.2025.3528381","url":null,"abstract":"The thriving advances in autonomous vehicles and aviation have enabled the efficient implementation of aerial last-mile delivery services to meet the pressing demand for urgent relief supply distribution. Variable neighborhood search (VNS) is a promising technique for aerial emergency scheduling. However, the existing VNS methods usually exhaustively explore all considered neighborhoods with a prefixed order, leading to an inefficient search process and slow convergence speed. To address this issue, this article proposes a novel graph neural neighborhood search (GENIS) algorithm, which includes an online reinforcement learning (RL) agent that guides the search process by selecting the most appropriate low-level local search operators based on the search state. We develop a dual-graph neural representation learning method to extract comprehensive and informative feature representations from the search state. Besides, we propose a reward-shaping policy learning method to address the decaying reward issue along the search process. Extensive experiments conducted across various benchmark instances demonstrate that the proposed algorithm significantly outperforms the state-of-the-art approaches. 
Further investigations validate the effectiveness of the newly designed knowledge guidance scheme and the learned feature representations.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1808-1822"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Age-Aware UAV-Aided Energy Harvesting for the Design of Wireless Rechargeable Mobile Networks","authors":"Aditya Singh;Rajesh M. Hegde","doi":"10.1109/TAI.2025.3528377","DOIUrl":"https://doi.org/10.1109/TAI.2025.3528377","url":null,"abstract":"The proliferation of Internet of Things (IoT) technology has enhanced connectivity and automation in industries and daily life. The introduction of mobile IoT devices (IoTDs) has further expanded the productivity of these networks beyond conventional cyber–physical systems, resulting in wireless rechargeable mobile networks (WRMNs). However, the inherent limitations of low-powered IoTDs mandate their repetitive charging in dynamic environments. To address this, we propose radio frequency (RF) energy harvesting from unmanned aerial vehicles (UAVs) to supplement the energy needs of IoTDs. Moreover, the IoTDs’ mobility and nonuniform energy utilization are challenging for UAV scheduling in WRMNs. Additionally, maintaining a balance between efficient utilization of UAV energy and IoTD energy harvesting adds complexity to the problem. In this work, we introduce the age of charging (AoC) metric to quantify IoTDs’ repetitive charging and propose an energy-efficient UAV scheduling scheme to maximize UAV energy usage efficiency (EUE) in WRMNs. Moreover, a Markov decision process (MDP) is formulated to address UAV-EUE maximization. Subsequently, a deep reinforcement learning (DRL) scheme is proposed within the deep deterministic policy gradient (DDPG) framework to optimize UAV charging sequences. The DRL agent (UAV) autonomously learns optimal charging strategies considering IoTD mobility patterns, energy demand fluctuations, and IoTD energy-harvesting capabilities. Simulation results demonstrate the superiority of the proposed DRL algorithm over existing DRL-based UAV scheduling schemes, significantly enhancing the operational lifespan of WRMNs and ensuring network stability and continuous functionality. 
This motivates the adoption of the proposed DRL scheme for developing autonomous, energy-aware, next-generation IoT applications.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1797-1807"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meta-XAI for Explaining the Explainer: Unveiling Image Features Driving Deep Learning Decisions","authors":"Simone Bianco","doi":"10.1109/TAI.2025.3529397","DOIUrl":"https://doi.org/10.1109/TAI.2025.3529397","url":null,"abstract":"Deep learning has revolutionized computer vision by allowing neural networks to automatically learn features from data. However, the highly nonlinear nature of deep neural networks makes them difficult to interpret, leading to concerns about potential biases in critical applications. To address this, researchers have advocated for explainable artificial intelligence (XAI). Many XAI techniques have been proposed but all of them only highlight image regions influencing model decisions, lacking any further explanations. In this article, we propose a posthoc model-agnostic meta-XAI method that explains why specific image regions are used for decisions. The article presents the experimental setup and results, discussing the perturbations used for explanations in color, frequency, shape, shading, and texture. The explanation is given in terms of human-interpretable image features, e.g., color, shape, shading, and texture both as perturbation plots and as visual summary through the use of the newly introduced normalized area under the curve score. 
The experimental results confirm the previous findings that vision deep learning models are biased toward texture, but also highlight the importance of color, frequency content, and perceptually salient structures in the final decision.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1859-1869"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Trustworthy AIoT-Enabled Localization System via Federated Learning and Blockchain","authors":"Junfei Wang;He Huang;Jingze Feng;Steven Wong;Lihua Xie;Jianfei Yang","doi":"10.1109/TAI.2025.3528917","DOIUrl":"https://doi.org/10.1109/TAI.2025.3528917","url":null,"abstract":"There is a significant demand for indoor localization technology in smart buildings, and the most promising solution in this field is using radio frequency (RF) sensors and fingerprinting-based methods that employ machine learning models trained on crowd-sourced user data gathered from Internet of Things (IoT) devices. However, this raises security and privacy issues in practice. Some researchers propose to use federated learning (FL) to partially overcome privacy problems, but there still remain security concerns, e.g., single-point failure and malicious attacks. In this article, we propose a framework named DFLoc to achieve precise 3-D localization tasks while considering the following two security concerns. Particularly, we design a specialized blockchain to decentralize the framework by distributing the tasks such as model distribution and aggregation, which are handled by a central server to all clients in most previous works, to tackle the single-point failure issue in ensuring reliable and accurate indoor localization. Moreover, we introduce an updated model verification mechanism within the blockchain to alleviate the concern of malicious node attacks. 
Experimental results substantiate the framework's capacity to deliver accurate 3-D location predictions and its superior resistance to the impacts of single-point failure and malicious attacks when compared to conventional centralized FL systems.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1838-1848"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"G-Mix: A Generalized Mixup Learning Framework Toward Flat Minima","authors":"Xingyu Li;Bo Tang","doi":"10.1109/TAI.2025.3529816","DOIUrl":"https://doi.org/10.1109/TAI.2025.3529816","url":null,"abstract":"Deep neural networks (DNNs) have demonstrated promising results in various complex tasks. However, such DNN models face challenges related to over-parameterization, particularly in scenarios where training data are scarce. In response to these challenges and to improve the generalization capabilities of DNNs, the Mixup technique has emerged, which effectively addresses the limitations posed by over-parameterization. Nevertheless, it still produces suboptimal outcomes. Inspired by the successful sharpness-aware minimization (SAM) method, which establishes a connection between the sharpness of the training loss landscape and model generalization, we propose a new learning framework called Generalized-Mixup, which combines the strengths of Mixup and SAM for training DNN models. The theoretical analysis provided demonstrates how the developed G-Mix framework enhances generalization. Additionally, to further optimize DNN performance with the G-Mix framework, we introduce two novel algorithms: Binary G-Mix (BG-Mix) and Decomposed G-Mix (DG-Mix). These algorithms partition the training data into two subsets based on the sharpness-sensitivity of each example to address the issue of “manifold intrusion” in Mixup. 
Both theoretical explanations and experimental results reveal that the proposed BG-Mix and DG-Mix algorithms further enhance model generalization across multiple datasets and models, achieving state-of-the-art performance.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1870-1883"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AdaCoRCE Loss for Knowledge Distillation: A Novel Approach With Network Fission and Co-Teaching Technique","authors":"Shankey Garg;Pradeep Singh","doi":"10.1109/TAI.2025.3527402","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527402","url":null,"abstract":"Deep models have been successful in almost every research field, and they are capable of handling complex problem statements. However, most deep neural networks are huge, with millions or billions of parameters requiring heavy resources and computation to be deployed on edge devices. In this article, we present an efficient co-teaching strategy consisting of multiple small networks performing mutually at runtime to consistently improve the efficiency and generalization ability of neural networks. Unlike existing distillation mechanisms, which use a large-capacity pretrained teacher model to transfer knowledge to a smaller network unidirectionally, the proposed framework treats all the networks as ‘teachers’ (student-sized) and co-teaches them, allowing them to compute concurrently and quickly with better generalization. We have carefully divided the backbone network into small networks using depth scaling with regularization. Multiple small networks are used during the co-teaching process, and the proposed AdaCoRCE loss is used to make the networks learn from each other. During training, these networks are provided with two different views of the same data to increase their diversity. The co-teaching scheme allows the model to obtain stronger and unique representations of knowledge by using different data views and the AdaCoRCE loss. This article provides a generalized framework that could be applied to various network structures (e.g., MobileNets, ResNet, MixNet) and demonstrates efficient performance on a variety of histology image datasets. In this article, we have used four different publicly available histology datasets on two types of diseases to evaluate the performance of the proposed technique. 
Analysis on colorectal cancer and breast cancer histology images suggests that the proposed model enhances the overall performance of the model in terms of accuracy, GFLOPs and inference time. Further, the proposed framework is also analyzed using benchmark cifar-10 dataset and comparison of our result is done with several state-of-the-art results on mutual/collaborative learning. To the best of our knowledge, we analyzed that the proposed model outperformed these recent models in terms of accuracy, GFLOPs and inference time. Extensive result analysis on different histology benchmark datasets and benchmark cifar-10 dataset suggests that the proposed model is a generally applicable model that could be used for various computer vision-based tasks.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1776-1786"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning-Based Method for Irrigation Status Detection in Tomato Using Plant Leaves","authors":"Tej Bahadur Shahi;Chiranjibi Sitaula;Krishna Prasad Bhandari;Shobha Poudel;Rupesh Bhandari;Ravindra Mishra;Bharat Kumar Sharma;Bhogendra Mishra","doi":"10.1109/TAI.2025.3528926","DOIUrl":"https://doi.org/10.1109/TAI.2025.3528926","url":null,"abstract":"The impact of climate change, arguably global warming and resulting drought, is one of the most escalating agricultural challenges affecting crop productivity. Therefore, effective water management is critical in agricultural practices. The analysis of plant leaves presents an opportunity to gauge irrigation status through automated solutions to encourage broader adoption among farmers. Currently, there is a notable absence of AI methods in the literature for detecting tomato plant irrigation status through leaf analysis. Addressing this gap, we propose a novel end-to-end deep learning (DL)-based method, inspired by the ResNet-50 model. Our model trims unnecessary blocks and reduces larger kernels, significantly streamlining the model to better fit with the leaf image dataset related to the tomato irrigation status. We evaluate our method using a newly developed dataset and find outstanding performance (Precision: 99.05%, Recall: 99.01%, F1-score: 99.01%, mean-average F1: 98.98%, weighted-average F1: 98.95%, Kappa: 98.61%, accuracy: 98.90%) while comparing with the pretrained DL models. 
Additionally, our model has fewer parameters and lower floating-point operations (FLOPs), enhancing its efficiency and suggesting its potential for more cost-effective and productive irrigation management practices.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1849-1858"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}