Ying Guo;Bingxin Li;Kexin Zhen;Jie Liu;Gaolei Li;Qi Wang;Yong-Jin Liu
{"title":"Consistency-Heterogenity Balanced Fake News Detection via Cross-Modal Matching","authors":"Ying Guo;Bingxin Li;Kexin Zhen;Jie Liu;Gaolei Li;Qi Wang;Yong-Jin Liu","doi":"10.1109/TAI.2025.3527921","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527921","url":null,"abstract":"Generating synthetic content through generative AI (GAI) presents considerable hurdles for current fake news detection methodologies. Many existing detection approaches concentrate on feature-based multimodal fusion, neglecting semantic relationships such as correlations and diversities. In this study, we introduce an innovative cross-modal matching-driven approach to reconcile semantic relevance (text–image consistency) and semantic gap (text–image heterogeneity) in multimodal fake news detection. Unlike the conventional paradigm of multimodal fusion followed by detection, our approach integrates textual modality, visual modality (images), and text embedded within images (auxiliary modality) to construct an end-to-end framework. This framework considers the relevance of contents across different modalities while simultaneously addressing the gap in structures, achieving a delicate balance between consistency and heterogeneity. Consistency is fostered by evaluating intermodality correlation via pairwise-similarity scores, while heterogeneity is addressed by employing cross-attention mechanisms to account for intermodality diversity. To achieve equilibrium between consistency and heterogeneity, we employ attention-guided enhanced modality interaction and similarity-based dynamic weight assignment to establish robust frameworks. 
Comparative experiments conducted on the Chinese Weibo dataset and the English Twitter dataset demonstrate the effectiveness of our approach, surpassing the state-of-the-art by 7% to 13%.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1787-1796"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chengting Yu;Fengzhao Zhang;Hanzhi Ma;Aili Wang;Er-Ping Li
{"title":"Go Beyond End-to-End Training: Boosting Greedy Local Learning With Context Supply","authors":"Chengting Yu;Fengzhao Zhang;Hanzhi Ma;Aili Wang;Er-Ping Li","doi":"10.1109/TAI.2025.3528384","DOIUrl":"https://doi.org/10.1109/TAI.2025.3528384","url":null,"abstract":"Traditional end-to-end (E2E) training of deep networks necessitates storing intermediate activations for back-propagation, resulting in a large memory footprint on GPUs and restricted model parallelization. As an alternative, greedy local learning partitions the network into gradient-isolated modules and trains them in a supervised manner based on local preliminary losses, thereby providing asynchronous and parallel training methods that substantially reduce memory cost. However, empirical experiments reveal that as the number of gradient-isolated modules increases, the performance of the local learning scheme degrades substantially, severely limiting its scalability. To avoid this issue, we theoretically analyze greedy local learning from the standpoint of information theory and propose a ContSup scheme, which incorporates context supply between isolated modules to compensate for information loss. Experiments on benchmark datasets (i.e. 
CIFAR, SVHN, STL-10) achieve SOTA results and indicate that our proposed method can significantly improve the performance of greedy local learning with minimal memory and computational overhead, allowing the number of isolated modules to be increased.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1823-1837"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ben Fei;Yixuan Li;Weidong Yang;Wen-Ming Chen;Zhijun Li
{"title":"Multimodality Consistency for Point Cloud Completion via Differentiable Rendering","authors":"Ben Fei;Yixuan Li;Weidong Yang;Wen-Ming Chen;Zhijun Li","doi":"10.1109/TAI.2025.3527922","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527922","url":null,"abstract":"Point cloud completion aims to acquire complete and high-fidelity point clouds from partial and low-quality point clouds, which are used in remote sensing applications. Existing methods tend to solve this problem solely from the point cloud modality, limiting the completion process to only 3-D structure while overlooking the information from other modalities. Nevertheless, additional modalities possess valuable information that can greatly enhance the effectiveness of point cloud completion. The edge information in depth images can serve as a supervisory signal for ensuring accurate outlines and overall shape. To this end, we propose a brand-new point cloud completion network, dubbed multimodality differentiable rendering (MMDR), which utilizes point-based differentiable rendering (DR) to obtain the depth images to ensure that the model preserves the point cloud structures from the depth image domain. Moreover, the attentional feature extractor (AFE) module is devised to exploit the global features inherent in the partial input, and the extracted global features together with the coordinates and features of the patch center are fed into the point roots predictor (PRP) module to obtain a set of point roots for the upsampling module with point upsampling Transformer (PU-Transformer). Furthermore, the multimodality consistency loss between the depth images from predicted point clouds and corresponding ground truth enables the PU-Transformer to generate a high-fidelity point cloud with predicted point agents. 
Extensive qualitative and quantitative experiments on various existing datasets show that MMDR surpasses off-the-shelf methods for point cloud completion.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1746-1760"},"PeriodicalIF":0.0,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuangliang Li;Jinwei Wang;Hao Wu;Jiawei Zhang;Xin Cheng;Xiangyang Luo;Bin Ma
{"title":"Defense Against Adversarial Faces at the Source: Strengthened Faces Based on Hidden Disturbances","authors":"Shuangliang Li;Jinwei Wang;Hao Wu;Jiawei Zhang;Xin Cheng;Xiangyang Luo;Bin Ma","doi":"10.1109/TAI.2025.3527923","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527923","url":null,"abstract":"Face recognition (FR) systems, while widely used across various sectors, are vulnerable to adversarial attacks, particularly those based on deep neural networks. Despite existing efforts to enhance the robustness of FR models, they still face the risk of secondary adversarial attacks. To address this, we propose a novel approach employing “strengthened face” with preemptive defensive perturbations. Strengthened face ensures original recognition accuracy while safeguarding FR systems against secondary attacks. In the white-box scenario, the strengthened face utilizes gradient-based and optimization-based methods to minimize feature representation differences between face pairs. For the black-box scenario, we propose shielded gradient sign descent (SGSD) to optimize the gradient update direction of strengthened faces, ensuring the transferability and effectiveness against unknown adversarial attacks. Experimental results demonstrate the efficacy of strengthened faces in defending against adversarial faces without compromising the performance of FR models or face image visual quality. 
Moreover, SGSD outperforms conventional methods, achieving an average performance improvement of 4% in transferability across different attack intensities.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1761-1775"},"PeriodicalIF":0.0,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Network Output-Feedback Distributed Formation Control for NMASs Under Communication Delays and Switching Network","authors":"Haodong Zhou;Shaocheng Tong","doi":"10.1109/TAI.2025.3527404","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527404","url":null,"abstract":"This article studies the neural network (NN) output-feedback distributed formation control problem of nonlinear multiagent systems (NMASs) under communication delays and jointly connected switching network. Since the communication between agents is affected by time-varying delay and some agents cannot access the leader's information under jointly connected switching network, a communication-delay-related distributed formation observer is designed to estimate the leader's information and simultaneously mitigate the effects of communication delays. NNs are adopted to identify unknown functions, and an NN state observer is established to reconstruct unmeasurable states. Then, based on the designed distributed formation observer and NN state observer, an NN output-feedback distributed formation control algorithm is proposed by the backstepping control theory. It is proven that the designed communication-delay-related distributed formation observer errors converge to zero exponentially. Meanwhile, the proposed distributed NN formation control approach ensures the NMAS is stable, and the formation tracking errors converge to a small neighborhood around zero. 
Finally, we apply the output-feedback distributed formation control scheme to unmanned surface vehicles (USVs), and the simulation results verify its effectiveness.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 6","pages":"1591-1602"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144196569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stochastic Submodular Bandits With Delayed Composite Anonymous Bandit Feedback","authors":"Mohammad Pedramfar;Vaneet Aggarwal","doi":"10.1109/TAI.2025.3527375","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527375","url":null,"abstract":"This article investigates the problem of combinatorial multiarmed bandits with stochastic submodular (in expectation) rewards and full-bandit delayed feedback, where the delayed feedback is assumed to be composite and anonymous. In other words, the delayed feedback is composed of components of rewards from past actions, with unknown division among the subcomponents. Three models of delayed feedback (bounded adversarial, stochastic independent, and stochastic conditionally independent) are studied, and regret bounds are derived for each of the delay models. Ignoring the problem-dependent parameters, we show that the regret bound for all the delay models is $\\tilde{O}(T^{2/3}+T^{1/3}\\nu)$ for time horizon $T$, where $\\nu$ is a delay parameter defined differently in the three cases, thus demonstrating an additive term in regret with delay in all three delay models. The considered algorithm is demonstrated to outperform other full-bandit approaches with delayed composite anonymous feedback. 
We also demonstrate the generalizability of our analysis of the delayed composite anonymous feedback in combinatorial bandits as long as there exists an algorithm for the offline problem satisfying a certain robustness condition.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1727-1735"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CVDLLM: Automated Cardiovascular Disease Diagnosis With Large-Language-Model-Assisted Graph Attentive Feature Interaction","authors":"Xihe Qiu;Haoyu Wang;Xiaoyu Tan;Yaochu Jin","doi":"10.1109/TAI.2025.3527401","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527401","url":null,"abstract":"Electrocardiogram (ECG) measurements are essential for detecting and treating cardiovascular disease (CVD). However, manual evaluation of ECGs is prone to errors due to morphological variations. Although machine learning methods have shown promise in diagnosing diseases, automatic CVD diagnosis based on ECGs is still suffering from low diagnosis accuracy due to the limited usage of time-series information and interlead correlations. In this article, we propose a large language model (LLM)-assisted graph attentive feature interaction learning framework (CVDLLM) for automatic ECG diagnosis. It utilizes ECG data from twelve leads to classify eight heart diseases, including rhythm abnormalities and normal conditions. Our framework combines convolutional and recurrent neural networks for independent time-series feature extraction from 12-lead ECG signals. By incorporating features extracted by heart rate variability (HRV) analysis, we employ graph attention neural networks (GAT) and self-attentive feature interaction mechanism (GSAT) for feature interaction and model learning. Leveraging LLMs with pretrained knowledge bases and advanced language comprehension, we extract and learn semantic embeddings from medical case data. This approach equips our framework with a deep semantic layer, significantly enhancing its capacity to understand complex medical texts. Additionally, by representing the twelve leads as a graph, our framework enables highly accurate disease diagnosis based on spatial and temporal interactions with 12-lead ECG signals. 
In our evaluation, the proposed framework achieves state-of-the-art accuracy, precision, recall, and F1-score.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 6","pages":"1575-1590"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144196580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Muhammad Fahim;S. M. Ahsan Kazmi;Vishal Sharma;Hyundong Shin;Trung Q. Duong
{"title":"Edge Intelligence: A Deep Distilled Model for Wearables to Enable Proactive Eldercare","authors":"Muhammad Fahim;S. M. Ahsan Kazmi;Vishal Sharma;Hyundong Shin;Trung Q. Duong","doi":"10.1109/TAI.2025.3527400","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527400","url":null,"abstract":"Wearable devices are becoming affordable in our society to provide services from simple fitness tracking to the detection of heartbeat disorders. In the case of elderly populations, these devices have great potential to enable proactive eldercare, which can increase the number of years of independent living. The wearables can capture healthcare data continuously. For meaningful insight, deep learning models are preferable to process this data for robust outcomes. One of the major challenges includes deploying these models on edge devices, such as smartphones and wearables. The bottleneck is a large number of parameters and compute-intensive operations. In this research, we propose a novel knowledge distillation (KD) scheme by introducing a self-revision concept. This scheme effectively reduces model size and transfers knowledge from a deep model to a distilled model by filling learning gaps during the training. To evaluate our distilled model, a publicly available dataset, “growing old together validation (GOTOV)” is utilized, which is based on medical-grade standard wearables to monitor behavioral changes in the elderly. Our proposed model reduces the 0.7 million parameters to 1500, which enables edge intelligence. 
It achieves a 6% improvement in precision, a 9% increase in recall, and a 9% higher F1-score compared to the shallow model for recognizing elderly behavior.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1736-1745"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cyber Shadows: Neutralizing Security Threats With AI and Targeted Policy Measures","authors":"Marc Schmitt;Pantelis Koutroumpis","doi":"10.1109/TAI.2025.3527398","DOIUrl":"https://doi.org/10.1109/TAI.2025.3527398","url":null,"abstract":"The digital age, driven by the Artificial Intelligence (AI) revolution, brings significant opportunities but also conceals security threats, which we refer to as cyber shadows. These threats pose risks at individual, organizational, and societal levels. This article examines the systemic impact of these cyber threats and proposes a comprehensive cybersecurity strategy that integrates AI-driven solutions, such as intrusion detection systems (IDS), with targeted policy interventions. By combining technological and regulatory measures, we create a multilevel defense capable of addressing both direct threats and indirect negative externalities. We emphasize that the synergy between AI-driven solutions and policy interventions is essential for neutralizing cyber threats and mitigating their negative impact on the digital economy. Finally, we underscore the need for continuous adaptation of these strategies, especially in response to the rapid advancement of autonomous AI-driven attacks, to ensure the creation of secure and resilient digital ecosystems.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1697-1705"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10835143","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging AI to Compromise IoT Device Privacy by Exploiting Hardware Imperfections","authors":"Mirza Athar Baig;Asif Iqbal;Muhammad Naveed Aman;Biplab Sikdar","doi":"10.1109/TAI.2025.3526139","DOIUrl":"https://doi.org/10.1109/TAI.2025.3526139","url":null,"abstract":"The constrained design, remote deployment, and sensitive data generated by Internet of Things (IoT) devices make them susceptible to various cyberattacks. One such attack is profiling IoT devices by tracking their packet transmissions. While existing methods mitigate these attacks using pseudonymous identities, we propose a novel attack strategy that exploits the physical layer characteristics of IoT devices. Specifically, we demonstrate how an attacker can leverage features extracted from device transmissions to identify packets originating from the same device. Once identified, the attacker can isolate the device's signals and potentially determine its physical location. This attack exploits the fact that microcontroller clock variations exist across devices, even within the same model line. By extracting transmission features and training machine learning (ML) models, we accurately identify the originating device of the packets. This study reveals inherent privacy vulnerabilities in IoT systems due to hardware imperfections that are beyond user control. These limitations have profound implications for the design of security frameworks in emerging ubiquitous sensing environments. Our experiments demonstrate that the proposed attack achieves 99% accuracy in real-world settings and can bypass privacy measures implemented at higher protocol layers. 
This work highlights the urgent need for privacy protection strategies across multiple layers of the IoT protocol stack.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 6","pages":"1561-1574"},"PeriodicalIF":0.0,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144196568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}