{"title":"Unsupervised Learning of Unbiased Visual Representations","authors":"Carlo Alberto Barbano;Enzo Tartaglione;Marco Grangetto","doi":"10.1109/TAI.2024.3514554","DOIUrl":"https://doi.org/10.1109/TAI.2024.3514554","url":null,"abstract":"Deep neural networks often struggle to learn robust representations in the presence of dataset biases, leading to suboptimal generalization on unbiased datasets. This limitation arises because the models heavily depend on peripheral and confounding factors inadvertently acquired during training. Existing approaches to this problem typically involve explicit supervision of bias attributes or reliance on prior knowledge about the biases. In this study, we address the challenging scenario where no explicit bias annotations are available and there is no prior knowledge about the nature of the bias. We present a fully unsupervised debiasing framework with three key steps: first, leveraging the inherent tendency to learn malignant biases to acquire a bias-capturing model; next, employing a pseudo-labeling process to obtain bias labels; and finally, applying cutting-edge supervised debiasing techniques to achieve an unbiased model. Additionally, we introduce a theoretical framework for evaluating model biasedness and conduct a detailed analysis of how biases impact neural network training. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of our method, showcasing state-of-the-art performance in various settings, occasionally surpassing fully supervised debiasing approaches.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 5","pages":"1171-1183"},"PeriodicalIF":0.0,"publicationDate":"2024-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CauseTerML: Causal Learning via Term Mining for Assessing Review Discrepancies","authors":"Wenjie Sun;Chengke Wu;Qinge Xiao;Junjie Jiang;Yuanjun Guo;Ying Bi;Xinyu Wu;Zhile Yang","doi":"10.1109/TAI.2024.3512500","DOIUrl":"https://doi.org/10.1109/TAI.2024.3512500","url":null,"abstract":"Innovation is a key driver of modern economic and technological development. Correct and equitable identification of innovation is essential for promoting market competitiveness and ensuring the optimal allocation of resources. Existing research on innovation evaluation mainly focuses on qualitative or quantitative evaluation of the results, while ignoring potential biases in the application process. This work investigates an unexplored issue in the field of innovation evaluation: does the technicality of an application's title affect the degree of attention it receives during the review process? The key lies in two aspects: how to evaluate the technicality of the title and how to quantify this effect. To achieve this goal, we combine term extraction schemes with causal inference techniques by modeling the fairness detection task in a causal diagram, and propose a novel framework called CauseTerML. The framework can be applied to fairness detection in a variety of application scenarios. Extensive experiments on a real-world patent dataset validate the effectiveness of CauseTerML.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 5","pages":"1156-1170"},"PeriodicalIF":0.0,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decoupling Dark Knowledge via Block-Wise Logit Distillation for Feature-Level Alignment","authors":"Chengting Yu;Fengzhao Zhang;Ruizhe Chen;Aili Wang;Zuozhu Liu;Shurun Tan;Er-Ping Li","doi":"10.1109/TAI.2024.3512498","DOIUrl":"https://doi.org/10.1109/TAI.2024.3512498","url":null,"abstract":"Knowledge distillation (KD), a learning paradigm in which a larger teacher network guides a smaller student network, transfers dark knowledge from the teacher to the student via logits or intermediate features, with the aim of producing a well-performing lightweight model. Notably, many subsequent feature-based KD methods outperformed the earliest logit-based KD method and successively produced numerous state-of-the-art distillation methods. Nevertheless, recent work has uncovered the potential of the logit-based method, bringing the simple KD form based on logits back into the limelight. Features or logits? The two implement KD from entirely distinct perspectives; therefore, choosing between them is not straightforward. This article provides a unified perspective of feature alignment to better understand their fundamental distinction. Inheriting the design philosophy and insights of feature-based and logit-based methods, we introduce a block-wise logit distillation framework that applies implicit logit-based feature alignment by gradually replacing the teacher's blocks to form intermediate stepping-stone models that bridge the gap between the student and the teacher. Our method obtains comparable or superior results to state-of-the-art distillation methods. This article demonstrates the great potential of combining logits and features, and we hope it will inspire future research to revisit KD from a higher vantage point.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 5","pages":"1143-1155"},"PeriodicalIF":0.0,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving String Stability in Cooperative Adaptive Cruise Control Through Multiagent Reinforcement Learning With Potential-Driven Motivation","authors":"Kun Jiang;Min Hua;Xu He;Lu Dong;Quan Zhou;Hongming Xu;Changyin Sun","doi":"10.1109/TAI.2024.3511513","DOIUrl":"https://doi.org/10.1109/TAI.2024.3511513","url":null,"abstract":"Cooperative adaptive cruise control (CACC) is regarded as a promising technology for achieving efficient and safe collaboration among connected and automated vehicles (CAVs) in a platoon, and multiagent reinforcement learning (MARL) methods are emerging as an effective approach to implementing the CACC technology. However, most MARL methods do not sufficiently tackle the prevalent string stability problem, even when integrating communication mechanisms to improve agents' understanding of CACC scenarios. This limitation arises because these methods typically learn communication mechanisms based solely on the information directly observable by the agents, neglecting potentially valuable information present in the environment. In this article, we propose a multiagent actor–critic with a potential-driven motivation (MAACPM) approach, which utilizes variational inference theory to infer the potential motivation representation space in the CACC task, providing a more favorable opportunity for adjusting driving behavior within the platoon. Furthermore, we quantify the specific impact of potential motivation on each vehicle by measuring the difference between policies with and without potential motivation. We then utilize this difference as a potential reward signal to incentivize the agent to grasp effective potential motivation. The proposed method was validated in two typical CACC scenarios, where we compared the performance of our MAACPM algorithm with other state-of-the-art MARL methods to demonstrate its effectiveness. Furthermore, we illustrate potential real-world applications of our method by comparing it with actual vehicle driving data.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 5","pages":"1114-1127"},"PeriodicalIF":0.0,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monocular 3-D Reconstruction of Blast Furnace Burden Surface Based on Cross-Domain Generative Self-Supervised Network","authors":"Zhipeng Chen;Xinyi Wang;Ling Shen;Jinshi Liu;Jianjun He;Jilin Zhu;Weihua Gui","doi":"10.1109/TAI.2024.3511515","DOIUrl":"https://doi.org/10.1109/TAI.2024.3511515","url":null,"abstract":"Accurate acquisition of the 3-D topography of the blast furnace (BF) burden surface is crucial for optimizing the ironmaking process. However, traditional methods struggle in the high temperature, high pressure, and dusty environment of the BF top, and even advanced industrial endoscopes only capture monocular images, limiting multiview stereoscopic reconstruction. To address these challenges, we propose a novel 3-D reconstruction framework featuring a virtual–real multiview endoscope array for capturing multiview images and a cross-domain point cloud generation self-supervised network (XGSN). The XGSN leverages a progressive multimodal self-attention mechanism and ray-tracing projection to compensate for the lack of 3-D labels, producing a high-fidelity 3-D point cloud. Experimental results show that the proposed method achieves a significant improvement in burden surface reconstruction accuracy, delivering high-quality 3-D mapping with enhanced real-time processing capabilities, demonstrating its potential for challenging industrial environments.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 5","pages":"1386-1400"},"PeriodicalIF":0.0,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Quantum Multimodal Neural Network Model for Sentiment Analysis on Quantum Circuits","authors":"Jin Zheng;Qing Gao;Daoyi Dong;Jinhu Lü;Yue Deng","doi":"10.1109/TAI.2024.3511514","DOIUrl":"https://doi.org/10.1109/TAI.2024.3511514","url":null,"abstract":"This article proposes a quantum multimodal neural network (QMNN) model that can be implemented on parameterized quantum circuits (PQCs), providing a novel avenue for processing multimodal data and performing advanced multimodal sentiment analysis tasks. The comprehensive QMNN model is structured into four fundamental blocks: multimodal data preprocessing, unimodal feature extraction, multimodal feature fusion, and optimization. Through these blocks, multimodal data are initially preprocessed and encoded into quantum states. Subsequently, visual and textual features are extracted from the quantum states and are then integrated to learn the interactions between different modalities. Finally, the model parameters are fine-tuned to optimize the sentiment analysis performance. Simulation results confirm that QMNN surpasses state-of-the-art baselines, using significantly lower input dimensions and substantially fewer parameters than classical models. Furthermore, the entanglement, integrity, robustness, and scalability of the model are analyzed in depth. Internally, the strong entanglement within the multimodal fusion block enhances interactions between textual and visual features, and the integrity of the model reflects the indispensable contribution of each component to the overall performance. Externally, robustness ensures the model operates stably under noisy conditions and incomplete inputs, and scalability enables it to efficiently adapt to varying architectural depths and widths. The above simulation results and performance analyses showcase the comprehensive strength of our proposed model.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 5","pages":"1128-1142"},"PeriodicalIF":0.0,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of An Intellectual Mechanism of a Novel Crop Recommendation System Using Improved Heuristic Algorithm-Based Attention and Cascaded Deep Learning Network","authors":"Yaganteeswarudu Akkem;Saroj Kumar Biswas","doi":"10.1109/TAI.2024.3508654","DOIUrl":"https://doi.org/10.1109/TAI.2024.3508654","url":null,"abstract":"This article introduces an innovative crop recommendation system that leverages an attention-based cascaded deep learning network (AACNet) optimized by an improved migration algorithm (IMA). The system is designed to address the inefficiencies of traditional crop recommendation methods by providing precise, real-time suggestions tailored to specific agricultural factors such as weather, soil type, and time. The AACNet employs recurrent neural networks (RNN) and gated recurrent units (GRU) to analyze time-sensitive agricultural factors, such as weather patterns and soil conditions, while the attention mechanism prioritizes the most significant features for accurate crop recommendations. The IMA optimizes the deep learning network, enhancing the system's accuracy, precision, recall, and execution time. Experimental results demonstrate that the proposed system outperforms traditional methods, marking a significant advancement in precision agriculture. The system's potential to revolutionize farming decision-making processes by optimizing resource allocation, reducing costs, and increasing crop yields underscores its importance in global agricultural challenges. This research represents a transformative step towards informed, efficient, and sustainable farming practices.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 5","pages":"1100-1113"},"PeriodicalIF":0.0,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ClusVPR: Efficient Visual Place Recognition With Clustering-Based Weighted Transformer","authors":"Yifan Xu;Pourya Shamsolmoali;Masoume Zareapoor;Jie Yang","doi":"10.1109/TAI.2024.3510479","DOIUrl":"https://doi.org/10.1109/TAI.2024.3510479","url":null,"abstract":"Visual place recognition (VPR) is a highly challenging task with a wide range of applications, including robot navigation and self-driving vehicles. Its difficulty stems from duplicate regions and insufficient attention to small objects in complex scenes, which result in recognition deviations. In this article, we present ClusVPR, a novel approach that tackles the specific issues of redundant information in duplicate regions and representations of small objects. Unlike existing methods that rely on convolutional neural networks (CNNs) for feature map generation, ClusVPR introduces a new paradigm called the clustering-based weighted transformer network (CWTNet). CWTNet harnesses clustering-based weighted feature maps and integrates global dependencies to effectively address visual deviations encountered in large-scale VPR problems. We also introduce the optimized-VLAD (OptLAD) layer, which significantly reduces the number of parameters and enhances model efficiency. This layer is specifically designed to aggregate the information obtained from scale-wise image patches. Additionally, our pyramid self-supervised strategy focuses on extracting representative and diverse features from scale-wise image patches rather than from entire images. This approach is essential for capturing the broader range of information required for robust VPR. Extensive experiments on four VPR datasets show that our model outperforms existing models while being less complex.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 4","pages":"1038-1049"},"PeriodicalIF":0.0,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143761486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}