{"title":"Simultaneous Segmentation and Classification of Esophageal Lesions Using Attention Gating Pyramid Vision Transformer","authors":"Peixuan Ge;Tao Yan;Pak Kin Wong;Zheng Li;In Neng Chan;Hon Ho Yu;Chon In Chan;Liang Yao;Ying Hu;Shan Gao","doi":"10.1109/TETCI.2024.3485704","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3485704","url":null,"abstract":"Automatic and accurate segmentation and classification of esophageal lesions are two essential tasks to assist endoscopists in Upper Gastrointestinal Endoscopy. However, there is no intelligent system that can diagnose more lesion types, handle multiple tasks simultaneously, and be more accurate in clinical work. Therefore, we present an innovative Multi-Task deep learning architecture named Attention Gating Pyramid Vision Transformer (AGPVT), which provides a solution for the accurate classification and precise segmentation of lesion types and regions simultaneously. The proposed AGPVT combines the benefits of cutting-edge deep learning model designs with Multi-Task Learning (MTL) in order to advance the field. Furthermore, a patch-wise multi-head attention gating method alongside a hybrid design MTL decoder is employed as the core driving architecture of the AGPVT. Comprehensive experiments are conducted on a multicenter dataset which contains esophageal cancer, Barrett's esophagus, esophageal protruded lesions, esophagitis, and normal esophagus. 
Experimental results show that the proposed AGPVT achieves a classification accuracy of 96.84%, an IoU score of 85.61%, and a Dice score of 90.75%, outperforming existing methods and demonstrating its effectiveness in this domain.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 2","pages":"1961-1975"},"PeriodicalIF":5.3,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generative Probabilistic Meta-Learning for Few-Shot Image Classification","authors":"Meijun Fu;Xiaomin Wang;Jun Wang;Zhang Yi","doi":"10.1109/TETCI.2024.3483255","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3483255","url":null,"abstract":"Meta-learning, a rapidly advancing area in computational intelligence, leverages prior knowledge from related tasks to facilitate the swift adaptation to new tasks with limited data. A critical challenge in meta-learning is the quantification of model uncertainty. In this paper, we propose a novel meta-learning method, Generative Probabilistic Meta-Learning (GPML), designed for few-shot image classification. GPML extends the Probably Approximately Correct-Bayes (PAC-Bayes) framework, initially formulated for single-task scenarios, to meta-learning across multiple tasks. This extension not only provides theoretical generalization guarantees for meta-learning but also effectively captures model uncertainty through variational parameters. To enhance the expressiveness of approximated posteriors in Bayesian inference, GPML incorporates implicit modeling, which defines probability distributions over task-specific parameters in a data-driven manner. This is achieved by designing a generative model structure that integrates task-dependent prior knowledge into the model inference process. We conduct extensive multidimensional performance evaluations on few-shot image classification tasks across various benchmarks, demonstrating that GPML outperforms existing state-of-the-art meta-learning methods. 
Additionally, ablation studies focusing on model components, the PAC-Bayes framework, and implicit modeling validate the performance improvements attributed to the proposed generative model structure, learning framework, and modeling approach.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 2","pages":"1947-1960"},"PeriodicalIF":5.3,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Accuracy-Privacy Trade-Off in Differentially Private Split Learning","authors":"Ngoc Duy Pham;Khoa T. Phan;Naveen Chilamkurti","doi":"10.1109/TETCI.2024.3485723","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3485723","url":null,"abstract":"Split learning (SL) aims to protect user data privacy by distributing deep models between the client and the server and keeping private data locally. Only processed or ‘smashed’ data can be transmitted from the clients to the server during the SL process. However, recently proposed model inversion attacks can recover original data from smashed data. To enhance privacy protection against such attacks, one strategy is to adopt differential privacy (DP), which involves safeguarding the smashed data at the expense of some accuracy loss. This paper presents the first investigation into the impact on accuracy when training multiple clients in SL with various privacy requirements. Subsequently, we propose an approach that reviews the DP noise distributions of other clients during client training to address the identified accuracy degradation. We also examine the application of DP to the local model of SL to gain insights into the trade-off between accuracy and privacy. Specifically, the findings reveal that introducing noise in the later local layers offers the most favorable balance between accuracy and privacy. Drawing from our insights in the shallower layers, we propose an approach to reduce the size of smashed data to minimize data leakage while maintaining higher accuracy, optimizing the accuracy-privacy trade-off. Additionally, smashed data of a smaller size reduces communication overhead on the client side, mitigating one of the notable drawbacks of SL. 
Intensive experiments on various datasets demonstrate that our proposed approaches provide an optimal trade-off for incorporating DP into SL, ultimately enhancing the training accuracy for multi-client SL with varying privacy requirements.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 1","pages":"988-1000"},"PeriodicalIF":5.3,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Speaker Diarization in Distributed IoT Networks Using Federated Learning","authors":"Amit Kumar Bhuyan;Hrishikesh Dutta;Subir Biswas","doi":"10.1109/TETCI.2024.3482855","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3482855","url":null,"abstract":"This paper presents a computationally efficient and distributed speaker diarization framework for networked IoT-style audio devices. The work proposes a Federated Learning model which can identify the participants in a conversation without the requirement of a large audio database for training. An unsupervised online update mechanism is proposed for the Federated Learning model which depends on cosine similarity of speaker embeddings. Moreover, the proposed diarization system solves the problem of speaker change detection via unsupervised segmentation techniques using Hotelling's t-squared Statistic and Bayesian Information Criterion. In this new approach, speaker change detection is biased around detected quasi-silences, which reduces the severity of the trade-off between the missed detection and false detection rates. Additionally, the computational overhead due to frame-by-frame identification of speakers is reduced via unsupervised clustering of speech segments. The results demonstrate the effectiveness of the proposed training method in the presence of non-IID speech data. They also show a considerable improvement in the reduction of false and missed detection at the segmentation stage, while reducing the computational overhead. 
Improved accuracy and reduced computational cost make the mechanism suitable for real-time speaker diarization across a distributed IoT audio network.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 2","pages":"1934-1946"},"PeriodicalIF":5.3,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Segmentation Method of Road Surface Covering Objects Based on CBAM UNET++","authors":"Yang Sen;Wang Zhenmin;Song Wenlong;Yang Changqun","doi":"10.1109/TETCI.2024.3462854","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3462854","url":null,"abstract":"Dangerous road surface covering objects, such as wet or slippery patches, ice, and snow, directly affect driving safety. Therefore, the detection and visualization of road surface covering objects' status under complex weather and road conditions are of great significance to the safety of human driving and unmanned driving. However, the complex road conditions (vehicles and pedestrians blocking the road surface, the area of the measured coverage is small, and the ambient light changes drastically) limit the accuracy of road surface covering objects' state detection in the natural environment. Given the above problems, this paper reconstructs the image preprocessing process in road ice and snow cover segmentation by introducing background extraction before image segmentation, and then proposes a road surface covering objects segmentation method based on Convolutional Block Attention Module UNet++ (CBAM UNet++). First, through the performance comparison of different background extraction algorithms, the Content-adaptive Resizing Framework (CARF) background extraction algorithm is used to eliminate the interference of vehicles, pedestrians and other objects in complex road conditions. Then, the CBAM UNet++ model is established to segment the four types of road surface covering objects in the outfield to improve detection accuracy under conditions of small area coverage objects and severe illumination changes. 
Experimental results indicate that, after introducing background extraction, the segmentation accuracy under different lighting conditions improves by 5.6% to 17.7%. Compared with traditional methods for segmenting objects on road surfaces, the CBAM UNet++ method demonstrates an average segmentation accuracy improvement of at least 6.5% under six different lighting conditions.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 2","pages":"1924-1933"},"PeriodicalIF":5.3,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Attention-Based Dynamic Multi-Graph Spatial-Temporal Graph Neural Network Model for Traffic Prediction","authors":"Chunyan Diao;Dafang Zhang;Wei Liang;Man Jiang;Kuanching Li","doi":"10.1109/TETCI.2024.3462513","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3462513","url":null,"abstract":"Traffic flow prediction is a non-negligible part of intelligent transportation and mobility. Unfortunately, the unique non-linearity and complex spatial-temporal correlation of transport flow data pose considerable challenges in prediction. The dynamic interaction of multiple spatial relations greatly influences traffic flow prediction. However, the existing spatial-temporal prediction algorithms are based on graph convolution to capture global or heterogeneous relationships, and simpler graph convolution models cannot accurately capture complex dynamic spatial relationships. To address these issues, this study proposes an attention-based multi-graph dynamic spatial-temporal prediction model, ADMSTGCN, to capture a variety of dynamic interaction relationships in traffic flow. First, we use a distance graph to explore the relationships between adjacent distances and use a semantic graph to mine spatial relationships between nodes that are far apart but have similar relationships, then fuse these two graphs to obtain a fusion graph with multiple spatial interaction relationships. The correlations between different neighbors are then further learned through a dynamic multi-graph spatial-temporal learning module that aggregates the features of different neighbors through gated graph convolution and attention mechanisms to capture various dynamic and complex spatial-temporal interactions. 
Experimental evaluations show that the framework proposed outperforms existing methods with better results in the analysis performed with publicly available datasets and also demonstrates the importance of capturing multiple interactions of spatial-temporal relationships.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 2","pages":"1910-1923"},"PeriodicalIF":5.3,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepPCC: Learned Lossy Point Cloud Compression","authors":"Junzhe Zhang;Gexin Liu;Junteng Zhang;Dandan Ding;Zhan Ma","doi":"10.1109/TETCI.2024.3467192","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3467192","url":null,"abstract":"We propose DeepPCC, an end-to-end learning-based approach for the lossy compression of large-scale object point clouds. For both geometry and attribute components, we introduce the Multiscale Neighborhood Information Aggregation (NIA) mechanism, which applies resolution downscaling progressively (i.e., dyadic downsampling of geometry and average pooling of attribute) and combines sparse convolution and local self-attention at each resolution scale for effective feature representation. Under a simple autoencoder structure, scale-wise NIA blocks are stacked as the analysis and synthesis transform in the encoder-decoder pair to best characterize spatial neighbors for accurate approximation of geometry occupancy probability and attribute intensity. Experiments demonstrate that DeepPCC remarkably outperforms state-of-the-art rules-based MPEG G-PCC and learning-based solutions both quantitatively and qualitatively, providing strong evidence that DeepPCC is a promising solution for emerging AI-based PCC.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 2","pages":"1897-1909"},"PeriodicalIF":5.3,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event Causal Relation Extraction in Brain Connectomics: A Model Utilizing Weighted Joint Constrained Learning","authors":"Lianfang Ma;Jianhui Chen;Jiajin Huang;Yiyu Yao;Ning Zhong","doi":"10.1109/TETCI.2024.3462173","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3462173","url":null,"abstract":"Brain science research has entered the era of connectomics, characterized by a significant increase in published articles investigating brain structure and functional connections. Automatically and accurately extracting scientific evidence from these articles has become an urgent concern. Unlike early brain mechanism studies at the functional area level, brain connectomics studies feature more intricate experimental designs and yield complex findings. Traditional neuroimaging text mining techniques, operating at the term level, are insufficient for effectively extracting scientific evidence from brain connectomics articles. This paper addresses a key challenge in event-level neuroimaging text mining, i.e., event causal relation extraction in brain connectomics. We introduce a novel model named Brain Connectomics Event Relation Miner (BCERM), leveraging weighted joint constrained learning. By integrating a bidirectional long short-term memory (BiLSTM) network with a multi-layer perceptron (MLP), we develop a lightweight model for jointly extracting multiple event causal relations from brain connectomics articles. Given the scarcity of annotated brain connectomics corpora, we propose a weighted joint constrained learning framework. This framework integrates double consistency constraints, encompassing common sense and domain constraints, and combines them with adaptive weight learning to enhance the model's few-shot learning capability. 
Experimental evaluations on a real brain connectomics article dataset demonstrate that our method achieves an F-score of 70%, outperforming state-of-the-art event relation extraction methods in the low-resource environment.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 2","pages":"1885-1896"},"PeriodicalIF":5.3,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decentralized Triggering and Event-Based Integral Reinforcement Learning for Multiplayer Differential Game Systems","authors":"Chaoxu Mu;Ke Wang;Song Zhu;Guangbin Cai","doi":"10.1109/TETCI.2024.3372389","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3372389","url":null,"abstract":"Multiplayer differential games are typically characterized by multiple control loops, where communication resources are periodically transmitted and control policies are updated in a time-triggered manner. In this paper, two different event-triggered mechanisms are proposed for a class of multiplayer nonzero-sum differential game systems. Specifically, by defining a global sampled state, a centralized triggering rule is devised to manage state sampling and control updating in a synchronized manner. By considering each player's preferences, the decentralized triggering rule is devised in which a local event generator produces the triggering sequence independently. On the other hand, with experience replay and integral reinforcement learning, an event-based adaptive learning scheme is developed, which is implemented by critic neural networks and only requires partial knowledge of system dynamics. The theoretical results indicate that both triggering mechanisms can guarantee the asymptotic stability and weight convergence. 
Finally, simulation results on a three-player numerical system and a two-player supersonic transport system substantiate the effectiveness of the two learning-based triggering mechanisms.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"8 6","pages":"3727-3741"},"PeriodicalIF":5.3,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142691785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual-Scale Attributed Graph Transformer for Extracting Spatial-Temporal Features With Applications in Quality Index Prediction","authors":"Kesheng Zhang;Wen Yu;Tianyou Chai","doi":"10.1109/TETCI.2024.3462486","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3462486","url":null,"abstract":"This paper presents a novel deep learning architecture, the Dual-scale Attribute Graph Transformer (DAGT), for extracting spatial-temporal features from attributed graph data. DAGT addresses the challenge of inconsistent sampling periods in industrial data streams by utilizing two key modules: 1) Dual-Scale Spatial-temporal Graph Convolution Network (DSGCN): This module captures both spatial and temporal information within attributed graphs, enabling effective feature extraction for tasks like quality index prediction. 2) Spatial-temporal Graph Attention Block (SGAB): This module employs an attention mechanism to selectively focus on crucial areas of the graph sequence. By assigning higher weights to regions with significant spatial-temporal features, SGAB refines the feature representation. The contributions of DAGT lie in the construction of a dual-scale adjacency matrix for efficient temporal and spatial dimensionality reduction and the design of a graph pooling module via spatial clustering. These innovations enhance the model's ability to learn from attributed graph sequences. 
The proposed method for quality index prediction is validated using real-world industrial data of the mineral processing process and various comparative experiments.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"9 2","pages":"1873-1884"},"PeriodicalIF":5.3,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}