{"title":"A Multi-Modal Assessment Framework for Comparison of Specialized Deep Learning and General-Purpose Large Language Models","authors":"Mohammad Nadeem;Shahab Saquib Sohail;Dag Øivind Madsen;Ahmed Ibrahim Alzahrani;Javier Del Ser;Khan Muhammad","doi":"10.1109/TBDATA.2025.3536937","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536937","url":null,"abstract":"Recent years have witnessed tremendous advancements in Al tools (e.g., ChatGPT, GPT-4, and Bard), driven by the growing power, reasoning, and efficiency of Large Language Models (LLMs). LLMs have been shown to excel in tasks ranging from poem writing and coding to essay generation and puzzle solving. Despite their proficiency in general queries, specialized tasks such as metaphor understanding and fake news detection often require finely tuned models, posing a comparison challenge with specialized Deep Learning (DL). We propose an assessment framework to compare task-specific intelligence with general-purpose LLMs on suicide and depression tendency identification. For this purpose, we trained two DL models on a suicide and depression detection dataset, followed by testing their performance on a test set. Afterward, the same test dataset is used to evaluate the performance of four LLMs (GPT-3.5, GPT-4, Google Bard, and MS Bing) using four classification metrics. The BERT-based DL model performed the best among all, with a testing accuracy of 94.61%, while GPT-4 was the runner-up with accuracy 92.5%. Results demonstrate that LLMs do not outperform the specialized DL models but are able to achieve comparable performance, making them a decent option for downstream tasks without specialized training. 
However, LLMs outperformed specialized models on the reduced dataset.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1001-1012"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SRGTNet: Subregion-Guided Transformer Hash Network for Fine-Grained Image Retrieval","authors":"Hongchun Lu;Songlin He;Xue Li;Min Han;Chase Wu","doi":"10.1109/TBDATA.2025.3533916","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3533916","url":null,"abstract":"Fine-grained image retrieval (FGIR) is a crucial task in computer vision, with broad applications in areas such as biodiversity monitoring, e-commerce, and medical diagnostics. However, capturing discriminative feature information to generate binary codes is difficult because of high intraclass variance and low interclass variance. To address this challenge, we (i) build a novel and highly reliable fine-grained deep hash learning framework for more accurate retrieval of fine-grained images. (ii) We propose a part significant region erasure method that forces the network to generate compact binary codes. (iii) We introduce a CNN-guided Transformer structure for use in fine-grained retrieval tasks to capture fine-grained images effectively in contextual feature relationships to mine more discriminative regional features. (iv) A multistage mixture loss is designed to optimize network training and enhance feature representation. Experiments were conducted on three publicly available fine-grained datasets. The results show that our method effectively improves the performance of fine-grained image retrieval.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2388-2400"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144989930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anomaly Detection in Multi-Level Model Space","authors":"Ao Chen;Xiren Zhou;Yizhan Fan;Huanhuan Chen","doi":"10.1109/TBDATA.2025.3534625","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3534625","url":null,"abstract":"Anomaly detection (AD) is gaining prominence, especially in situations with limited labeled data or unknown anomalies, demanding an efficient approach with minimal reliance on labeled data or prior knowledge. Building upon the framework of Learning in the Model Space (LMS), this paper proposes conducting AD through Learning in the Multi-Level Model Spaces (MLMS). LMS transforms the data from the data space to the model space by representing each data instance with a fitted model. In MLMS, to fully capture the dynamic characteristics within the data, multi-level details of the original data instance are decomposed. These details are individually fitted, resulting in a set of fitted models that capture the multi-level dynamic characteristics of the original instance. Representing each data instance with a set of fitted models, rather than a single one, transforms it from the data space into the multi-level model spaces. The pairwise difference measurement between model sets is introduced, fully considering the distance between fitted models and the intra-class aggregation of similar models at each level of detail. Subsequently, effective AD can be implemented in the multi-level model spaces, with or without sufficient multi-class labeled data. 
Experiments on multiple AD datasets demonstrate the effectiveness of the proposed method.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2376-2387"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144990026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MultiTec: A Data-Driven Multimodal Short Video Detection Framework for Healthcare Misinformation on TikTok","authors":"Lanyu Shang;Yang Zhang;Yawen Deng;Dong Wang","doi":"10.1109/TBDATA.2025.3533919","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3533919","url":null,"abstract":"With the prevalence of social media and short video sharing platforms (e.g., TikTok, YouTube Shorts), the proliferation of healthcare misinformation has become a widespread and concerning issue that threatens public health and undermines trust in mass media. This paper focuses on an important problem of detecting multimodal healthcare misinformation in short videos on TikTok. Our objective is to accurately identify misleading healthcare information that is jointly conveyed by the visual, audio, and textual content within the TikTok short videos. Three critical challenges exist in solving our problem: i) how to effectively extract information from distractive and manipulated visual content in short videos? ii) How to efficiently identify the interrelation of the heterogeneous visual and speech content in short videos? iii) How to accurately capture the complex dependency of the densely connected sequential content in short videos? To address the above challenges, we develop <italic>MultiTec</i>, a multimodal detector that explicitly explores the audio and visual content in short videos to investigate both the sequential relation of video elements and their inter-modality dependencies to jointly detect misinformation in healthcare videos on TikTok. To the best of our knowledge, MultiTec is the first modality-aware dual-attentive short video detection model for multimodal healthcare misinformation on TikTok. We evaluate MultiTec on two real-world healthcare video datasets collected from TikTok. 
Evaluation results show that MultiTec achieves substantial performance gains compared to state-of-the-art baselines in accurately detecting misleading healthcare short videos.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2471-2488"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10854802","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144934330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CTDI: CNN-Transformer-Based Spatial-Temporal Missing Air Pollution Data Imputation","authors":"Yangwen Yu;Victor O. K. Li;Jacqueline C. K. Lam;Kelvin Chan;Qi Zhang","doi":"10.1109/TBDATA.2025.3533882","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3533882","url":null,"abstract":"Accurate and comprehensive air pollution data is essential for understanding and addressing environmental challenges. Missing data can impair accurate analysis and decision-making. This study presents a novel approach, named CNN-Transformer-based Spatial-Temporal Data Imputation (CTDI), for imputing missing air pollution data. Data pre-processing incorporates observed air pollution data and related urban data to produce 24-hour period tensors as input samples. 1-by-1 CNN layers capture the interaction between different types of input data. Deep learning transformer architecture is employed in a spatial-temporal (S-T) transformer module to capture long-range dependencies and extract complex relationships in both spatial and temporal dimensions. Hong Kong air pollution data is statistically analyzed and used to evaluate CTDI in its recovery of generated and actual patterns of missing data. Experimental results show that CTDI consistently outperforms existing imputation methods across all evaluated scenarios, including cases with higher rates of missing data, thereby demonstrating its robustness and effectiveness in enhancing air quality monitoring. 
Additionally, ablation experiments reveal that each component significantly contributes to the model's performance, with the temporal transformer proving particularly crucial under varying rates of missing data.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2443-2456"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144934506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing the Transferability of Adversarial Examples With Random Diversity Ensemble and Variance Reduction Augmentation","authors":"Sensen Zhang;Haibo Hong;Mande Xie","doi":"10.1109/TBDATA.2025.3533892","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3533892","url":null,"abstract":"Currently, deep neural networks (DNNs) are susceptible to adversarial attacks, particularly when the network's structure and parameters are known, while most of the existing attacks do not perform satisfactorily in the presence of black-box settings. In this context, model augmentation is considered to be effective to improve the success rates of black-box attacks on adversarial examples. However, the existing model augmentation methods tend to rely on a single transformation, which limits the diversity of augmented model collections and thus affects the transferability of adversarial examples. In this paper, we first propose the random diversity ensemble method (RDE-MI-FGSM) to effectively enhance the diversity of the augmented model collection, thereby improving the transferability of the generated adversarial examples. Afterwards, we put forward the random diversity variance ensemble method (RDE-VRA-MI-FGSM), which adopts variance reduction augmentation (VRA) to improve the gradient variance of the enhanced model set and avoid falling into a poor local optimum, so as to further improve the transferability of adversarial examples. Furthermore, experimental results demonstrate that our approaches are compatible with many existing transfer-based attacks and can effectively improve the transferability of gradient-based adversarial attacks on the ImageNet dataset. Also, our proposals have achieved higher attack success rates even if the target model adopts advanced defenses. 
Specifically, we have achieved an average attack success rate of 91.4% on the defense model, which is higher than other baseline approaches.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2417-2430"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144990138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Learning-Based Community-Preserving Graph Generation","authors":"Sheng Xiang;Chenhao Xu;Dawei Cheng;Ying Zhang","doi":"10.1109/TBDATA.2025.3533898","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3533898","url":null,"abstract":"Graph generation plays an essential role in understanding the formation of complex network structures across various fields, such as biological and social networks. Recent studies have shifted towards employing deep learning methods to grasp the topology of graphs. Yet, most current graph generators fail to adequately capture the community structure, which stands out as a critical and distinctive aspect of graphs. Additionally, these generators are generally limited to smaller graphs because of their inefficiencies and scaling challenges. This paper introduces the Community-Preserving Graph Adversarial Network (CPGAN), designed to effectively simulate graphs. CPGAN leverages graph convolution networks within its encoder and maintains shared parameters during generation to encapsulate community structure data and ensure permutation invariance. We also present the Scalable Community-Preserving Graph Attention Network (SCPGAN), aimed at enhancing the scalability of our model. SCPGAN considerably cuts down on inference and training durations, as well as GPU memory usage, through the use of an ego-graph sampling approach and a short-pipeline autoencoder framework. Tests conducted on six real-world graph datasets reveal that CPGAN manages a beneficial balance between efficiency and simulation quality when compared to leading-edge baselines. 
Moreover, SCPGAN marks substantial strides in model efficiency and scalability, successfully increasing the size of generated graphs to the 10 million node level while maintaining competitive quality, on par with other advanced learning models.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2457-2470"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144934444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Exchange for the Metaverse With Accountable Decentralized TTPs and Incentive Mechanisms","authors":"Liang Zhang;Xingyu Wu;Yuhang Ma;Haibin Kan","doi":"10.1109/TBDATA.2025.3533924","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3533924","url":null,"abstract":"As a global virtual environment, the metaverse poses various challenges regarding data storage, sharing, interoperability, and privacy preservation. Typically, a trusted third party (TTP) is considered necessary in these scenarios. However, relying on a single TTP may introduce biases, compromise privacy, or lead to single-point-of-failure problem. To address these challenges and enable secure data exchange in the metaverse, we propose a system based on decentralized TTPs and the Ethereum blockchain. First, we use the threshold ElGamal cryptosystem to create the decentralized TTPs, employing verifiable secret sharing (VSS) to force owners to share data honestly. Second, we leverage the Ethereum blockchain to serve as the public communication channel, automatic verification machine, and smart contract engine. Third, we apply discrete logarithm equality (DLEQ) algorithms to generate non-interactive zero knowledge (NIZK) proofs when encrypted data is uploaded to the blockchain. Fourth, we present an incentive mechanism to benefit data owners and TTPs from data-sharing activities, as well as a penalty policy if malicious behavior is detected. Consequently, we construct a data exchange framework for the metaverse, in which all involved entities are accountable. 
Finally, we perform comprehensive experiments to demonstrate the feasibility and analyze the properties of the proposed system.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2431-2442"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144990069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multiplex Hypergraph Attribute-Based Graph Collaborative Filtering for Cold-Start POI Recommendation","authors":"Simon Nandwa Anjiri;Derui Ding;Yan Song;Ying Sun","doi":"10.1109/TBDATA.2025.3533908","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3533908","url":null,"abstract":"Within the scope of location-based services and personalized recommendations, the challenges of recommending new and unvisited points of interest (POIs) to mobile users are compounded by the sparsity of check-in data. Traditional recommendation models often overlook user and POI attributes, which exacerbates data sparsity and cold-start problems. To address this issue, a novel multiplex hypergraph attribute-based graph collaborative filtering is proposed for POI recommendation to create a robust recommendation system capable of handling sparse data and cold-start scenarios. Specifically, a multiplex network hypergraph is first constructed to capture complex relationships between users, POIs, and attributes based on the similarities of attributes, visit frequencies, and preferences. Then, an adaptive variational graph auto-encoder adversarial network is developed to accurately infer the users’/POIs’ preference embeddings from their attribute distributions, which reflect complex attribute dependencies and latent structures within the data. Moreover, a dual graph neural network variant based on both Graphsage K-nearest neighbor networks and gated recurrent units are created to effectively capture various attributes of different modalities in a neighborhood, including temporal dependencies in user preferences and spatial attributes of POIs. 
Finally, experiments conducted on Foursquare and Yelp datasets reveal the superiority and robustness of the developed model compared to some typical state-of-the-art approaches and adequately illustrate the effectiveness of the issues with cold-start users and POIs.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2401-2416"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144990139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Twin Data Management: A Comprehensive Review","authors":"Ezekiel B. Ouedraogo;Ammar Hawbani;Xingfu Wang;Zhi Liu;Liang Zhao;Mohammed A. A. Al-qaness;Saeed Hamood Alsamhi","doi":"10.1109/TBDATA.2025.3533891","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3533891","url":null,"abstract":"Digital Twins are virtual representations of physical assets and systems that rely on effective Data Management to integrate, process, and analyze diverse data sources. This article comprehensively examines Data Management challenges, architectures, techniques, and applications in the context of Digital Twins. It explores key issues such as data heterogeneity, quality assurance, scalability, security, and interoperability. The paper outlines architectural approaches like centralized, distributed, cloud-based, and blockchain solutions and Data Management techniques for modeling, integration, fusion, quality management, and visualization. Domain-specific considerations across manufacturing, smart cities, healthcare, and other sectors are discussed. Finally, open research challenges related to standards, real-time data processing, intelligent Data Management, and ethical aspects are highlighted. By synthesizing the state-of-the-art, this review serves as a valuable reference for developing robust Data Management strategies that enable Digital Twin deployments.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2224-2243"},"PeriodicalIF":5.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144990260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}