{"title":"Editorial Emerging Horizons: The Rise of Large Language Models and Cross-Modal Generative AI","authors":"Guang Yang;Jing Zhang;Giorgos Papanastasiou;Ge Wang;Dacheng Tao","doi":"10.1109/TBDATA.2025.3537217","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3537217","url":null,"abstract":"","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"896-897"},"PeriodicalIF":7.5,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11003991","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task","authors":"Zihao Wu;Lu Zhang;Chao Cao;Xiaowei Yu;Zhengliang Liu;Lin Zhao;Yiwei Li;Haixing Dai;Chong Ma;Gang Li;Wei Liu;Quanzheng Li;Dinggang Shen;Xiang Li;Dajiang Zhu;Tianming Liu","doi":"10.1109/TBDATA.2025.3536928","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536928","url":null,"abstract":"Recently, ChatGPT and GPT-4 have emerged and gained immense global attention due to their unparalleled performance in language processing. Despite demonstrating impressive capability in various open-domain tasks, their adequacy in highly specific fields like radiology remains untested. Radiology presents unique linguistic phenomena distinct from open-domain data due to its specificity and complexity. Assessing the performance of large language models (LLMs) in such specific domains is crucial not only for a thorough evaluation of their overall performance but also for providing valuable insights into future model design directions: whether model design should be generic or domain-specific. To this end, in this study, we evaluate the performance of ChatGPT/GPT-4 on a radiology natural language inference (NLI) task and compare it to other models fine-tuned specifically on task-related data samples. We also conduct a comprehensive investigation of ChatGPT/GPT-4's reasoning ability by introducing varying levels of inference difficulty. Our results show that 1) ChatGPT and GPT-4 outperform other LLMs in the radiology NLI task and 2) specifically fine-tuned BERT-based models require significant amounts of data samples to achieve performance comparable to ChatGPT/GPT-4. These findings not only demonstrate the feasibility and promise of constructing a generic model capable of addressing various tasks across different domains, but also highlight several key factors crucial for developing a unified model, particularly in a medical context, paving the way for future artificial general intelligence (AGI) systems. We release our code and data to the research community.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1027-1041"},"PeriodicalIF":7.5,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large Language Model-Informed ECG Dual Attention Network for Heart Failure Risk Prediction","authors":"Chen Chen;Lei Li;Marcel Beetz;Abhirup Banerjee;Ramneek Gupta;Vicente Grau","doi":"10.1109/TBDATA.2025.3536922","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536922","url":null,"abstract":"Heart failure (HF) poses a significant public health challenge, with a rising global mortality rate. Early detection and prevention of HF could significantly reduce its impact. We introduce a novel methodology for predicting HF risk using 12-lead electrocardiograms (ECGs). We present a novel, lightweight dual attention ECG network designed to capture complex ECG features essential for early HF risk prediction, despite the notable imbalance between low- and high-risk groups. This network incorporates a cross-lead attention module and 12 lead-specific temporal attention modules, focusing on cross-lead interactions and each lead's local dynamics. To further alleviate model overfitting, we leverage a large language model (LLM) with a public ECG-Report dataset for pretraining on an ECG-Report alignment task. The network is then fine-tuned for HF risk prediction using two specific cohorts from the U.K. Biobank study, focusing on patients with hypertension (UKB-HYP) and those who have had a myocardial infarction (UKB-MI). The results reveal that LLM-informed pretraining substantially enhances HF risk prediction in these cohorts. The dual attention design improves not only interpretability but also predictive accuracy, outperforming existing competitive methods with C-index scores of 0.6349 for UKB-HYP and 0.5805 for UKB-MI. This demonstrates our method's potential in advancing HF risk assessment with complex clinical ECG data.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"948-960"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10858425","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of Transfer Learning for Affordable In-Context Fake Review Generation","authors":"Luis Ibañez-Lissen;Lorena González-Manzano;José M. de Fuentes;Manuel Goyanes","doi":"10.1109/TBDATA.2025.3536927","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536927","url":null,"abstract":"Fake content is a noteworthy threat that is managed by assorted means. This is a serious problem for online shopping platforms, whose products can be affected by negative or positive reviews. Artificial intelligence is commonly applied to fake review generation, with transfer learning being a promising approach to reduce training requirements. However, the feasibility of generating in-context fake reviews using transfer learning has not yet been explored. This paper analyses the suitability of two transformers (T5 and BART) for generating realistic in-context fake reviews. Results show that 1) the diversity of generated reviews is comparable to existing works; 2) human-based detection is close to random; 3) only reviews generated with one of the two transformers can be detected, with 38% precision; and 4) 1 hour of training and 8,000 real reviews are needed to produce realistic fake reviews.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"976-987"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10858443","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Multi-Relational Graph Representation Learning for Large-Scale Prediction of Drug-Drug Interactions","authors":"Mengying Jiang;Guizhong Liu;Yuanchao Su;Weiqiang Jin;Biao Zhao","doi":"10.1109/TBDATA.2025.3536924","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536924","url":null,"abstract":"Most existing methods for predicting drug-drug interactions (DDI) predominantly concentrate on capturing the explicit relationships among drugs, overlooking the valuable implicit correlations present between drug pairs (DPs), which leads to weak predictions. To address this issue, this paper introduces a hierarchical multi-relational graph representation learning (HMGRL) approach. Within the framework of HMGRL, we leverage a wealth of drug-related heterogeneous data sources to construct heterogeneous graphs, where nodes represent drugs and edges denote various explicit associations. The relational graph convolutional network (RGCN) is employed to capture diverse explicit relationships between drugs from these heterogeneous graphs. Additionally, a multi-view differentiable spectral clustering (MVDSC) module is developed to capture multiple valuable implicit correlations between DPs. Within the MVDSC, we utilize multiple DP features to construct graphs, where nodes represent DPs and edges denote different implicit correlations. Subsequently, multiple DP representations are generated through graph cutting, each emphasizing distinct implicit correlations. The graph-cutting strategy enables our HMGRL to identify strongly connected communities within graphs, thereby reducing the fusion of irrelevant features. By combining every representation view of a DP, we create high-level DP representations for predicting DDIs. Two real-world datasets spanning three distinct tasks are adopted to evaluate our HMGRL. Experimental results indicate that HMGRL outperforms several state-of-the-art methods.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"961-975"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TinyLVLM-eHub: Towards Comprehensive and Efficient Evaluation for Large Vision-Language Models","authors":"Wenqi Shao;Meng Lei;Yutao Hu;Peng Gao;Peng Xu;Kaipeng Zhang;Fanqing Meng;Siyuan Huang;Hongsheng Li;Yu Qiao;Ping Luo","doi":"10.1109/TBDATA.2025.3536930","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536930","url":null,"abstract":"Large Vision-Language Models (LVLMs) have made significant strides in various multimodal tasks. Notably, GPT4V, Claude, Gemini, and others showcase exceptional multimodal capabilities, marked by profound comprehension and reasoning skills. This study introduces a comprehensive and efficient evaluation framework, TinyLVLM-eHub, to assess LVLMs’ performance, including proprietary models. TinyLVLM-eHub covers six key multimodal capabilities: visual perception, knowledge acquisition, reasoning, commonsense understanding, object hallucination, and embodied intelligence. The benchmark, utilizing 2.1K image-text pairs, provides a user-friendly and accessible platform for LVLM evaluation. The evaluation employs the ChatGPT Ensemble Evaluation (CEE) method, which improves alignment with human evaluation compared to word-matching approaches. Results reveal that closed-source API models like GPT4V and GeminiPro-V excel in most capabilities compared to previous open-source LVLMs, though they show some vulnerability in object hallucination. This evaluation underscores areas for LVLM improvement in real-world applications and serves as a foundational assessment for future multimodal advancements.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"933-947"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CHEAT: A Large-Scale Dataset for Detecting CHatGPT-writtEn AbsTracts","authors":"Peipeng Yu;Jiahan Chen;Xuan Feng;Zhihua Xia","doi":"10.1109/TBDATA.2025.3536929","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536929","url":null,"abstract":"The powerful ability of ChatGPT has caused widespread concern in the academic community. Malicious users could synthesize dummy academic content through ChatGPT, which is extremely harmful to academic rigor and originality. The need to develop ChatGPT-written content detection algorithms calls for large-scale datasets. In this paper, we first investigate the possible negative impact of ChatGPT on academia, and present a large-scale CHatGPT-writtEn AbsTract dataset (CHEAT) to support the development of detection algorithms. In particular, the ChatGPT-written abstract dataset contains 35,304 synthetic abstracts, with <inline-formula><tex-math>$Generation$</tex-math></inline-formula>, <inline-formula><tex-math>$Polish$</tex-math></inline-formula>, and <inline-formula><tex-math>$Fusion$</tex-math></inline-formula> as prominent representatives. Based on these data, we perform a thorough analysis of existing text synthesis detection algorithms. We show that ChatGPT-written abstracts are detectable with well-trained detectors, while detection difficulty increases as more human guidance is involved.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"898-906"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"To Write or Not to Write as a Machine? That’s the Question","authors":"Robiert Sepúlveda-Torres;Iván Martínez-Murillo;Estela Saquete;Elena Lloret;Manuel Palomar","doi":"10.1109/TBDATA.2025.3536938","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536938","url":null,"abstract":"Considering the potential of tools such as ChatGPT or Gemini to generate texts much as a human would, having reliable detectors of AI-generated content (AIGC) is vital to combat the misuse and the surrounding negative consequences of those tools. Most research on AIGC detection has focused on the English language, often overlooking other languages that also have tools capable of generating human-like texts, as is the case for Spanish. This paper proposes a novel multilingual and multi-task approach for detecting machine- versus human-generated text. The first task classifies whether a text is written by a machine or by a human, which is the research objective of this paper. The second task detects the language of the text. To evaluate our approach, this study adopts the scope of the AuTexTification shared task, and we also collect an additional dataset in Spanish. The experiments carried out in Spanish and English show that our approach is highly competitive with the state of the art and generalizes better, being able to detect AI-generated text in multiple domains.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1042-1053"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10858399","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Terrain Scene Generation Using a Lightweight Vector Quantized Generative Adversarial Network","authors":"Yan Wang;Huiyu Zhou;Xinghui Dong","doi":"10.1109/TBDATA.2025.3536926","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536926","url":null,"abstract":"Natural terrain scene images play important roles in geographical research and applications. However, it is challenging to collect a large set of terrain scene images. Recently, great progress has been made in image generation. Although impressive results can be achieved, the efficiency of state-of-the-art methods, e.g., the Vector Quantized Generative Adversarial Network (VQGAN), remains unsatisfactory. The VQGAN faces two issues: high space complexity and heavy computational demand. To efficiently fulfill the terrain scene generation task, we first collect a Natural Terrain Scene Data Set (NTSD), which contains 36,672 images divided into 38 classes. We then propose a Lightweight VQGAN (Lit-VQGAN), which uses fewer parameters and has lower computational complexity than the VQGAN. A lightweight super-resolution network is further adopted to quickly derive a high-resolution image from the image that the Lit-VQGAN generates. The Lit-VQGAN can be trained and tested on the NTSD. To our knowledge, neither the NTSD nor the Lit-VQGAN has been exploited before. Experimental results show that the Lit-VQGAN is more efficient and effective than the VQGAN for the image generation task. We attribute these promising results to the lightweight yet effective networks we design.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"988-1000"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adapt Anything: Tailor Any Image Classifier Across Domains and Categories Using Text-to-Image Diffusion Models","authors":"Weijie Chen;Haoyu Wang;Shicai Yang;Lei Zhang;Wei Wei;Yanning Zhang;Luojun Lin;Di Xie;Yueting Zhuang","doi":"10.1109/TBDATA.2025.3536933","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3536933","url":null,"abstract":"We study a novel problem in this paper: whether a modern text-to-image diffusion model can tailor any image classifier across domains and categories. Existing domain adaptation works exploit both source and target data for domain alignment so as to transfer the knowledge from the labeled source data to the unlabeled target data. However, with the development of text-to-image diffusion models, we ask whether high-fidelity synthetic data can serve as a surrogate for source data in the real world. In this way, we do not need to collect and annotate the source data for each image classification task in a one-for-one manner. Instead, we utilize only one off-the-shelf text-to-image model to synthesize images with labels derived from text prompts, and then leverage them as a bridge to transfer knowledge from the task-agnostic text-to-image generator to the task-oriented image classifier via domain adaptation. Such a one-for-all adaptation paradigm allows us to adapt anything in the world using only one text-to-image generator and any unlabeled target data. Extensive experiments validate the feasibility of this idea, which, surprisingly, even surpasses state-of-the-art domain adaptation works that use source data collected and annotated in the real world.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1013-1026"},"PeriodicalIF":7.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}