{"title":"Higher-Order Smoothness Enhanced Graph Collaborative Filtering","authors":"Ling Huang;Zhi-Yuan Li;Zhen-Yu He;Yuefang Gao","doi":"10.1109/TBDATA.2024.3453758","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3453758","url":null,"abstract":"Graph Neural Networks (GNNs) based recommendations have shown significant performance improvement by explicitly modeling the user-item interactions as a bipartite graph. However, the existing GNNs-based recommendation methods suffer from the over-smoothing problem caused by utilizing the uniform distance of the reception field. To address this issue, we propose to explicitly incorporate the higher-order smoothness information into the node representation learning, and propose a new GNNs-based recommendation model named Higher-order Smoothness enhanced Graph Collaborative Filtering (HS-GCF). The proposed model is mainly composed of two parts, namely lower-order module and higher-order module. The lower-order module guarantees that the lower-order smoothness is well obtained by using the user-item interactions. The higher-order module uses the latent group assumption to restrict too much noise introduced by the uniform distance property, which we call the higher-order smoothness information. 
Experiments are conducted on three real-world public datasets, and the experimental results show the performance improvements compared with several state-of-the-art methods and verify the importance of explicitly incorporating the higher-order smoothness information into the node representation learning.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"10 6","pages":"731-741"},"PeriodicalIF":7.5,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixup Virtual Adversarial Training for Robust Vision Transformers","authors":"Weili Shi;Sheng Li","doi":"10.1109/TBDATA.2024.3453754","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3453754","url":null,"abstract":"Inspired by the success of transformers in natural language processing, vision transformers have been proposed to address a wide range of computer vision tasks, such as image classification, object detection and image segmentation, and they have achieved very promising performance. However, the robustness of vision transformers has been relatively under-explored. Recent studies have revealed that pre-trained vision transformers are also vulnerable to white-box adversarial attacks on the downstream image classification tasks. The adversarial attacks (e.g., FGSM and PGD) designed for convolutional neural networks (CNNs) can also cause severe performance drop for vision transformers. In this paper, we evaluate the robustness of vision transformers fine-tuned with the off-the-shelf methods under adversarial attacks on CIFAR-10 and CIFAR-100 and further propose a data-augmented virtual adversarial training approach called MixVAT, which is able to enhance the robustness of pre-trained vision transformers against adversarial attacks on the downstream tasks with the unlabelled data. 
Extensive results on multiple datasets demonstrate the superiority of our approach over baselines on adversarial robustness, without compromising generalization ability of the model.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1309-1320"},"PeriodicalIF":7.5,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Denoised Graph Collaborative Filtering via Neighborhood Similarity and Dynamic Thresholding","authors":"Haibo Ye;Lijun Zhang;Yuan Yao;Sheng-Jun Huang","doi":"10.1109/TBDATA.2024.3453765","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3453765","url":null,"abstract":"Graph collaborative filtering (GCF) has achieved great success in recommender systems due to its ability in mining high-order collaborative signals from historical user-item interactions. However, GCF's performance could be severely affected by the intrinsic noise within the user-item interactions. To this end, several denoised GCF frameworks have been proposed, whose heart is to estimate and handle the reliability of existing interactions. However, most of them suffer from two limitations: 1) the reliability computation itself is noisy, and 2) the reliability threshold is difficult to determine. To address the two limitations, in this paper, we propose a new Neighborhood-informed Denoising framework NiDen for GCF. Specifically, for an existing user-item interaction, NiDen first estimates its reliability by employing the neighborhood information of the user and the item, and then determines whether the interaction is noisy or not via a dynamic thresholding strategy. After that, NiDen mitigates the negative impact of noise by both structure denoising and sample re-weighting. We instantiate NiDen on two representative GCF models and conduct extensive experiments on four widely-used datasets. 
The results show that NiDen achieves the best performance compared to the existing denoising methods, especially on datasets with heavy noise.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"10 6","pages":"683-693"},"PeriodicalIF":7.5,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ALAD: A New Unsupervised Time Series Anomaly Detection Paradigm Based on Activation Learning","authors":"Fengqian Ding;Bo Li;Xianye Ben;Jia Zhao;Hongchao Zhou","doi":"10.1109/TBDATA.2024.3453762","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3453762","url":null,"abstract":"Time series anomaly detection has received growing interest in industrial and academic communities due to its substantial theoretical value and practical significance. Recent advanced methods for time series anomaly detection are based on deep learning techniques, since they have shown their superiority in some specific situations. However, most existing deep learning-based anomaly detection methods require predefined, specific tasks of reconstruction or prediction, necessitating task-specific loss functions. Designing such anomaly-aware loss functions poses a significant challenge due to the ambiguity in defining ground-truth anomalies. Moreover, these methods often rely on complex network architectures that tend to lead to over-generalization, resulting in even abnormal data being well reconstructed or fitted. To mitigate this situation, grounded in activation learning theory, we propose a novel unsupervised time series anomaly detection paradigm termed ALAD. ALAD utilizes a straightforward fully connected network architecture, measuring the typicality of input patterns through the sum of the squared output. Despite its simplicity, ALAD achieves competitive performance compared to state-of-the-art models trained using backpropagation. By utilizing various real-world and synthetic datasets, experimental results have confirmed the effectiveness and feasibility of the proposed paradigm. 
This work also demonstrates that biologically-plausible local learning can sometimes outperform backpropagation in real-world scenarios.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1285-1297"},"PeriodicalIF":7.5,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Debiased Cross Contrastive Quantization for Unsupervised Image Retrieval","authors":"Zipeng Chen;Yuan-Gen Wang;Lin-Cheng Li","doi":"10.1109/TBDATA.2024.3453751","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3453751","url":null,"abstract":"Contrastive quantization (applying vector quantization to contrastive learning) has achieved great success in large-scale image retrieval because of its advantage of high computational efficiency and small storage space. This article designs a novel optimization framework to simultaneously optimize the cross quantization and the debiased contrastive learning, termed Debiased Cross Contrastive Quantization (DCCQ). The proposed framework is implemented in an end-to-end network, resulting in both reduced quantization error and deletion of many false negative samples. Specifically, to increase the distinguishability between codewords, DCCQ introduces the codeword similarity loss and soft quantization entropy loss for network training. Furthermore, the memory bank strategy and multi-crop image augmentation strategy are employed to promote the effectiveness and efficiency of contrastive learning. Extensive experiments on three large-scale real image benchmark datasets show that the proposed DCCQ yields state-of-the-art results.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1298-1308"},"PeriodicalIF":7.5,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Granularity Feature Interaction and Multi-Region Selection Based Triplet Visual Question Answering","authors":"Heng Liu;Boyue Wang;Yanfeng Sun;Junbin Gao;Xiaoyan Li;Yongli Hu;Baocai Yin","doi":"10.1109/TBDATA.2024.3453750","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3453750","url":null,"abstract":"Accurately locating the question-related regions in one given image is crucial for visual question answering (VQA). Current approaches suffer from two limitations: (1) Dividing one image into multiple regions may lose parts of semantic information and original relationships between regions; (2) Choosing only one or all image regions to predict the answer may correspondingly result in the insufficiency or redundancy of information. Therefore, effectively mining the relationship between image regions and choosing the relevant image regions are vital. In this paper, we propose a novel Multi-granularity feature interaction and Multi-region selection-based triplet VQA model (M2TVQA). To tackle the first limitation, we propose the multi-granularity feature interaction strategy that adaptively supplements the global coarse-granularity features with the regional fine-granularity features. To overcome the second limitation, we design the Top-$K$ learning strategy to adaptively select the $K$ image regions most relevant to the question, even if the selected regions are far apart in space. Such a strategy can select as many relevant image regions as possible and reduce the noise introduced. Finally, we construct the multi-modality triplet to predict the answer of VQA. 
Extended experiments on two public outside knowledge datasets OK-VQA and KRVQA verify the effectiveness of the proposed model.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1346-1356"},"PeriodicalIF":7.5,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Forward and Backward Private Conjunctive Searchable Encryption With Comprehensive Verification","authors":"Yue Ge;Ying Gao;Lin Qi;Jiankai Qiu","doi":"10.1109/TBDATA.2024.3442540","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3442540","url":null,"abstract":"Dynamic searchable symmetric encryption (DSSE) enables the retrieval and update of massive encrypted data and is thus widely applied in cloud storage. Malicious cloud servers may tamper with outsourced data or return incorrect search results, making verification indispensable. However, current verifiable conjunctive DSSE with forward and backward privacy cannot simultaneously achieve accurate search and empty result verification. Given the above problems, we propose a novel forward and backward private conjunctive DSSE with comprehensive verification called VCFB. VCFB introduces the notion of random blinding factors and secure dynamic cross-tags to achieve accurate conjunctive search with sublinear overhead. We design the search state and construct chain structures to ensure forward security. The new verification algorithm based on bilinear-map dynamic accumulators can guarantee the verifiability of search results even if an empty result is returned. We use a sample check method to verify the new dynamic cross-tags, reducing computation costs. We precisely define the leakage function of VCFB and give detailed security proof, demonstrating that VCFB is forward and backward private under the malicious server model. 
Experimental results show that VCFB ensures efficient and accurate conjunctive search, outperforming the similar verifiable conjunctive DSSE scheme regarding indexing storage and update performance.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1259-1272"},"PeriodicalIF":7.5,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MMFFHS: Multi-Modal Feature Fusion for Hate Speech Detection on Social Media","authors":"Pradeep Kumar Roy","doi":"10.1109/TBDATA.2024.3445372","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3445372","url":null,"abstract":"Millions of people use social media platforms such as Facebook, YouTube, and Twitter to stay updated on news, enjoy entertainment, and share personal moments with peers. These platforms have now become channels for spreading rumors, posting hate speech, cyberbullying, etc. Hate speech frequently appears on social media platforms nowadays. Sometimes, it impairs readers’ mental and emotional health and societal order. Therefore, timely detection is required to prevent the spread of hate speech posts on social media platforms. Researchers have reported several works on textual hate speech detection. However, social media posts are not limited to text; images and text with images are also used in the posts, termed multimodal data. The text-based model may not be efficient enough to handle the multimodal data. Therefore, this study introduces a reliable architecture that utilizes deep and transfer learning frameworks to classify multimodal social media posts into hate and non-hate. The proposed model is compatible with text, images, and images with text-based social posts to categorize hate and non-hate. 
The proposed framework MMFFHS, a feature-fusion-based model, performed better than the existing models by achieving 70.26% accuracy.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1247-1258"},"PeriodicalIF":7.5,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications","authors":"Saed Rezayi;Zhengliang Liu;Zihao Wu;Chandra Dhakal;Bao Ge;Haixing Dai;Gengchen Mai;Ninghao Liu;Chen Zhen;Tianming Liu;Sheng Li","doi":"10.1109/TBDATA.2024.3442542","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3442542","url":null,"abstract":"This paper explores new frontiers in agricultural natural language processing (NLP) by investigating the effectiveness of food-related text corpora for pretraining transformer-based language models. Specifically, we focus on semantic matching, establishing mappings between food descriptions and nutrition data through fine-tuning AgriBERT with the FoodOn ontology. Our work introduces an expanded comparison with state-of-the-art language models such as GPT-4, Mistral-large, Claude 3 Sonnet, and Gemini 1.0 Ultra. This exploratory investigation, rather than a direct comparison, aims to understand how AgriBERT, a domain-specific, fine-tuned, open-source model, complements the broad knowledge and generative abilities of these advanced LLMs in addressing the unique challenges of the agricultural sector. We also experiment with other applications, such as cuisine prediction from ingredients, expanding our research to include various NLP tasks beyond semantic matching. 
Overall, this paper underscores the potential of integrating domain-specific models like AgriBERT with advanced LLMs to enhance the performance and applicability of agricultural NLP applications.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1235-1246"},"PeriodicalIF":7.5,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning-Based Distributed Spatio-Temporal $k$ Nearest Neighbors Join","authors":"Ruiyuan Li;Jiajun Li;Minxin Zhou;Rubin Wang;Huajun He;Chao Chen;Jie Bao;Yu Zheng","doi":"10.1109/TBDATA.2024.3442539","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3442539","url":null,"abstract":"The rapid development of positioning technology produces an extremely large volume of spatio-temporal data with various geometry types such as point, line string, polygon, or a mixed combination of them. As one of the most fundamental but time-consuming operations, $k$ nearest neighbors join ($k$NN join) has attracted much attention. However, most existing works for $k$NN join either ignore temporal information or consider only point data. Besides, most of them do not automatically adapt to the different features of spatio-temporal data. This paper proposes to address a novel and useful problem, i.e., ST-$k$NN join, which considers both spatial closeness and temporal concurrency. To support ST-$k$NN join over a large amount of spatio-temporal data with any geometry types efficiently, we propose a novel distributed solution based on Apache Spark. Specifically, our method adopts a two-round join framework. In the first round join, we propose a new spatio-temporal partitioning method that achieves spatio-temporal locality and load balance at the same time. We also propose a lightweight index structure, i.e., Time Range Count Index (TRC-index), to enable efficient ST-$k$NN join. In the second round join, to reduce the data transmission among different machines, we remove duplicates based on spatio-temporal reference points before shuffling local results. 
Furthermore, we design a set of models based on Bayesian optimization to automatically determine the values for the introduced parameters. Extensive experiments are conducted using three large real-world datasets, showing that our method is much more scalable and runs 9X faster than baselines, and that the proposed models can always predict appropriate parameters for different datasets.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 2","pages":"861-878"},"PeriodicalIF":7.5,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143629659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}