LENS: label sparsity-tolerant adversarial learning on spatial deceptive reviews
Sirish Prabakar, Haiquan Chen, Zhe Jiang, Carl Yang, Weikuan Yu, Da Yan
Geoinformatica (impact factor 2.2; JCR Q3, Computer Science, Information Systems). Journal article, published 2024-09-14.
DOI: 10.1007/s10707-024-00529-5 (https://doi.org/10.1007/s10707-024-00529-5)
Citations: 0
Abstract
Online businesses and websites have recently become the main target of fake reviews, which are intentionally composed to manipulate business ratings positively or negatively. Most existing work on fake review detection is supervised, so its performance depends heavily on the amount, quality, and variety of the labeled data, which are often non-trivial to obtain in practice. In this paper, we propose a semi-supervised, label sparsity-tolerant framework, LENS, for fake review detection by mining spatial knowledge and learning distributions of embedded topics. LENS builds on two key observations. (1) Spatial knowledge revealed in spatial entities and their co-occurring latent topic distributions may indicate review authenticity. (2) Distributions of the embedded topics (the contextual distribution) may exhibit important patterns that differentiate real reviews from fake ones. Specifically, LENS first extracts embeddings for spatial named entities using a knowledge base trained on Wikipedia webpages. Second, LENS represents each input token as a distribution over the learned latent topics in the embedded topic space. To bypass the differentiation difficulty, LENS builds on two discriminators in an actor-critic architecture using reinforcement learning. Extensive experiments on real-world spatial and non-spatial datasets show that, in a label-starved setting, LENS consistently outperformed state-of-the-art semi-supervised fake review detection methods at every labeling rate tested for both real and fake reviews.
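As an illustration of what "a distribution over the learned latent topics" can look like for a single token, here is a minimal sketch. The abstract does not give the formulation LENS uses, so everything below is an assumption: the function name topic_distribution, the use of dot-product similarity followed by a softmax, and the toy embedding values are all hypothetical and are not taken from the paper.

```python
import math


def topic_distribution(token_vec, topic_vecs):
    """Map one token embedding to a probability distribution over topics.

    Assumption (not from the paper): similarity is a dot product and the
    scores are normalized with a softmax.
    """
    scores = [sum(t * k for t, k in zip(token_vec, topic)) for topic in topic_vecs]
    peak = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2-dimensional embeddings; the values are made up for illustration.
topics = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dist = topic_distribution([2.0, 0.0], topics)
print(dist)  # one non-negative weight per topic, summing to 1
```

In this toy case the token is equally similar to the first and third topics, so they receive equal weight, while the second topic receives the least.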
About the journal:
GeoInformatica is located at the confluence of two rapidly advancing domains: Computer Science and Geographic Information Science. Earth studies increasingly rely on sophisticated computing theory and tools, and computer processing of Earth observations through Geographic Information Systems (GIS) attracts a great deal of attention from the governmental, industrial, and research worlds.
This journal aims to promote the most innovative results from research in computer science applied to geographic information systems. Thus, GeoInformatica provides an effective forum for disseminating original and fundamental research and experience in the rapidly advancing area of the use of computer science for spatial studies.