Ensemble techniques for detecting profile cloning attacks in online social networks
Authors: Irfan Mohiuddin, Ahmad Almogren
Journal: PeerJ Computer Science, vol. 11, e3182 (Journal Article, eCollection)
Published: 2025-09-04
DOI: 10.7717/peerj-cs.3182 (https://doi.org/10.7717/peerj-cs.3182)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453747/pdf/
Impact factor: 2.5; JCR: Q2 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Detecting cloned and impersonated profiles on online social networks (OSNs) has become an increasingly critical challenge, particularly with the proliferation of AI-generated content that closely emulates human communication patterns. Traditional identity deception detection methods are proving inadequate against adversaries who exploit large language models (LLMs) to craft syntactically accurate and semantically plausible fake profiles. This article focuses on the detection of profile cloning on LinkedIn by introducing a multi-stage, content-based detection framework that classifies profiles into four distinct categories: legitimate profiles, human-cloned profiles, LLM-generated legitimate profiles, and LLM-generated cloned profiles. The proposed framework integrates multiple analytical layers, including semantic representation learning through attention-based section embedding aggregation, linguistic style modeling using stylometric-perplexity features, anomaly scoring via cluster-based outlier detection, and ensemble classification through out-of-fold stacking. Experiments conducted on a publicly available dataset comprising 3,600 profiles demonstrate that the proposed meta-ensemble model consistently outperforms competitive baselines, achieving macro-averaged accuracy, precision, recall, and F1-scores above 96%. These results highlight the effectiveness of leveraging a combination of semantic, stylistic, and probabilistic signals to detect both human-crafted and artificial intelligence (AI)-generated impersonation attempts. Overall, this work presents a robust and scalable content-driven methodology for identity deception detection in contemporary OSNs.
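The out-of-fold stacking step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid base learners, the synthetic two-feature data, and every name here (`CentroidClassifier`, `out_of_fold_features`, `make_toy_data`) are assumptions chosen to keep the example self-contained. The property it shows is that each row's meta-features come only from base models that never saw that row during fitting.

```python
import random

CLASSES = [0, 1, 2, 3]  # hypothetical stand-ins for the four profile categories


class CentroidClassifier:
    """Toy base learner (an assumption): nearest class centroid on a feature subset."""

    def __init__(self, features):
        self.features = features

    def fit(self, X, y):
        self.centroids = {}
        for c in CLASSES:
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids[c] = [sum(r[f] for r in rows) / len(rows)
                                 for f in self.features]
        return self

    def scores(self, x):
        # One score per class: negative squared distance to that class centroid.
        return [-sum((x[f] - m) ** 2 for f, m in zip(self.features, self.centroids[c]))
                for c in CLASSES]

    def predict(self, x):
        s = self.scores(x)
        return CLASSES[s.index(max(s))]


def out_of_fold_features(X, y, base_feature_sets, k=5):
    """Build meta-features so that row i is scored only by models fitted without row i."""
    n = len(X)
    oof = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        models = [CentroidClassifier(fs).fit([X[i] for i in train],
                                             [y[i] for i in train])
                  for fs in base_feature_sets]
        for i in held_out:
            # Concatenate the per-class scores of every base model.
            oof[i] = [s for m in models for s in m.scores(X[i])]
    return oof


def make_toy_data(n, seed):
    """Synthetic data: each single feature separates the classes only into two pairs."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 4
        cx, cy = 5.0 * (label // 2), 5.0 * (label % 2)
        X.append([cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5)])
        y.append(label)
    return X, y


X, y = make_toy_data(40, seed=0)
base_feature_sets = [[0], [1]]  # two single-feature "views" of each profile

oof = out_of_fold_features(X, y, base_feature_sets, k=5)

# The meta-learner is fitted on out-of-fold scores, never on in-sample base outputs.
meta = CentroidClassifier(list(range(len(oof[0])))).fit(oof, y)
meta_accuracy = sum(meta.predict(row) == label for row, label in zip(oof, y)) / len(y)
print(f"meta-learner accuracy on out-of-fold features: {meta_accuracy:.2f}")
```

Neither base learner can separate all four classes alone, because each sees only one feature. The meta-learner combines their out-of-fold scores to do so. The accuracy printed here is measured on the same rows the meta-learner was fitted on, so it is optimistic; a real evaluation would score a separate held-out set, and a practical pipeline would more likely use an established library than these toy classes.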
About the journal:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.