Detecting Spam Tweets Using Lightweight Detectors on Real-Time Basis and Update the Models Periodically in Batch Mode

2019 International Conference on Emerging Trends in Science and Engineering (ICESE); Pub Date: 2019-09-01; DOI: 10.1109/ICESE46178.2019.9194614

K. Reddy, R. Reddy, P. V. Reddy


Citations: 0

Abstract

Most existing techniques for spam detection on Twitter aim to identify and block users who post spam tweets. In this paper, we propose a semi-supervised spam detection framework that detects spam at the tweet level. The proposed framework consists of two main modules: a spam detection module that operates in real-time mode and a model update module that operates in batch mode. The spam detection module consists of four lightweight detectors: 1) a blacklisted domain detector, which labels tweets that contain blacklisted URLs; 2) a near-duplicate detector, which labels tweets that are near-duplicates of confidently pre-labeled tweets; 3) a reliable ham detector, which labels tweets that are posted by trusted users and do not contain spammy words; and 4) a multi-classifier-based detector, which labels the remaining tweets. The information required by the detection module is updated in batch mode, based on the tweets labeled in the previous time window. Experiments on a large-scale data set show that the framework adaptively learns the patterns of new spam activity and maintains good accuracy for spam detection in a tweet stream.
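The two modules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Tweet`, `DetectorState`, `signature`, `classify`, and `update_state` are hypothetical, the near-duplicate check is reduced to an exact match on a token-set signature, the classifiers are combined by majority vote, and only labels from the first three detectors are treated as confident. All of these are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Tweet:
    user: str
    text: str
    urls: list = field(default_factory=list)


@dataclass
class DetectorState:
    """Knowledge shared by the detectors; refreshed once per time window."""
    blacklisted_domains: set = field(default_factory=set)
    spam_signatures: set = field(default_factory=set)
    ham_signatures: set = field(default_factory=set)
    trusted_users: set = field(default_factory=set)
    spammy_words: set = field(default_factory=set)


def signature(text):
    """Crude near-duplicate key (assumption): sorted set of alphabetic tokens."""
    tokens = {t for t in text.lower().split() if t.isalpha()}
    return " ".join(sorted(tokens))


def classify(tweet, state, classifiers):
    """Real-time step: return (label, detector) from the first detector that fires."""
    # 1) Blacklisted domain detector.
    for url in tweet.urls:
        if urlparse(url).hostname in state.blacklisted_domains:
            return "spam", "blacklist"
    # 2) Near-duplicate detector against confidently labeled tweets.
    sig = signature(tweet.text)
    if sig in state.spam_signatures:
        return "spam", "near_duplicate"
    if sig in state.ham_signatures:
        return "ham", "near_duplicate"
    # 3) Reliable ham detector: trusted user and no spammy words.
    words = set(tweet.text.lower().split())
    if tweet.user in state.trusted_users and not (words & state.spammy_words):
        return "ham", "reliable_ham"
    # 4) Multi-classifier detector: majority vote (assumed combination rule).
    votes = [clf(tweet) for clf in classifiers]
    label = "spam" if votes.count("spam") * 2 > len(votes) else "ham"
    return label, "multi_classifier"


def update_state(state, labeled_window):
    """Batch step: fold the previous window's confident labels into the state."""
    for tweet, label, detector in labeled_window:
        if detector == "multi_classifier":
            continue  # assumption: only rule-based labels count as confident
        sig = signature(tweet.text)
        if sig:
            target = state.spam_signatures if label == "spam" else state.ham_signatures
            target.add(sig)
```

In use, `classify` would be called on each incoming tweet, and `update_state` once per time window with the `(tweet, label, detector)` triples collected during that window. A fuller version would also refresh the blacklist, the trusted-user set, the spammy-word list, and the classifiers in the batch step.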
