Baseline Semantic Spam Filtering

2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology. Pub Date: 2011-08-22. DOI: 10.1109/WI-IAT.2011.133

Christian F. Hempelmann, Vikas Mehra


Citations: 2

Abstract

This paper presents a meaning-based method to distinguish text without or with little semantic content from text that has meaning which can be processed. The basic method assumes that a semantic analyzer will be able to produce less output from semantically less grammatical input text. The method was pilot-tested on a corpus of blog spam. Future improvements, including a method to distinguish semantically unified from semantically disparate text, are sketched. The tested method, but even more the projected improvements, open up the way to taking the spam filtering arms race to a new level that is very costly to spam producers.
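The core assumption in the abstract, that a semantic analyzer yields less output for text with less semantic content, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the `toy_analyzer` function, the `semantic_yield` ratio, and the `0.3` threshold are hypothetical and are not taken from the paper, whose actual analyzer and scoring are not described on this page.

```python
from typing import Callable, List

# Hypothetical stand-in for a semantic analyzer. This is NOT the analyzer
# used in the paper; it only maps a few known words to concept labels.
TOY_LEXICON = {
    "dog": "ANIMAL",
    "cat": "ANIMAL",
    "chased": "PURSUE",
    "ate": "INGEST",
    "food": "FOOD",
}


def toy_analyzer(text: str) -> List[str]:
    """Return one concept label for every token found in the toy lexicon."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    return [TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON]


def semantic_yield(text: str,
                   analyzer: Callable[[str], List[str]] = toy_analyzer) -> float:
    """Ratio of analyzer output items to input tokens (0.0 for empty text)."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return len(analyzer(text)) / len(tokens)


def looks_like_spam(text: str,
                    threshold: float = 0.3,  # illustrative value, not from the paper
                    analyzer: Callable[[str], List[str]] = toy_analyzer) -> bool:
    """Flag text whose analyzer output per token falls below the threshold."""
    return semantic_yield(text, analyzer) < threshold


if __name__ == "__main__":
    for sample in ["The dog chased the cat.", "xkq zzv qwrt plmb vvv"]:
        print(sample, semantic_yield(sample), looks_like_spam(sample))
```

A real system would replace `toy_analyzer` with a full semantic analyzer and would need a threshold tuned on labeled data; the sketch only shows the shape of the "less output from less meaningful input" heuristic.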


Source journal

2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology

Self-citation rate

0.00%

Number of publications