Title: WESPACT: Detection of web spamdexing with decision trees in GA perspective
Authors: S. Jayanthi, S. Sasikala
Venue: International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME-2012)
Publication date: 2012-03-21
DOI: 10.1109/ICPRIME.2012.6208376
URL: https://doi.org/10.1109/ICPRIME.2012.6208376
Citation count: 7
Abstract
The Internet today is huge, dynamic, self-organized, and strongly interlinked. Web spam can significantly degrade the quality of search engine results. This paper is motivated by viewing web spam as a cancer afflicting the Internet, and it derives a solution by formulating genetic algorithm (GA) based algorithms over content and link attributes. The web mining tools GATree [15] and PermutMatrix [14] were used to run the experiments. A Java program was developed to analyze and report spamdexing instances. The paper proposes an algorithm, WESPACT, to detect web spam. Experiments show that the algorithm performs well.
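To illustrate the general idea of evolving a decision rule over content and link attributes with a GA, here is a minimal Python sketch. It is not the WESPACT algorithm and does not use GATree. The two features (a content score and a link score), the toy data, and all function names are hypothetical assumptions made only for this example. The "tree" is a single-split decision stump, the simplest case of a decision tree.

```python
import random

# Hypothetical toy data: (content_score, link_score, is_spam).
# Features and values are illustrative assumptions, not taken from the paper.
PAGES = [
    (0.90, 0.80, 1), (0.85, 0.30, 1), (0.70, 0.90, 1), (0.95, 0.60, 1),
    (0.20, 0.10, 0), (0.30, 0.40, 0), (0.10, 0.35, 0), (0.40, 0.20, 0),
]

def predict(rule, page):
    """Depth-1 decision tree: label spam if the chosen feature exceeds the threshold."""
    feature, threshold = rule
    return 1 if page[feature] > threshold else 0

def fitness(rule):
    """Fraction of toy pages the rule labels correctly."""
    return sum(predict(rule, p) == p[2] for p in PAGES) / len(PAGES)

def mutate(rule, rng):
    """Occasionally switch feature; always jitter the threshold within [0, 1]."""
    feature, threshold = rule
    if rng.random() < 0.2:
        feature = 1 - feature
    threshold = min(1.0, max(0.0, threshold + rng.gauss(0, 0.1)))
    return (feature, threshold)

def crossover(a, b, rng):
    """Take the feature index from one parent and the threshold from the other."""
    return (a[0], b[1]) if rng.random() < 0.5 else (b[0], a[1])

def evolve(generations=30, pop_size=20, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    population = [(rng.randrange(2), rng.random()) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return max(population, key=fitness)

best = evolve()
print("best rule (feature index, threshold):", best, "accuracy:", fitness(best))
```

Because the elite half survives each generation unchanged, the best accuracy in the population never decreases. A fuller treatment would evolve multi-level trees and evaluate on held-out data rather than on the training pages.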