ADREMOVER: THE IMPROVED MACHINE LEARNING APPROACH FOR BLOCKING ADS
Thu Vo, Chetan Jaiswal
2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), 2019-10-01
DOI: 10.1109/UEMCON47517.2019.8993052 (https://doi.org/10.1109/UEMCON47517.2019.8993052)
Advertisements on the web have exploded. Most of the websites we visit contain ads, including Facebook, Google and Twitter. Sometimes it can even seem that someone is spying on us, for instance when ads show exactly the content we searched for not long ago. Such events are a result of web tracking. Initially, ads were meant to help businesses market their products and persuade users to purchase them, and web trackers were meant to track user interaction with a website so that the site could improve the user experience. However, some of these mechanisms have let through malvertisements, which may exploit the same functionality to steal users' sensitive information. To counter this tsunami of ads (malware), several ad blockers have been created and are freely available as browser extensions that block ads and trackers. Most of them rely on hand-crafted filter lists; some apply machine learning. However, because filter lists and whitelists go out of date and cannot identify brand-new ad signatures, most of these blockers do not provide the depth of functionality and intelligence required to block all undesirable content. After applying and comparing several machine learning approaches, we propose our tool, AdRemover, a novel approach that uses a list of URLs labeled Ad and Non-Ad as its dataset. Classification with Random Forest exceeds 98% when the dataset is split into 80% training data and 20% testing data.
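The data preparation implied by this setup (labeled URLs turned into numeric features, then an 80%/20% split) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not list AdRemover's features, so `AD_HINTS`, `url_features`, `train_test_split` and the sample URLs are hypothetical, and the classifier itself is left out.

```python
import random
import re
from urllib.parse import urlparse

# Hypothetical keyword list; the abstract does not say which features AdRemover uses.
AD_HINTS = ("ad", "ads", "banner", "track", "click")


def url_features(url):
    """Turn a URL into a small numeric feature vector (illustrative features only)."""
    parsed = urlparse(url)
    text = (parsed.netloc + parsed.path).lower()
    tokens = [t for t in re.split(r"[^a-z0-9]+", text) if t]
    return [
        len(url),                                              # total URL length
        len(parsed.query.split("&")) if parsed.query else 0,   # number of query parameters
        sum(t in AD_HINTS for t in tokens),                    # count of ad-like tokens
        parsed.netloc.count("."),                              # dots in the host name
    ]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train, test) with the given test share."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


# Made-up sample data: (URL, label) with 1 = Ad and 0 = Non-Ad.
LABELED_URLS = [
    ("https://ads.example.com/banner?id=1&size=300", 1),
    ("https://track.example.net/click?u=42", 1),
    ("https://cdn.example.org/ad/unit.js", 1),
    ("https://example.com/ads/popup.html", 1),
    ("https://media.example.net/banner/top.png", 1),
    ("https://example.com/articles/2019/10/news.html", 0),
    ("https://example.org/about", 0),
    ("https://shop.example.com/cart", 0),
    ("https://example.net/images/logo.png", 0),
    ("https://docs.example.org/guide/install", 0),
]

dataset = [(url_features(url), label) for url, label in LABELED_URLS]
train_rows, test_rows = train_test_split(dataset, test_fraction=0.2)
print(len(train_rows), "training rows,", len(test_rows), "testing rows")
```

The resulting feature rows could then be passed to any Random Forest implementation, for example scikit-learn's `RandomForestClassifier`. Lexical features are used here only because they need nothing beyond the URL string.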