{"title":"Detecting Advertisement Module Network Behavior with Graph Modeling","authors":"Hiroki Kuzuno, Kenichi Magata","doi":"10.1109/AsiaJCIS.2014.26","DOIUrl":null,"url":null,"abstract":"Android applications are widely used and many are 'free' applications which include advertisement (ad) modules that provide ad services and track user behavior statistics. However, these ad modules often collect users' personal information and device identification numbers along with usage statistics, which is a violation of privacy. In our analysis of 1,188 Android applications' network traffic, we identified 797 applications that included 45 previously known ad modules. We analyzed these ad modules' network behavior, and found that they have characteristic network traffic patterns for acquiring ad content, specifically images. In order to accurately differentiate between ad modules' network traffic and valid application network traffic, we propose a novel method based on the distance between network traffic graphs mapping the relationships between HTTP session data (such as HTML or Java Script). This distance describes the similarity between the sessions. Using this method, we can detect ad modules' traffic by comparing session graphs with the graphs of already known ad modules. In our evaluation, we generated 20,903 graphs of applications. We separated the application graphs into those generated by known ad modules (4,698 graphs), those we manually identified as ad modules (2,000 graphs), and standard application traffic. We then applied 1,000 graphs of known ad graphs to the other graph sets (the remaining 3,698 known ad graphs and the 2,000 manually classified ad graphs) to see how accurately they could be used to identify ad graphs. Our approach showed a 76% detection rate for known ad graphs, and a 96% detection rate for manually classified ad graphs.","PeriodicalId":354543,"journal":{"name":"2014 Ninth Asia Joint Conference on Information Security","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 Ninth Asia Joint Conference on Information Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AsiaJCIS.2014.26","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Android applications are widely used and many are 'free' applications which include advertisement (ad) modules that provide ad services and track user behavior statistics. However, these ad modules often collect users' personal information and device identification numbers along with usage statistics, which is a violation of privacy. In our analysis of 1,188 Android applications' network traffic, we identified 797 applications that included 45 previously known ad modules. We analyzed these ad modules' network behavior, and found that they have characteristic network traffic patterns for acquiring ad content, specifically images. In order to accurately differentiate between ad modules' network traffic and valid application network traffic, we propose a novel method based on the distance between network traffic graphs mapping the relationships between HTTP session data (such as HTML or Java Script). This distance describes the similarity between the sessions. Using this method, we can detect ad modules' traffic by comparing session graphs with the graphs of already known ad modules. In our evaluation, we generated 20,903 graphs of applications. We separated the application graphs into those generated by known ad modules (4,698 graphs), those we manually identified as ad modules (2,000 graphs), and standard application traffic. We then applied 1,000 graphs of known ad graphs to the other graph sets (the remaining 3,698 known ad graphs and the 2,000 manually classified ad graphs) to see how accurately they could be used to identify ad graphs. Our approach showed a 76% detection rate for known ad graphs, and a 96% detection rate for manually classified ad graphs.