{"title":"Offline traffic analysis system based on Hadoop","authors":"Yuan-yuan QIAO , Zhen-ming LEI , Lun YUAN , Min-jie GUO","doi":"10.1016/S1005-8885(13)60096-5","DOIUrl":null,"url":null,"abstract":"<div><p>Offline network traffic analysis is important for in-depth study of network conditions and characteristics, such as user behavior and abnormal traffic. With the rapid growth of the amount of information on the Internet, traditional stand-alone analysis tools face great challenges in storage capacity and computing efficiency, which are precisely the strengths of a Hadoop cluster. In this paper, we designed an offline traffic analysis system based on Hadoop (OTASH) and proposed a MapReduce-based algorithm for Top<em>N</em> user statistics. In addition, we studied the computing performance and failure tolerance of OTASH. From the experiments we concluded that OTASH is suitable for handling large amounts of flow data and can continue computing in the case of a single node failure.</p></div>","PeriodicalId":35359,"journal":{"name":"Journal of China Universities of Posts and Telecommunications","volume":"20 5","pages":"Pages 97-103"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S1005-8885(13)60096-5","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of China Universities of Posts and Telecommunications","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1005888513600965","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"Computer Science","Score":null,"Total":0}
Offline network traffic analysis is important for in-depth study of network conditions and characteristics, such as user behavior and abnormal traffic. With the rapid growth of the amount of information on the Internet, traditional stand-alone analysis tools face great challenges in storage capacity and computing efficiency, which are precisely the strengths of a Hadoop cluster. In this paper, we designed an offline traffic analysis system based on Hadoop (OTASH) and proposed a MapReduce-based algorithm for TopN user statistics. In addition, we studied the computing performance and failure tolerance of OTASH. From the experiments we concluded that OTASH is suitable for handling large amounts of flow data and can continue computing in the case of a single node failure.
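
The abstract does not describe how the TopN algorithm works, so the following is only a generic illustration of how a TopN user ranking can be expressed in the MapReduce style; it is not the OTASH algorithm. It is a single-process Python sketch that assumes each flow record reduces to a hypothetical `(user_id, bytes)` pair and that users are ranked by total bytes. The function names `map_phase`, `shuffle`, `reduce_phase`, and `top_n_users` are illustrative and do not come from the paper or from Hadoop.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (user_id, bytes). The real flow format is not given.
FlowRecord = Tuple[str, int]


def map_phase(split: Iterable[FlowRecord]) -> List[Tuple[str, int]]:
    """Emit one (user, partial_bytes) pair per user seen in this input split."""
    partial: Dict[str, int] = defaultdict(int)
    for user, nbytes in split:
        partial[user] += nbytes  # combine inside the mapper to shrink the shuffle
    return list(partial.items())


def shuffle(mapped: Iterable[List[Tuple[str, int]]]) -> Dict[str, List[int]]:
    """Group partial sums by user, standing in for the framework's shuffle step."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for pairs in mapped:
        for user, nbytes in pairs:
            grouped[user].append(nbytes)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]], n: int) -> List[Tuple[str, int]]:
    """Sum each user's traffic, then keep the n users with the most bytes."""
    totals = ((user, sum(values)) for user, values in grouped.items())
    # Order by bytes descending, then user id ascending so ties are deterministic.
    return heapq.nsmallest(n, totals, key=lambda kv: (-kv[1], kv[0]))


def top_n_users(splits: Iterable[Iterable[FlowRecord]], n: int) -> List[Tuple[str, int]]:
    """Run map, shuffle, and reduce over all splits in a single process."""
    return reduce_phase(shuffle(map_phase(s) for s in splits), n)


if __name__ == "__main__":
    example_splits = [
        [("alice", 10), ("bob", 5), ("alice", 1)],
        [("bob", 20), ("carol", 7)],
    ]
    print(top_n_users(example_splits, 2))
```

On a real cluster the final selection would typically run in a single reducer or a second job, because a global TopN needs to see every user's total; the sketch collapses that into one `reduce_phase` call.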