A Hadoop Based Weblog Analysis System

Chen Hau Wang, Ching Tsorng Tsai, Chia-Chen Fan, S. Yuan
Published in: 2014 7th International Conference on Ubi-Media Computing and Workshops
Publication date: 2014-07-12
DOI: 10.1109/U-MEDIA.2014.9 (https://doi.org/10.1109/U-MEDIA.2014.9)
Citations: 15

Abstract

In recent years, cloud computing has become an important research topic. It uses distributed storage and distributed computing to store large volumes of data and to analyze and process them quickly. With the rapid development of Internet technology, digital data is growing explosively, and traditional text-processing software and relational database technology have hit a bottleneck when processing data at this scale, producing unsatisfactory results. Cloud computing is a more appropriate choice for this problem. In this paper, we design and implement an enterprise Web log analysis system built on Hadoop, using HDFS (Hadoop Distributed File System), the Hadoop MapReduce software framework, and the Pig Latin language. In our experiments, analyzing daily Web log records yields Application Server traffic trends, statistical reports on program performance, and performance reports broken down by time interval and by the program action that a user requested. The main purpose of the system is to help system administrators quickly capture and analyze the potentially valuable information hidden in massive data, thus providing an important basis for business decisions.
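
The abstract does not give the log format or the Pig Latin scripts, so the following is only a minimal sketch of the map and reduce steps that a traffic and performance report of this kind might rest on. It runs in plain Python rather than on Hadoop. The log layout (Common Log Format with a trailing response time in milliseconds), the sample records, and the names `map_line`, `reduce_pairs`, and `analyze` are all assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical log layout (an assumption, not the paper's format):
# Common Log Format followed by the response time in milliseconds.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ (?P<ms>\d+)$'
)


def map_line(line):
    """Map step: emit ((hour, path), (1, elapsed_ms)); skip unparseable lines."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return []
    stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    key = (stamp.strftime("%Y-%m-%d %H:00"), match.group("path"))
    return [(key, (1, int(match.group("ms"))))]


def reduce_pairs(pairs):
    """Reduce step: sum request counts and elapsed time per key, then average."""
    totals = defaultdict(lambda: [0, 0])
    for key, (count, elapsed_ms) in pairs:
        totals[key][0] += count
        totals[key][1] += elapsed_ms
    return {
        key: {"requests": count, "avg_ms": elapsed_ms / count}
        for key, (count, elapsed_ms) in totals.items()
    }


def analyze(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return reduce_pairs(pairs)


# Made-up sample records, used only to exercise the sketch.
SAMPLE_LOG = [
    '10.0.0.1 - - [12/Jul/2014:09:15:02 +0800] "GET /order/list HTTP/1.1" 200 512 120',
    '10.0.0.2 - - [12/Jul/2014:09:47:30 +0800] "GET /order/list HTTP/1.1" 200 498 80',
    '10.0.0.1 - - [12/Jul/2014:10:01:11 +0800] "POST /order/save HTTP/1.1" 200 64 300',
    'malformed line',
]

if __name__ == "__main__":
    for key, stats in sorted(analyze(SAMPLE_LOG).items()):
        print(key, stats)
```

Keying on the hour and the requested path gives both views the abstract mentions: summing `requests` over paths gives a traffic trend per interval, while `avg_ms` per key gives a per-action performance figure. On a real cluster the same grouping would be expressed as a MapReduce job or a Pig Latin `GROUP ... BY` over files in HDFS.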