Massive Processing of Activity Logs of a Virtual Campus

F. Xhafa, Juan Jose Ruiz, S. Caballé, Evjola Spaho, L. Barolli, Rozeta Miho
2012 Third International Conference on Emerging Intelligent Data and Web Technologies
Published: 2012-09-19
DOI: 10.1109/EIDWT.2012.64 (https://doi.org/10.1109/EIDWT.2012.64)
Citations: 4

Abstract

Online web-based applications that rely heavily on user interaction, either among users or between users and the application, generate huge amounts of data. Recording such interaction data, usually in the form of log files, can be very useful for purposes such as user modelling, user activity analysis, data analytics, security, and monitoring. However, such data is not ready to be analysed as recorded: log files must first be pre-processed and cleaned of redundant and useless information. Given the large amounts of data generated daily, massive processing is an essential first step in extracting useful information from log files. In this work we study the viability of massive processing of the log files of a real Virtual Campus using different distributed infrastructures. More precisely, we study the time performance of processing the daily log files of a Virtual Campus using cluster computing (under Open Grid Engine) and the Planet Lab platform. The study reveals the complexity and challenges of massive processing in the big data era.
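The abstract does not describe the log format or the cleaning rules, so the following is only a minimal sketch of the general idea: each daily log is cleaned independently, and the wall-clock time of the whole run is measured. Everything here is an assumption for illustration: the function names `clean_log_lines` and `process_daily_logs`, the "one entry per line" layout, the rule that "redundant" means an exact repeat, and the sample entries. A local thread pool stands in for the dispatch of independent daily files; it is not Open Grid Engine or Planet Lab, and it says nothing about the timings reported in the paper.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def clean_log_lines(lines):
    """Drop blank lines and exact repeats, keeping first occurrences in order.

    Assumption: one log entry per line; "redundant" means an exact duplicate.
    """
    seen = set()
    cleaned = []
    for line in lines:
        entry = line.strip()
        if not entry or entry in seen:
            continue  # skip blank or redundant entries
        seen.add(entry)
        cleaned.append(entry)
    return cleaned


def process_daily_logs(daily_logs, workers=4):
    """Clean each day's log independently and measure total wall-clock time.

    daily_logs maps a day label to that day's raw lines (hypothetical layout).
    Returns (cleaned logs per day, elapsed seconds).
    """
    start = time.perf_counter()
    days = list(daily_logs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each daily file is an independent task, so it can be dispatched separately.
        results = pool.map(clean_log_lines, (daily_logs[d] for d in days))
        cleaned = dict(zip(days, results))
    elapsed = time.perf_counter() - start
    return cleaned, elapsed


if __name__ == "__main__":
    # Made-up sample entries, for illustration only.
    sample = {
        "day-1": ["u1 login", "u1 login", "", "u2 open_forum"],
        "day-2": ["u3 login", "  ", "u3 logout"],
    }
    cleaned, elapsed = process_daily_logs(sample)
    for day, entries in cleaned.items():
        print(day, entries)
    print(f"elapsed: {elapsed:.6f} s")
```

Splitting the work by day keeps the tasks independent, which is what makes the same pattern plausible on a cluster scheduler or a set of remote nodes, where each daily file would become one job.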