RBAS: A Real-Time User Behavior Analysis System for Internet TV in Cloud Computing

C. Zhu, Guang Cheng, Xiaojun Guo, Yuxiang Wang
Proceedings of the 11th International Conference on Future Internet Technologies, published 2016-06-15. DOI: 10.1145/2935663.2935664 (https://doi.org/10.1145/2935663.2935664)

Abstract

Understanding the characteristics of Internet TV user behavior is essential for designers who want to optimize resource scheduling and improve user experience. With the rapid development of the Internet, the number of Internet TV users and of set-top box (STB) models is growing rapidly. This produces a large volume of behavior data, which requires matching computing and storage resources to process, and makes scalable analysis of Internet TV user behavior more difficult. Cloud computing frameworks such as Hive have emerged as a solution, but their limited performance makes them a poor choice for interactive analysis or real-time data exploration. In this paper, we present RBAS, a real-time Internet TV user behavior analysis system that offers high concurrency, low latency, and good portability. First, we design an event capture scheme, consisting of agents embedded in STBs and clusters of capture servers, that captures every operation performed by users. Second, we develop a SQL-on-Hadoop engine with distributed transactional management to reduce response time. The engine delivers excellent query performance and can interactively query various data sources stored in different Hadoop formats. Finally, we evaluate RBAS on a commercial Internet TV platform with 16 million registered users. The results show that, with a 32-node cluster, the system can effectively process 10.2 TB of behavior data every day, about 40x faster than the original Hive-based system.
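To put the reported scale in perspective, the following back-of-envelope sketch converts the abstract's figures (10.2 TB per day, 32 nodes, 16 million registered users) into sustained rates. It is not taken from the paper. It assumes decimal terabytes, a load spread evenly over 24 hours and across all nodes, and it divides by registered rather than active users, so the per-user figure is only a rough lower bound. The function name `daily_throughput` is hypothetical.

```python
# Back-of-envelope rates implied by the figures quoted in the abstract.
# Assumptions (not from the paper): 1 TB = 10**12 bytes, processing is
# spread evenly over 24 hours and evenly across all cluster nodes.

TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def daily_throughput(tb_per_day: float, nodes: int, users: int) -> dict:
    """Convert a daily data volume into average sustained rates."""
    total_bytes = tb_per_day * TB
    cluster_bytes_per_s = total_bytes / SECONDS_PER_DAY
    return {
        "cluster_MB_per_s": cluster_bytes_per_s / 1e6,
        "node_MB_per_s": cluster_bytes_per_s / nodes / 1e6,
        "KB_per_user_per_day": total_bytes / users / 1e3,
    }


stats = daily_throughput(10.2, nodes=32, users=16_000_000)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the cluster sustains roughly 118 MB/s in aggregate, or a few MB/s per node, which suggests that the hard part of the workload is query latency and concurrency rather than raw ingest bandwidth.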