Learning to Coordinate Video Codec with Transport Protocol for Mobile Video Telephony

The 25th Annual International Conference on Mobile Computing and Networking. Pub Date: 2019-08-05. DOI: 10.1145/3300061.3345430

Anfu Zhou, Huanhuan Zhang, Guangyuan Su, Leilei Wu, Ruoxuan Ma, Zhen Meng, Xinyu Zhang, Xiufeng Xie, Huadong Ma, Xiaojiang Chen


Citations: 53

Abstract

Despite the pervasive use of real-time video telephony services, users' quality of experience (QoE) remains unsatisfactory, especially over the mobile Internet. Previous work studied the problem via controlled experiments, while a systematic and in-depth investigation in the wild is still missing. To bridge the gap, we conduct a large-scale measurement campaign on \appname, an operational mobile video telephony service. The measurement logs fine-grained performance metrics for more than 1 million video call sessions. Our analysis shows that the application-layer video codec and the transport-layer protocols remain highly uncoordinated, which is one major reason for the low QoE. We thus propose \name, a machine-learning-based framework to resolve the issue. Instead of blindly following the transport layer's estimate of network capacity, \name reviews historical logs of both layers and extracts high-level features of codec/network dynamics, based on which it determines the highest bitrates that forthcoming video frames can use without incurring congestion. To attain this ability, we train \name on the aforementioned massive data traces using a custom-designed imitation learning algorithm, which enables \name to learn from past experience. We have implemented \name and incorporated it into \appname. Our experiments show that \name outperforms state-of-the-art solutions, improving video quality while reducing stalling time severalfold under various practical scenarios.
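The idea of learning a bitrate decision from the logged history of both layers can be illustrated with a toy behavior-cloning sketch. This is not the authors' algorithm or feature set: the synthetic trace, the hindsight "expert" (a fixed fraction of the logged capacity), the linear model, and every function name and constant below are assumptions made purely for illustration.

```python
import random

def make_trace(steps, seed=0):
    """Synthetic log (hypothetical): true capacity, transport estimate, codec output, in Mbps."""
    rng = random.Random(seed)
    capacity = estimate = 2.0
    trace = []
    for _ in range(steps):
        capacity = min(4.0, max(0.5, capacity + rng.uniform(-0.3, 0.3)))
        estimate += 0.5 * (capacity - estimate)    # transport estimate lags the true capacity
        codec = estimate * rng.uniform(0.7, 1.1)   # codec over/undershoots its target bitrate
        trace.append((capacity, estimate, codec))
    return trace

def features(trace, t, k=3):
    """History window from both layers (last k steps before t), plus a bias term."""
    window = trace[t - k:t]
    return [1.0] + [e for _, e, _ in window] + [c for _, _, c in window]

def expert_bitrate(trace, t, safety=0.95):
    """Hindsight 'expert' label: a rate just under the capacity logged at step t."""
    return safety * trace[t][0]

def predict(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))

def mse(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def train(data, epochs=200, lr=0.005):
    """Behavior cloning: fit a linear policy to the expert labels with plain SGD."""
    weights = [0.0] * len(data[0][0])
    for _ in range(epochs):
        for x, y in data:
            err = predict(weights, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights

K = 3
trace = make_trace(400)
data = [(features(trace, t, K), expert_bitrate(trace, t)) for t in range(K, len(trace))]
untrained = [0.0] * len(data[0][0])
weights = train(data)

print("imitation error before training:", round(mse(untrained, data), 4))
print("imitation error after training: ", round(mse(weights, data), 4))
print("bitrate chosen for the next frame (Mbps):", round(predict(weights, data[-1][0]), 3))
```

The learned policy sees only past transport estimates and past codec output, while the expert labels come from capacity values that are known only in hindsight from the logs. That split between what the policy can observe and what the expert knows is the general pattern of imitation learning from traces; a practical system would use far richer features and a nonlinear model.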



