Lutong Qin;Lei Zhang;Chengrun Li;Chaoda Song;Dongzhou Cheng;Shuoyuan Wang;Hao Wu;Aiguo Song
IEEE Transactions on Artificial Intelligence, vol. 5, no. 12, pp. 6723-6738. Published 2024-11-04. DOI: 10.1109/TAI.2024.3489532. https://ieeexplore.ieee.org/document/10742433/
Towards Better Accuracy-Efficiency Trade-Offs: Dynamic Activity Inference via Collaborative Learning From Various Width-Resolution Configurations
Recently, deep neural networks have achieved great success in a wide variety of human activity recognition (HAR) applications on resource-constrained mobile devices. However, most existing approaches are static and ignore the fact that the computational budget usually varies drastically across devices, which prevents real-world HAR deployment. A major challenge remains: how to adaptively and instantly trade off accuracy and latency at runtime for on-device activity inference using time-series sensor data? To address this issue, this article introduces a new collaborative learning scheme that trains a set of subnetworks executed at varying network widths and fed with different sensor input resolutions as data augmentation. The resulting model can switch on the fly between width-resolution configurations for flexible and dynamic activity inference under varying resource budgets. In particular, it offers a promising performance-boosting solution by using self-distillation to transfer the unique knowledge among multiple width-resolution configurations, which helps capture stronger feature representations for activity recognition. Extensive experiments and ablation studies on three public HAR benchmark datasets validate the effectiveness and efficiency of our approach. A real implementation is evaluated on a mobile device. This approach opens up the possibility of directly accessing the accuracy-latency spectrum of deep learning models in versatile real-world HAR deployments. Code is available at https://github.com/Lutong-Qin/Collaborative_HAR.
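The abstract does not spell out the training objective or the runtime switching rule, so the following is only a minimal sketch of how the two ideas might fit together, not the authors' implementation. It assumes that the full width-resolution configuration acts as the teacher for self-distillation, that lower resolutions are obtained by subsampling the sensor window, and that relative cost scales as width squared times resolution. All function names (`downsample`, `collaborative_loss`, `pick_config`) and the cost proxy are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample against an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def downsample(window, factor):
    """Assumed resolution reduction: keep every `factor`-th sensor sample."""
    return window[::factor]


def collaborative_loss(logits_by_config, label, full_config, temperature=1.0):
    """Hard-label loss for the full configuration plus distillation terms.

    `logits_by_config` maps (width, resolution) -> class logits.
    Assumption: the full configuration supplies the soft targets.
    """
    teacher = softmax(logits_by_config[full_config], temperature)
    loss = cross_entropy(logits_by_config[full_config], label)
    for config, logits in logits_by_config.items():
        if config != full_config:
            loss += kl_divergence(teacher, softmax(logits, temperature))
    return loss


def relative_cost(config):
    """Hypothetical cost proxy: width squared times input resolution."""
    width, resolution = config
    return width ** 2 * resolution


def pick_config(configs, budget):
    """Pick the costliest configuration that still fits the budget.

    Falls back to the cheapest configuration when nothing fits.
    """
    feasible = [c for c in configs if relative_cost(c) <= budget]
    if not feasible:
        return min(configs, key=relative_cost)
    return max(feasible, key=relative_cost)
```

At inference time, a device would call `pick_config` with its current budget and run only the selected subnetwork on the correspondingly downsampled window. A deployed system would more likely rank configurations by measured on-device latency and validation accuracy than by this simple cost proxy.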