t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving

IF 7.7 | CAS Tier 2 (Computer Science) | JCR Q1 (Computer Science, Information Systems)
Pengfei Hu;Yuhang Qian;Tianyue Zheng;Ang Li;Zhe Chen;Yue Gao;Xiuzhen Cheng;Jun Luo
{"title":"t-READi:变压器驱动的鲁棒高效多模态自动驾驶推理","authors":"Pengfei Hu;Yuhang Qian;Tianyue Zheng;Ang Li;Zhe Chen;Yue Gao;Xiuzhen Cheng;Jun Luo","doi":"10.1109/TMC.2024.3462437","DOIUrl":null,"url":null,"abstract":"Given the wide adoption of multimodal sensors (e.g., camera, lidar, radar) by \n<italic>autonomous vehicle</i>\ns (AVs), deep analytics to fuse their outputs for a robust perception become imperative. However, existing fusion methods often make two assumptions rarely holding in practice: i) similar data distributions for all inputs and ii) constant availability for all sensors. Because, for example, lidars have various resolutions and failures of radars may occur, such variability often results in significant performance degradation in fusion. To this end, we present t-READi, an adaptive inference system that accommodates the variability of multimodal sensory data and thus enables robust and efficient perception. t-READi identifies variation-sensitive yet \n<italic>structure-specific</i>\n model parameters; it then adapts only these parameters while keeping the rest intact. t-READi also leverages a cross-modality contrastive learning method to compensate for the loss from missing modalities. Both functions are implemented to maintain compatibility with existing multimodal deep fusion methods. The extensive experiments evidently demonstrate that compared with the status quo approaches, t-READi not only improves the average inference accuracy by more than 6% but also reduces the inference latency by almost 15× with the cost of only 5% extra memory overhead in the worst case under realistic data and modal variations.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 1","pages":"135-149"},"PeriodicalIF":7.7000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving\",\"authors\":\"Pengfei Hu;Yuhang Qian;Tianyue Zheng;Ang Li;Zhe Chen;Yue Gao;Xiuzhen Cheng;Jun Luo\",\"doi\":\"10.1109/TMC.2024.3462437\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Given the wide adoption of multimodal sensors (e.g., camera, lidar, radar) by \\n<italic>autonomous vehicle</i>\\ns (AVs), deep analytics to fuse their outputs for a robust perception become imperative. However, existing fusion methods often make two assumptions rarely holding in practice: i) similar data distributions for all inputs and ii) constant availability for all sensors. Because, for example, lidars have various resolutions and failures of radars may occur, such variability often results in significant performance degradation in fusion. To this end, we present t-READi, an adaptive inference system that accommodates the variability of multimodal sensory data and thus enables robust and efficient perception. t-READi identifies variation-sensitive yet \\n<italic>structure-specific</i>\\n model parameters; it then adapts only these parameters while keeping the rest intact. t-READi also leverages a cross-modality contrastive learning method to compensate for the loss from missing modalities. Both functions are implemented to maintain compatibility with existing multimodal deep fusion methods. 
The extensive experiments evidently demonstrate that compared with the status quo approaches, t-READi not only improves the average inference accuracy by more than 6% but also reduces the inference latency by almost 15× with the cost of only 5% extra memory overhead in the worst case under realistic data and modal variations.\",\"PeriodicalId\":50389,\"journal\":{\"name\":\"IEEE Transactions on Mobile Computing\",\"volume\":\"24 1\",\"pages\":\"135-149\"},\"PeriodicalIF\":7.7000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Mobile Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10684049/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Mobile Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10684049/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Given the wide adoption of multimodal sensors (e.g., camera, lidar, radar) by autonomous vehicles (AVs), deep analytics that fuse their outputs for robust perception have become imperative. However, existing fusion methods often make two assumptions that rarely hold in practice: i) similar data distributions for all inputs and ii) constant availability of all sensors. Because, for example, lidars have varying resolutions and radars may fail, such variability often causes significant performance degradation in fusion. To this end, we present t-READi, an adaptive inference system that accommodates the variability of multimodal sensory data and thus enables robust and efficient perception. t-READi identifies variation-sensitive yet structure-specific model parameters; it then adapts only these parameters while keeping the rest intact. t-READi also leverages a cross-modality contrastive learning method to compensate for the loss from missing modalities. Both functions are implemented to maintain compatibility with existing multimodal deep fusion methods. Extensive experiments demonstrate that, compared with status quo approaches under realistic data and modal variations, t-READi not only improves the average inference accuracy by more than 6% but also reduces inference latency by almost 15×, at the cost of only 5% extra memory overhead in the worst case.
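To make the selective-adaptation idea concrete, here is a minimal PyTorch sketch, assuming a generic pretrained fusion model: every parameter is frozen, and only a caller-supplied subset is unfrozen and fine-tuned on data from a new sensor configuration. The selection rule `norm_rule`, the single-tensor `model(x)` signature, and the training loop are illustrative assumptions; the abstract does not specify how t-READi actually identifies its variation-sensitive, structure-specific parameters.

```python
# Illustrative sketch only: adapt a small, named subset of parameters while
# freezing the rest of a pretrained fusion model. The selection rule below is
# a placeholder, NOT the structure-specific criterion used by t-READi.
import torch
import torch.nn as nn


def adapt_subset(model: nn.Module, is_sensitive, loader, epochs: int = 1, lr: float = 1e-4):
    # Freeze everything so the bulk of the fusion model stays intact.
    for p in model.parameters():
        p.requires_grad = False
    # Unfreeze only the parameters flagged as variation-sensitive.
    sensitive = []
    for name, p in model.named_parameters():
        if is_sensitive(name):
            p.requires_grad = True
            sensitive.append(p)
    optimizer = torch.optim.Adam(sensitive, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:  # assumes model(x) -> logits; a real fusion model takes several sensor inputs
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    return model


# Hypothetical selection rule: treat normalization parameters as the adaptable subset.
norm_rule = lambda name: "norm" in name.lower()
```

Updating and storing only the small unfrozen subset per sensor variation keeps the extra memory cost low, which is consistent with the modest overhead reported in the abstract.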
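The cross-modality contrastive learning mentioned in the abstract can be illustrated with a standard InfoNCE-style objective between paired embeddings from two modalities; the function below is that generic formulation under assumed (batch, dim) embedding shapes, not t-READi's exact loss for compensating missing modalities.

```python
# Generic InfoNCE-style cross-modality contrastive loss: embeddings of the same
# scene from two modalities (e.g., lidar and camera) are pulled together, while
# mismatched pairs in the batch are pushed apart.
import torch
import torch.nn.functional as F


def cross_modal_contrastive_loss(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.07):
    # z_a, z_b: (batch, dim) embeddings of the same batch of scenes from two modalities.
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature                    # pairwise cosine similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)  # matching pairs lie on the diagonal
    # Symmetrised cross-entropy over both retrieval directions (A->B and B->A).
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```

Training with such an objective encourages each modality's embedding to carry information about the others, which is one plausible way a fused model could partly compensate when a sensor drops out.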
Source Journal
IEEE Transactions on Mobile Computing (Engineering & Technology - Telecommunications)
CiteScore: 12.90
Self-citation rate: 2.50%
Articles published per year: 403
Average review time: 6.6 months
Journal description: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.