A Contrastive-Learning-Based Method for Alert-Scene Categorization

Shaochi Hu, Hanwei Fan, Biao Gao, Huijing Zhao
{"title":"A Contrastive-Learning-Based Method for Alert-Scene Categorization","authors":"Shaochi Hu, Hanwei Fan, Biao Gao, Huijing Zhao","doi":"10.1109/iv51971.2022.9827387","DOIUrl":null,"url":null,"abstract":"Whether it’s a driver warning or an autonomous driving system, ADAS needs to decide when to alert the driver of danger or take over control. This research formulates the problem as an alert-scene categorization one and proposes a method using contrastive learning. Given a front-view video of a driving scene, a set of anchor points is marked by a human driver, where an anchor point indicates that the semantic attribute of the current scene is different from that of the previous one. The anchor frames are then used to generate contrastive image pairs to train a feature encoder and obtain a scene similarity measure, so as to expand the distance of the scenes of different categories in the feature space. Each scene category is explicitly modeled to capture the meta pattern on the distribution of scene similarity values, which is then used to infer scene categories. Experiments are conducted using front-view videos that were collected during driving at a cluttered dynamic campus. The scenes are categorized into no alert, longitudinal alert, and lateral alert. The results are studied at both feature encoding, category modeling, and reasoning aspects. By comparing precision with two full supervised end-to-end baseline models, the proposed method demonstrates competitive or superior performance. However, it remains still questions: how to generate ground truth data and how to evaluate performance in ambiguous situations, which leads to future works.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Intelligent Vehicles Symposium (IV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/iv51971.2022.9827387","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Whether it issues a driver warning or hands control to an autonomous driving system, an ADAS must decide when to alert the driver of danger or take over control. This research formulates the problem as one of alert-scene categorization and proposes a method based on contrastive learning. Given a front-view video of a driving scene, a human driver marks a set of anchor points, where an anchor point indicates that the semantic attribute of the current scene differs from that of the previous one. The anchor frames are then used to generate contrastive image pairs, which train a feature encoder and yield a scene similarity measure, so as to enlarge the distance between scenes of different categories in the feature space. Each scene category is explicitly modeled to capture the meta pattern in the distribution of scene similarity values, which is then used to infer scene categories. Experiments are conducted on front-view videos collected while driving on a cluttered, dynamic campus. The scenes are categorized into no alert, longitudinal alert, and lateral alert. The results are analyzed from the feature encoding, category modeling, and reasoning perspectives. Compared on precision with two fully supervised end-to-end baseline models, the proposed method demonstrates competitive or superior performance. Open questions remain, however: how to generate ground-truth data and how to evaluate performance in ambiguous situations, which points to future work.
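The abstract outlines the pipeline only at a high level: human-marked anchor frames seed contrastive image pairs, a feature encoder trained on those pairs yields a scene similarity measure, and per-category models over the similarity distribution drive inference. The paper gives no implementation details here, so the sketch below is only one plausible reading: the pair-construction rule, the ResNet-18 backbone, the margin-based loss, and the Gaussian category model (`make_pairs`, `Encoder`, `contrastive_loss`, `infer_category`) are all assumptions for illustration, not the authors' code.

```python
# A minimal sketch, assuming a standard pairwise-contrastive setup;
# every design choice below is an assumption, not the paper's method.
import math
import torch
import torch.nn as nn
import torchvision.models as models

def make_pairs(frames, anchors):
    """Pair frames within one anchor-delimited segment as similar (y=1)
    and frames straddling an anchor, i.e. a semantic change point, as
    dissimilar (y=0)."""
    bounds = [0] + sorted(anchors) + [len(frames)]
    pairs = []
    for i in range(len(bounds) - 1):
        a, b = bounds[i], bounds[i + 1]
        if b - a >= 2:
            pairs.append((frames[a], frames[b - 1], 1))   # same segment
        if b < len(frames):
            pairs.append((frames[b - 1], frames[b], 0))   # across anchor
    return pairs

class Encoder(nn.Module):
    """ResNet-18 trunk (an assumed backbone) with an L2-normalized
    embedding head."""
    def __init__(self, dim=128):
        super().__init__()
        net = models.resnet18(weights=None)
        net.fc = nn.Linear(net.fc.in_features, dim)
        self.net = net

    def forward(self, x):
        return nn.functional.normalize(self.net(x), dim=1)

def contrastive_loss(z1, z2, y, margin=1.0):
    """Classic margin-based contrastive loss: pull similar pairs
    together, push dissimilar pairs at least `margin` apart, which
    enlarges inter-category distances in the feature space."""
    d = torch.norm(z1 - z2, dim=1)
    return (y * d.pow(2) +
            (1 - y) * torch.clamp(margin - d, min=0).pow(2)).mean()

def infer_category(sim_value, category_stats):
    """Given per-category (mean, std) of similarity values from training
    data, pick the category with the highest Gaussian log-likelihood.
    A simple stand-in for the paper's 'meta pattern' modeling."""
    def log_lik(v, mu, sigma):
        return -0.5 * ((v - mu) / sigma) ** 2 - math.log(sigma)
    return max(category_stats,
               key=lambda c: log_lik(sim_value, *category_stats[c]))
```

In this reading, the margin loss directly realizes the abstract's stated goal of enlarging the distance between scenes of different categories, and `infer_category` would consume similarity values such as distances between a new frame's embedding and category exemplars.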