UnMoDE: Uncertainty Modeling for Driver Gaze Estimation via Feature Disentanglement

IF 8.4 · Zone 1, Engineering and Technology · JCR Q1, ENGINEERING, CIVIL
Daosong Hu;Mingyue Cui;Kai Huang
DOI: 10.1109/TITS.2025.3556553
Journal: IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 7, pp. 10612-10622
Publication date: 2025-04-14
Publication type: Journal Article
Open access: No
Source: https://ieeexplore.ieee.org/document/10964529/
Citations: 0

Abstract

UnMoDE: Uncertainty Modeling for Driver Gaze Estimation via Feature Disentanglement
Gaze estimation can be used for assessing the attention level of drivers. Current works predominantly focus on enhancing model accuracy, often overlooking the influence of input sample and label uncertainty. In this paper, we propose a framework for uncertainty modeling in driver gaze estimation via feature disentanglement, referred to as UnMoDE. Our approach begins by extracting facial information into distinct feature spaces using an asymmetric dual-branch encoder to obtain gaze features. Subsequently, a multi-layer perceptron (MLP) is employed to project gaze features and labels into an embedding space, representing them as Gaussian distributions. The uncertainty is described using a covariance matrix. Random sampling is applied to derive samples from the gaze embedding distribution to estimate the most probable embedding representation. This estimated representation is then used to regress the gaze direction and is projected back into the gaze feature space, along with identity information, to facilitate facial reconstruction. Extensive experimental evaluations demonstrate that UnMoDE significantly outperforms baseline and state-of-the-art methods on the latest benchmark datasets collected for drivers, particularly in reducing the number of samples with significant errors.
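The Gaussian embedding and sampling steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract states only that features are represented as Gaussian distributions, that uncertainty is described by a covariance matrix, and that random sampling is used to estimate the most probable embedding. The diagonal covariance, the log-variance parameterization, the use of a sample average as the point estimate, the trace as an uncertainty score, and every function and variable name below are assumptions made for illustration.

```python
import math
import random


def sample_embedding(mean, log_var, rng):
    """Draw one sample z = mean + sigma * eps with eps ~ N(0, I).

    Assumption: the covariance is diagonal and stored as log-variances.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def estimate_embedding(mean, log_var, num_samples=1000, seed=0):
    """Average Monte Carlo samples to get a point estimate of the embedding.

    Assumption: the paper's estimator may differ; averaging is a stand-in.
    """
    rng = random.Random(seed)
    total = [0.0] * len(mean)
    for _ in range(num_samples):
        z = sample_embedding(mean, log_var, rng)
        total = [t + zi for t, zi in zip(total, z)]
    return [t / num_samples for t in total]


def uncertainty(log_var):
    """Scalar uncertainty score: trace of the (assumed diagonal) covariance."""
    return sum(math.exp(lv) for lv in log_var)


# Hypothetical values standing in for the output of an MLP head.
mean = [0.2, -0.1, 0.5]
confident_log_var = [-4.0, -4.0, -4.0]  # small variance: a clear sample
ambiguous_log_var = [0.0, 0.0, 0.0]     # unit variance: a noisy sample

estimate = estimate_embedding(mean, confident_log_var)
print("estimated embedding:", estimate)
print("uncertainty (confident):", uncertainty(confident_log_var))
print("uncertainty (ambiguous):", uncertainty(ambiguous_log_var))
```

In this sketch, a sample with larger predicted variance receives a larger uncertainty score, which is one plausible way a model could flag inputs likely to produce large gaze errors.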
Source journal
IEEE Transactions on Intelligent Transportation Systems
Category: Engineering and Technology, Engineering: Electrical and Electronic
CiteScore: 14.80
Self-citation rate: 12.90%
Articles published: 1872
Review time: 7.5 months
Journal scope: Covers the theoretical, experimental and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). Intelligent Transportation Systems are defined as those systems utilizing synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes the promotion, consolidation and coordination of ITS technical activities among IEEE entities, and providing a focus for cooperative activities, both internally and externally.