UnMoDE: Uncertainty Modeling for Driver Gaze Estimation via Feature Disentanglement
Authors: Daosong Hu; Mingyue Cui; Kai Huang
Journal: IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 7, pp. 10612-10622 (Journal Article)
Published: 2025-04-14
DOI: 10.1109/TITS.2025.3556553
URL: https://ieeexplore.ieee.org/document/10964529/
Impact factor: 8.4 (JCR Q1, Engineering, Civil)
Gaze estimation can be used to assess a driver's attention level. Existing work focuses predominantly on improving model accuracy and often overlooks the influence of uncertainty in input samples and labels. In this paper, we propose UnMoDE, a framework for uncertainty modeling in driver gaze estimation via feature disentanglement. The approach first uses an asymmetric dual-branch encoder to separate facial information into distinct feature spaces and obtain gaze features. A multi-layer perceptron (MLP) then projects the gaze features and labels into an embedding space, where each is represented as a Gaussian distribution whose covariance matrix describes the uncertainty. Samples drawn at random from the gaze embedding distribution are used to estimate the most probable embedding representation. This estimate is used to regress the gaze direction and is also projected back into the gaze feature space, where it is combined with identity information to support facial reconstruction. Extensive experiments show that UnMoDE significantly outperforms baseline and state-of-the-art methods on the latest driver benchmark datasets, particularly in reducing the number of samples with large errors.
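The embedding and sampling steps in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a diagonal covariance, replaces the MLP with single linear layers, and estimates the embedding by averaging random samples. All names, weights, and dimensions (`gaussian_embedding`, `sample_embedding`, `w_mu`, `w_logvar`, `w_gaze`, the 3-d feature) are hypothetical and chosen only for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def gaussian_embedding(feature, w_mu, b_mu, w_logvar, b_logvar):
    """Map a gaze feature to the mean and diagonal variance of a Gaussian.

    Assumption: the covariance is diagonal and predicted as a log-variance,
    so exponentiating keeps every variance strictly positive.
    """
    mu = linear(feature, w_mu, b_mu)
    var = [math.exp(v) for v in linear(feature, w_logvar, b_logvar)]
    return mu, var


def sample_embedding(mu, var, n_samples, rng):
    """Draw samples z = mu + sigma * eps and return their average.

    Assumption: averaging stands in for "estimating the most probable
    embedding"; the paper may use a different estimator.
    """
    total = [0.0] * len(mu)
    for _ in range(n_samples):
        for i, (m, v) in enumerate(zip(mu, var)):
            total[i] += m + math.sqrt(v) * rng.gauss(0.0, 1.0)
    return [t / n_samples for t in total]


rng = random.Random(0)                       # fixed seed for repeatability
feature = [0.5, -1.0, 2.0]                   # hypothetical 3-d gaze feature
w_mu = [[0.2, 0.0, 0.1], [0.0, 0.3, -0.1]]   # hypothetical mean head
b_mu = [0.0, 0.1]
w_logvar = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]]  # hypothetical log-variance head
b_logvar = [-2.0, -2.0]
w_gaze = [[1.0, 0.0], [0.0, 1.0]]            # hypothetical regression head
b_gaze = [0.0, 0.0]

mu, var = gaussian_embedding(feature, w_mu, b_mu, w_logvar, b_logvar)
z_hat = sample_embedding(mu, var, n_samples=2000, rng=rng)
gaze = linear(z_hat, w_gaze, b_gaze)         # two outputs, e.g. pitch and yaw
print("mean:", mu, "variance:", var, "gaze:", gaze)
```

In this sketch a larger predicted variance spreads the samples more widely around the mean, which is one simple way an uncertain input or label can be made to carry less weight. A full covariance matrix, as mentioned in the abstract, would require sampling with a matrix factor (for example a Cholesky factor) instead of per-dimension standard deviations.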
About the journal:
The journal covers the theoretical, experimental, and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). ITS are defined as systems that use synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes promoting, consolidating, and coordinating ITS technical activities among IEEE entities, and providing a focus for cooperative activities, both internal and external.