Feature Importance in Pedestrian Intention Prediction: A Context-Aware Review
Mohsen Azarmi, Mahdi Rezaei, He Wang, Ali Arabian
arXiv - EE - Image and Video Processing, 2024-09-11
DOI: arxiv-2409.07645 (https://doi.org/arxiv-2409.07645)
Citation count: 0
Abstract
Recent advances in predicting pedestrian crossing intentions for
autonomous vehicles using computer vision and deep neural networks (DNNs) are
promising. However, the black-box nature of DNNs poses challenges in
understanding how the model works and how input features contribute to final
predictions. This lack of interpretability limits trust in model
performance and hinders informed decisions on feature selection,
representation, and model optimisation, thereby affecting the efficacy of
future research in the field. To address this, we introduce Context-aware
Permutation Feature Importance (CAPFI), a novel approach tailored for
pedestrian intention prediction. CAPFI enables more interpretable and
reliable assessments of feature importance by leveraging subdivided scenario
contexts, mitigating the randomness of feature values through targeted
shuffling. This aims to reduce variance and prevent biased estimates of
importance scores during permutations. We divide the Pedestrian Intention
Estimation (PIE) dataset into 16 comparable context sets, measure the baseline
performance of five distinct neural network architectures for intention
prediction in each context, and assess input feature importance using CAPFI. We
observe nuanced differences among models across various contextual
characteristics. The results reveal the critical role of pedestrian bounding
boxes and ego-vehicle speed in predicting pedestrian intentions and, through
cross-context permutation evaluation, expose potential prediction biases due
to the speed feature. We propose an alternative feature representation that
uses the proximity change rate to represent dynamic pedestrian-vehicle
locomotion, thereby enhancing the contributions of input features to
intention prediction.
These findings underscore the importance of contextual features and their
diversity for developing accurate and robust intent-predictive models.
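
The idea of shuffling a feature only among samples that share a scenario context can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `context_permutation_importance`, the accuracy metric, the two integer context ids, and the toy threshold "model" are all assumptions made for illustration, and real inputs such as bounding-box sequences are reduced here to a flat feature matrix.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return float(np.mean(y_true == y_pred))


def context_permutation_importance(predict, X, y, contexts, n_repeats=10, seed=0):
    """Permutation importance computed separately inside each context.

    predict  : callable mapping an (n, d) array to an (n,) array of labels
    X, y     : feature matrix (n, d) and labels (n,)
    contexts : (n,) array of context ids (hypothetical grouping variable)
    Returns a dict: context id -> (d,) array of mean accuracy drops.
    """
    rng = np.random.default_rng(seed)
    importances = {}
    for ctx in np.unique(contexts):
        idx = np.flatnonzero(contexts == ctx)
        X_ctx, y_ctx = X[idx], y[idx]
        baseline = accuracy(y_ctx, predict(X_ctx))
        drops = np.zeros(X.shape[1])
        for j in range(X.shape[1]):
            for _ in range(n_repeats):
                X_perm = X_ctx.copy()
                # Shuffle feature j only among samples of the same context.
                X_perm[:, j] = rng.permutation(X_ctx[:, j])
                drops[j] += baseline - accuracy(y_ctx, predict(X_perm))
        importances[ctx.item()] = drops / n_repeats
    return importances


# Toy data (an assumption): feature 0 decides the label, feature 1 is noise.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(400, 2))
y = (X[:, 0] > 0).astype(int)
contexts = np.repeat([0, 1], 200)  # two made-up context ids


def toy_predict(features):
    """Stand-in for a trained model: thresholds feature 0 and ignores feature 1."""
    return (features[:, 0] > 0).astype(int)


scores = context_permutation_importance(toy_predict, X, y, contexts)
for ctx, imp in scores.items():
    print(f"context {ctx}: importance per feature = {imp}")
```

In this toy setting the feature the model relies on shows a clear accuracy drop in both contexts, while the ignored feature scores zero. Restricting the shuffle to one context keeps permuted values within the range that context actually exhibits, which is the motivation the abstract gives for reducing variance and bias in the scores.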