{"title":"Small group pedestrian crossing behaviour prediction using temporal angular 2D skeletal pose","authors":"Hanugra Aulia Sidharta , Berlian Al Kindhi , Eko Mulyanto Yuniarno , Mauridhi Hery Purnomo","doi":"10.1016/j.array.2024.100341","DOIUrl":null,"url":null,"abstract":"<div><p>A pedestrian is classified as a Vulnerable Road User (VRU) because they do not have the protective equipment that would make them fatal if they were involved in an accident. An accident can happen while a pedestrian is on the road, especially when crossing the road. To ensure pedestrian safety, it is necessary to understand and predict pedestrian behaviour when crossing the road. We propose pedestrian intention prediction using a 2D pose estimation approach with temporal angle as a feature. Based on visual observation of the Joint Attention in Autonomous Driving (JAAD) dataset, we found that pedestrians tend to walk together in small groups while waiting to cross, and then this group is disbanded on the opposite side of the road. Thus, we propose to perform prediction with small group of pedestrians, based on pedestrian statistical data, we define a small group of pedestrians as consisting of 4 pedestrians. Another problem raised is 2D pose estimation is processing each pedestrian index individually, which creates ambiguous pedestrian index in consecutive frame. We propose Multi Input Single Output (MISO), which has capabilities to process multiple pedestrians together, and use summation layer at the end of the model to solve the ambiguous pedestrian index problem without performing tracking on each pedestrian. The performance of our proposed model achieves model accuracy of 0.9306 with prediction performance of 0.8317.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100341"},"PeriodicalIF":2.3000,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000079/pdfft?md5=255bf8dee6ebbdca068e698762cee29a&pid=1-s2.0-S2590005624000079-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Array","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2590005624000079","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
A pedestrian is classified as a Vulnerable Road User (VRU) because they do not have the protective equipment that would make them fatal if they were involved in an accident. An accident can happen while a pedestrian is on the road, especially when crossing the road. To ensure pedestrian safety, it is necessary to understand and predict pedestrian behaviour when crossing the road. We propose pedestrian intention prediction using a 2D pose estimation approach with temporal angle as a feature. Based on visual observation of the Joint Attention in Autonomous Driving (JAAD) dataset, we found that pedestrians tend to walk together in small groups while waiting to cross, and then this group is disbanded on the opposite side of the road. Thus, we propose to perform prediction with small group of pedestrians, based on pedestrian statistical data, we define a small group of pedestrians as consisting of 4 pedestrians. Another problem raised is 2D pose estimation is processing each pedestrian index individually, which creates ambiguous pedestrian index in consecutive frame. We propose Multi Input Single Output (MISO), which has capabilities to process multiple pedestrians together, and use summation layer at the end of the model to solve the ambiguous pedestrian index problem without performing tracking on each pedestrian. The performance of our proposed model achieves model accuracy of 0.9306 with prediction performance of 0.8317.