Towards the future of pedestrian–AV interaction: Human perception vs. LLM insights on Smart Pole Interaction Unit in shared spaces
Vishal Chauhan, Anubhav, Chia-Ming Chang, Xiang Su, Jin Nakazato, Ehsan Javanmardi, Alex Orsholits, Takeo Igarashi, Kantaro Fujiwara, Manabu Tsukada
International Journal of Human-Computer Studies, vol. 205, Article 103628 (published 2025-09-12)
DOI: 10.1016/j.ijhcs.2025.103628
URL: https://www.sciencedirect.com/science/article/pii/S1071581925001855
Impact factor: 5.1; JCR: Q1 (Computer Science, Cybernetics)
Citations: 0
Abstract
As autonomous vehicles (AVs) reshape urban mobility, effective communication between pedestrians and self-driving vehicles has become a critical safety imperative. This work investigates Smart Pole Interaction Units (SPIUs) as external human–machine interfaces (eHMIs) in shared spaces and introduces an approach to enhance pedestrian–AV interactions. To provide subjective evidence on SPIU usability, we conduct a group design study (“Humans”) involving 25 participants (aged 18–40). We evaluate user preferences and interaction patterns using group discussion materials, finding that 90% of the participants strongly prefer real-time multi-AV interactions facilitated by an SPIU over conventional eHMI systems, in which a pedestrian must look at multiple AVs individually. Participants also emphasize inclusive design through multi-sensory communication channels (visual, auditory, and tactile signals) that address the needs of vulnerable road users (VRUs), including those with impairments. To complement these non-expert, real-world insights, we employ three leading Large Language Models (LLMs), namely ChatGPT-4, Gemini-Pro, and Claude 3.5 Sonnet, as “experts” because of their extensive training data. Leveraging the multimodal vision-language capabilities of these LLMs, we pose the same questions (text and images) used in the human discussions to generate text responses for pedestrian–AV interaction scenarios. The most frequent words are then extracted from the LLM responses and from the recorded human group discussions, and a keyword frequency analysis is performed for both groups across three categories: Context, Safety, and Important. Our findings indicate that LLMs use safety-related keywords 30% more frequently than human participants, suggesting a more structured, safety-centric approach. Among the LLMs, ChatGPT-4 shows the best response latency, Claude aligns more closely with human responses, and Gemini-Pro provides structured and contextually relevant insights. Our results from “Humans” and “LLMs” establish the SPIU as a promising system for facilitating trust-building and safety-ensuring interactions among pedestrians, AVs, and delivery robots. Integrating diverse stakeholder feedback, we propose a prototype SPIU design to advance pedestrian–AV interactions in shared urban spaces, positioning SPIUs as crucial infrastructure hubs for safe and trustworthy navigation.
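The keyword frequency comparison described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the category word lists, the two sample sentences, and the function names are hypothetical and are not taken from the paper, whose actual lexicons and preprocessing are not given in this abstract. The sketch only shows the general idea of counting category keywords per token and comparing two sources.

```python
import re
from collections import Counter

# Hypothetical category lexicons (NOT the paper's keyword lists).
CATEGORIES = {
    "Context": {"pedestrian", "vehicle", "pole", "crossing"},
    "Safety": {"safe", "safety", "risk", "warning"},
    "Important": {"trust", "clear", "accessible"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def category_rates(text):
    """Return, per category, the share of tokens that are category keywords."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    rates = {}
    for name, words in CATEGORIES.items():
        hits = sum(counts[w] for w in words)
        rates[name] = hits / total if total else 0.0
    return rates


def relative_increase(baseline, other):
    """Relative change of `other` over `baseline`, e.g. 0.3 means 30% more."""
    if baseline == 0:
        raise ValueError("baseline rate is zero; relative increase is undefined")
    return (other - baseline) / baseline


# Made-up sample utterances, used only to exercise the functions.
human_text = "The pole makes the crossing feel safe"
llm_text = "Safety warning: the pole should reduce risk and build trust"

human_rates = category_rates(human_text)
llm_rates = category_rates(llm_text)
print("Human rates:", human_rates)
print("LLM rates:", llm_rates)
print("Safety increase:", relative_increase(human_rates["Safety"], llm_rates["Safety"]))
```

Normalising by token count, as assumed here, keeps a longer transcript from dominating the comparison simply because it contains more words; whether the study normalised in this way is not stated in the abstract.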
About the journal:
The International Journal of Human-Computer Studies publishes original research over the whole spectrum of work relevant to the theory and practice of innovative interactive systems. The journal is inherently interdisciplinary, covering research in computing, artificial intelligence, psychology, linguistics, communication, design, engineering, and social organization, which is relevant to the design, analysis, evaluation and application of innovative interactive systems. Papers at the boundaries of these disciplines are especially welcome, as it is our view that interdisciplinary approaches are needed for producing theoretical insights in this complex area and for effective deployment of innovative technologies in concrete user communities.
Research areas relevant to the journal include, but are not limited to:
• Innovative interaction techniques
• Multimodal interaction
• Speech interaction
• Graphic interaction
• Natural language interaction
• Interaction in mobile and embedded systems
• Interface design and evaluation methodologies
• Design and evaluation of innovative interactive systems
• User interface prototyping and management systems
• Ubiquitous computing
• Wearable computers
• Pervasive computing
• Affective computing
• Empirical studies of user behaviour
• Empirical studies of programming and software engineering
• Computer supported cooperative work
• Computer mediated communication
• Virtual reality
• Mixed and augmented reality
• Intelligent user interfaces
• Presence
...