He Zhang, Nicholas J Falletta, Jingyi Xie, Rui Yu, Sooyeon Lee, Syed Masum Billah, John M Carroll
GROUP '25 Companion: Companion Proceedings of the International ACM SIGCHI Conference on Supporting Group Work, 2025, pp. 29-35.
DOI: 10.1145/3688828.3699636
Published: 2025-01-01 (Epub 2025-01-12)
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11727231/pdf/
Citations: 0
Abstract
Enhancing the Travel Experience for People with Visual Impairments through Multimodal Interaction: NaviGPT, A Real-Time AI-Driven Mobile Navigation System.
Assistive technologies for people with visual impairments (PVI) have made significant advancements, particularly with the integration of artificial intelligence (AI) and real-time sensor technologies. However, current solutions often require PVI to switch between multiple apps and tools for tasks like image recognition, navigation, and obstacle detection, which can hinder a seamless and efficient user experience. In this paper, we present NaviGPT, a high-fidelity prototype that integrates LiDAR-based obstacle detection, vibration feedback, and large language model (LLM) responses to provide a comprehensive and real-time navigation aid for PVI. Unlike existing applications such as Be My AI and Seeing AI, NaviGPT combines image recognition and contextual navigation guidance into a single system, offering continuous feedback on the user's surroundings without the need for app-switching. Meanwhile, NaviGPT compensates for the response delays of LLM by using location and sensor data, aiming to provide practical and efficient navigation support for PVI in dynamic environments.
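The abstract notes that NaviGPT masks the LLM's response latency by continuing to serve location- and sensor-driven feedback while the model's answer is pending. The paper excerpt does not give implementation details, so the following is only a minimal sketch of that pattern, with hypothetical names (`fake_llm_guidance`, `OBSTACLE_THRESHOLD_M`) and simulated LiDAR readings standing in for the real pipeline:

```python
import asyncio

# Hypothetical threshold and timings; the paper does not specify them.
OBSTACLE_THRESHOLD_M = 1.5  # distance below which we trigger vibration

async def fake_llm_guidance(scene: str) -> str:
    """Stand-in for a slow LLM call (real round-trips can take seconds)."""
    await asyncio.sleep(0.3)
    return f"Guidance for: {scene}"

async def navigation_tick(lidar_m: float, log: list) -> None:
    """Immediate, local feedback path: no network round-trip needed."""
    if lidar_m < OBSTACLE_THRESHOLD_M:
        log.append(("vibrate", lidar_m))  # haptic warning fires right away

async def navigate(readings, log):
    # Kick off the LLM request once, in the background.
    llm_task = asyncio.create_task(fake_llm_guidance("street crossing"))
    # Keep serving sensor-driven feedback while the LLM response is pending.
    for r in readings:
        await navigation_tick(r, log)
        await asyncio.sleep(0.05)
    # Deliver the verbal guidance whenever it finally arrives.
    log.append(("speak", await llm_task))

log: list = []
asyncio.run(navigate([2.0, 1.2, 0.8, 2.5], log))
```

In this sketch the haptic channel never blocks on the language model: obstacle warnings are emitted from each LiDAR reading as it arrives, and the spoken LLM guidance is appended only once its task resolves, which is one plausible way to keep feedback continuous without app-switching.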