{"title":"Seeing eye drone: a deep learning, vision-based UAV for assisting the visually impaired with mobility","authors":"L. Grewe, Garrett Stevenson","doi":"10.1145/3321408.3321414","DOIUrl":null,"url":null,"abstract":"Seeing Eye Drone assists low-vision persons with environment awareness performing exploration and obstacle detection. The modalities of 3D (stereo) and 2D vision on a drone are compared for this task. Different deep-learning systems are developed including 2D only and 3D+2D networks. Comparisons of retrained networks versus training from scratch are also made and approximately 34,000 samples were collected for training and the resulting SSD CNN architecture is used to determine a user's location and direction of travel. A second network identifies locations of common objects in the scene. The object locations are then compared with the user location/heading and depth data to determine whether they represent obstacles. Obstacles determined to be in the user's region of interest are communicated to the visually-impaired user via Text-to-Speech. Real data from outdoor drone flights that communicate with an Android based application are shown.","PeriodicalId":364264,"journal":{"name":"Proceedings of the ACM Turing Celebration Conference - China","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Turing Celebration Conference - China","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3321408.3321414","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 8
Abstract
The Seeing Eye Drone assists low-vision persons with environment awareness by performing exploration and obstacle detection. The 3D (stereo) and 2D vision modalities on a drone are compared for this task, and several deep-learning systems are developed, including 2D-only and 3D+2D networks. Retrained networks are also compared against networks trained from scratch. Approximately 34,000 samples were collected for training, and the resulting SSD CNN architecture is used to determine a user's location and direction of travel. A second network identifies the locations of common objects in the scene. These object locations are then compared with the user's location/heading and with depth data to determine whether they represent obstacles. Obstacles determined to be in the user's region of interest are communicated to the visually impaired user via Text-to-Speech. Real data from outdoor drone flights communicating with an Android-based application are shown.
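The abstract describes comparing detected object locations against the user's location/heading and depth data to decide which detections count as obstacles in the user's region of interest. The sketch below is a minimal Python illustration of one way that filtering step could work; the `Detection` fields, the cone-shaped region of interest, and all thresholds are assumptions made for illustration, not the paper's published implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    label: str          # object class reported by the common-object network
    bearing_deg: float  # direction to the object, degrees, same frame as the user heading
    depth_m: float      # distance to the object, taken from the stereo (3D) depth data

def obstacles_in_roi(detections: List[Detection],
                     user_heading_deg: float,
                     roi_half_angle_deg: float = 30.0,
                     max_range_m: float = 5.0) -> List[Detection]:
    """Keep detections inside a cone around the user's heading and within range.

    The cone half-angle and range cutoff are illustrative placeholders;
    the paper does not state its actual thresholds.
    """
    obstacles = []
    for d in detections:
        # Smallest signed angular offset between the object's bearing and the heading.
        offset = (d.bearing_deg - user_heading_deg + 180.0) % 360.0 - 180.0
        if abs(offset) <= roi_half_angle_deg and d.depth_m <= max_range_m:
            obstacles.append(d)
    return obstacles

def phrase_for(d: Detection, user_heading_deg: float) -> str:
    """Format a short sentence that could be handed to a Text-to-Speech engine."""
    offset = (d.bearing_deg - user_heading_deg + 180.0) % 360.0 - 180.0
    side = "ahead" if abs(offset) < 10.0 else ("to your right" if offset > 0 else "to your left")
    return f"{d.label} {side}, about {d.depth_m:.0f} meters away"

if __name__ == "__main__":
    # Hypothetical scene: the user is heading at 90 degrees.
    scene = [Detection("bench", bearing_deg=95.0, depth_m=3.2),
             Detection("car", bearing_deg=200.0, depth_m=4.0)]
    for obs in obstacles_in_roi(scene, user_heading_deg=90.0):
        print(phrase_for(obs, user_heading_deg=90.0))  # only the bench is announced
```

In this sketch the geometric test stands in for the paper's comparison of object locations with the user's heading and depth data; the resulting phrase is what would be spoken to the user via Text-to-Speech on the Android application.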