Proceedings of the 2nd Workshop on Multimedia for Accessible Human Computer Interfaces: Latest Publications

A Refreshable Tactile Display Effectively Supports Cognitive Mapping Followed by Orientation and Mobility Tasks: A Comparative Multi-modal Study Involving Blind and Low-vision Participants
Authors: L. Brayda, Fabrizio Leo, Caterina Baccelliere, Claudia Vigini, E. Cocchi
DOI: 10.1145/3347319.3356840 · Published: 2019-10-15
Abstract: We investigate the role of a refreshable tactile display in supporting the learning of cognitive maps, followed by actual exploration of a real environment that matches the map. We test both blind and low-vision persons and compare displaying maps in three information modes: with a pin-array matrix, with raised paper, and with verbal descriptions. We find that the pin matrix leads to a better way of externalizing a cognitive map and reduces the performance gap between blind and low-vision people. The entire evaluation is performed by participants autonomously and suggests that refreshable tactile displays may be used to train blind persons in orientation and mobility tasks.
Citations: 3
HaptWrap: Augmenting Non-Visual Travel via Visual-to-Tactile Mapping of Objects in Motion
Authors: Bryan Duarte, T. McDaniel, Abhik Chowdhury, Sana Gill, S. Panchanathan
DOI: 10.1145/3347319.3356835 · Published: 2019-10-15
Abstract: Access to real-time situational information at a distance, including the relative position and motion of surrounding objects, is essential for an individual to travel safely and independently. For blind and low-vision travelers, access to critical environmental information is unattainable if it is positioned beyond the reach of their preferred mobility aid or outside their path of travel. Due to its cost and versatility, and the dynamic information that can be aggregated through its use, the long white cane remains the most widely used mobility aid for non-visual travelers. Physical characteristics such as texture, slope, and position can be identified with the long white cane, but only when the traveler is in close proximity to an object. In this work, we introduce a wearable technology that augments non-visual travel methods by communicating spatial information at a distance. We propose a vibrotactile device, the HaptWrap, equipped with vibration motors capable of communicating an object's position relative to the user's orientation, as well as its relative variations in position as the object moves about the user. An experiment supports the use of haptics to represent objects in motion around an individual as a substitute modality for vision.
Citations: 4
A minimal illustrative sketch of one possible bearing-to-motor mapping follows this entry.
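The abstract describes vibration motors that convey an object's position relative to the wearer's orientation but gives no mapping details. The sketch below assumes a hypothetical band of evenly spaced motors, a relative bearing in degrees, and a simple nearest-motor scheme with distance-scaled intensity; the motor count, intensity law, and function names are illustrative and not taken from the paper.

```python
def bearing_to_motor(bearing_deg: float, num_motors: int = 8) -> int:
    """Map a relative bearing (0 deg = straight ahead, clockwise) to the
    index of the nearest motor on an evenly spaced circular band."""
    sector = 360.0 / num_motors
    return int(round((bearing_deg % 360.0) / sector)) % num_motors

def distance_to_intensity(distance_m: float, max_range_m: float = 10.0) -> float:
    """Scale vibration intensity into [0, 1]: closer objects vibrate harder;
    objects beyond max_range_m produce no vibration."""
    if distance_m >= max_range_m:
        return 0.0
    return 1.0 - distance_m / max_range_m

def haptic_command(bearing_deg: float, distance_m: float, num_motors: int = 8):
    """Return (motor_index, intensity) for one tracked object."""
    return bearing_to_motor(bearing_deg, num_motors), distance_to_intensity(distance_m)

# Example: an object 30 degrees to the wearer's right, 4 m away.
print(haptic_command(30.0, 4.0))   # -> (1, 0.6) with 8 motors
```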
Emotion Recognition with Simulated Phosphene Vision
Authors: Caroline Bollen, R. van Wezel, M. V. van Gerven, Yağmur Güçlütürk
DOI: 10.1145/3347319.3356836 · Published: 2019-10-15
Abstract: Electrical stimulation of the retina, optic nerve, or cortex elicits visual sensations known as phosphenes. This allows visual prosthetics to partially restore vision by representing the visual field as a phosphene pattern. Since the resolution and performance of visual prostheses are limited, only a fraction of the information in a visual scene can be represented by phosphenes. Here, we propose a simple yet powerful image processing strategy for recognizing facial expressions with prosthetic vision, supporting communication and social interaction in the blind. A psychophysical study was conducted to investigate whether a landmark-based representation of facial expressions could improve emotion detection with prosthetic vision. Our approach was compared to edge detection, which is commonly used in current retinal prosthetic devices. Additionally, the relationship between the number of phosphenes and the accuracy of emotion recognition was studied. The landmark model improved the accuracy of emotion recognition regardless of the number of phosphenes. Secondly, accuracy improved with an increasing number of phosphenes up to a saturation point, and performance saturated with fewer phosphenes under the landmark model than under edge detection. These results suggest that landmark-based image pre-processing allows for a more efficient use of the limited information that can be stored in a phosphene pattern, providing a route towards a more meaningful and higher-quality perceptual experience for subjects with prosthetic vision.
Citations: 2
An illustrative landmark-to-phosphene rendering sketch follows this entry.
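The landmark-based representation is only named in the abstract, not specified. The sketch below shows one plausible way to render pre-extracted 2D facial landmarks onto a coarse phosphene grid as Gaussian blobs; the grid size, blob width, and the assumption that landmarks come from an external detector are illustrative, not the authors' design.

```python
import numpy as np

def landmarks_to_phosphenes(landmarks, image_size=(480, 640),
                            grid_shape=(32, 32), sigma=1.0):
    """Render 2D facial landmarks as a phosphene intensity grid.

    landmarks : array of shape (K, 2) with (x, y) pixel coordinates.
    grid_shape: number of simulated phosphenes (rows, cols).
    Returns a (rows, cols) float array in [0, 1].
    """
    h, w = image_size
    rows, cols = grid_shape
    grid = np.zeros(grid_shape, dtype=float)
    yy, xx = np.mgrid[0:rows, 0:cols]
    for x, y in landmarks:
        # Map pixel coordinates to phosphene-grid coordinates.
        gx, gy = x / w * (cols - 1), y / h * (rows - 1)
        # Each landmark activates nearby phosphenes with a Gaussian profile.
        grid += np.exp(-((xx - gx) ** 2 + (yy - gy) ** 2) / (2 * sigma ** 2))
    return np.clip(grid, 0.0, 1.0)

# Example: three hypothetical landmarks (two eye corners and a mouth center).
demo = landmarks_to_phosphenes(np.array([[200, 180], [440, 180], [320, 360]]))
print(demo.shape, demo.max())  # (32, 32) 1.0
```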
Gaze Detection and Prediction Using Data from Infrared Cameras
Authors: Yingxuan Zhu, Wenyou Sun, T. Yuan, Jian Li
DOI: 10.1145/3347319.3356838 · Published: 2019-10-15
Abstract: Knowing the point of gaze on a screen can benefit a variety of applications and improve user experiences. Some electronic devices with infrared cameras can generate a 3D point cloud for user identification. We propose a paradigm that uses the 3D point cloud and eye images for gaze detection and prediction. Our method fuses the 3D point cloud with eye images using image registration. We develop a cost function to detect the sagittal plane from point-cloud data and reconstruct a symmetric face from that plane; the symmetric face data increase the accuracy of gaze detection. We use long short-term memory models to track head and eye movement and predict the next point of gaze. Our method utilizes the existing hardware setup and provides options to improve user experiences.
Citations: 0
An illustrative sagittal-plane symmetry-cost sketch follows this entry.
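The abstract mentions a cost function for detecting the sagittal plane from point-cloud data without defining it. Below is a hedged sketch of one plausible symmetry cost: reflect the cloud across a candidate plane and score the mean nearest-neighbor distance to the original points (via SciPy's cKDTree). It is not claimed to be the paper's actual cost function.

```python
import numpy as np
from scipy.spatial import cKDTree

def sagittal_symmetry_cost(points, plane_normal, plane_point):
    """Symmetry cost of a candidate sagittal plane for a 3D face point cloud.

    Reflect every point across the plane and measure how far each reflected
    point lies from its nearest original point; a true plane of facial
    symmetry yields a low mean distance.
    """
    n = np.asarray(plane_normal, dtype=float)
    n /= np.linalg.norm(n)
    d = (points - plane_point) @ n            # signed distances to the plane
    reflected = points - 2.0 * d[:, None] * n
    tree = cKDTree(points)
    nearest, _ = tree.query(reflected)
    return float(nearest.mean())

# Example: a synthetic cloud symmetric about the x = 0 plane scores ~0.
half = np.random.rand(200, 3)
cloud = np.vstack([half, half * np.array([-1.0, 1.0, 1.0])])
print(sagittal_symmetry_cost(cloud, plane_normal=[1, 0, 0], plane_point=[0, 0, 0]))
```

The plane minimizing such a cost could then be searched over candidate normals and offsets; how the paper performs that search is not stated in the abstract.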
Continuous Sign Language Recognition Based on Pseudo-supervised Learning
Authors: Xiankun Pei, Dan Guo, Ye Zhao
DOI: 10.1145/3347319.3356837 · Published: 2019-10-15
Abstract: Continuous sign language recognition is challenging because the ordered words have no exact temporal locations in the video. To address this problem, we propose a method based on pseudo-supervised learning. First, we use a 3D residual convolutional network (3D-ResNet) pre-trained on the UCF101 dataset to extract visual features. Second, we employ a sequence model with connectionist temporal classification (CTC) loss to learn the mapping between the visual features and sentence-level labels, which can be used to generate clip-level pseudo-labels. Since the CTC objective function has limited effect on visual features extracted by the early 3D-ResNet, we fine-tune the 3D-ResNet on the video clips and their clip-level pseudo-labels to obtain better feature representations. The feature extractor and the sequence model are optimized alternately with the CTC loss. The effectiveness of the proposed method is verified on the large RWTH-PHOENIX-Weather-2014 dataset.
Citations: 5
An illustrative sequence-model-with-CTC sketch follows this entry.
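The abstract states that a sequence model with CTC loss is trained on clip-level visual features. The PyTorch sketch below is a generic instance of that setup (a BiLSTM over pre-extracted features with nn.CTCLoss); the feature dimension, vocabulary size, and architecture details are illustrative, as the paper's exact configuration is not given in the abstract.

```python
import torch
import torch.nn as nn

class CTCSequenceModel(nn.Module):
    """Generic sequence model over pre-extracted clip features (e.g. from a
    3D-ResNet), trained with CTC against sentence-level gloss labels."""
    def __init__(self, feat_dim=512, hidden=256, vocab_size=1000):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, vocab_size + 1)  # +1 for the CTC blank

    def forward(self, clip_feats):                 # (B, T, feat_dim)
        out, _ = self.rnn(clip_feats)
        return self.fc(out).log_softmax(dim=-1)    # (B, T, vocab+1)

model = CTCSequenceModel()
ctc = nn.CTCLoss(blank=0, zero_infinity=True)

feats = torch.randn(2, 40, 512)                    # 2 videos, 40 clips each
targets = torch.randint(1, 1001, (2, 12))          # gloss label sequences
log_probs = model(feats).permute(1, 0, 2)          # CTCLoss expects (T, B, C)
loss = ctc(log_probs, targets,
           input_lengths=torch.full((2,), 40, dtype=torch.long),
           target_lengths=torch.full((2,), 12, dtype=torch.long))
loss.backward()
print(float(loss))
```

In the pseudo-supervised loop the abstract describes, per-clip predictions from such a model could serve as clip-level pseudo-labels for fine-tuning the 3D-ResNet, after which the sequence model is retrained; that alternation is summarized in the abstract but not shown here.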
Semantic Enhanced Encoder-Decoder Network (SEN) for Video Captioning
Authors: Yuling Gui, Dan Guo, Ye Zhao
DOI: 10.1145/3347319.3356839 · Published: 2019-10-15
Abstract: Video captioning is a challenging problem spanning neural networks, computer vision, and natural language processing. It aims to translate a given video into a sequence of words that can be understood by humans. The dynamic information in videos and the complexity of language make this task difficult. This paper proposes a semantic-enhanced encoder-decoder network to tackle the problem. To exploit a richer variety of video information, it implements a three-path fusion strategy on the encoder side that combines complementary features. In the decoding stage, the model adopts an attention mechanism to weigh the different contributions of the fused features, so the video information is well exploited on both the encoder and decoder sides. Furthermore, we use reinforcement learning, computing rewards from a semantically designed measure. Experimental results on the Microsoft Video Description Corpus (MSVD) dataset show the effectiveness of the proposed approach.
Citations: 2
An illustrative attention-over-fused-features sketch follows this entry.
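The three-path fusion and the exact attention design are not detailed in the abstract. The sketch below is a generic additive-attention module over a small set of fused feature paths, of the kind a caption decoder could use to weigh their contributions at each step; dimensions and naming are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Generic additive attention: the current decoder state weighs a set of
    encoder feature vectors (e.g. fused features from several paths)."""
    def __init__(self, feat_dim, dec_dim, attn_dim=256):
        super().__init__()
        self.w_feat = nn.Linear(feat_dim, attn_dim)
        self.w_dec = nn.Linear(dec_dim, attn_dim)
        self.v = nn.Linear(attn_dim, 1)

    def forward(self, feats, dec_state):
        # feats: (B, P, feat_dim) feature vectors from P paths/segments
        # dec_state: (B, dec_dim) current decoder hidden state
        scores = self.v(torch.tanh(self.w_feat(feats) +
                                   self.w_dec(dec_state).unsqueeze(1)))  # (B, P, 1)
        weights = torch.softmax(scores, dim=1)
        context = (weights * feats).sum(dim=1)     # (B, feat_dim)
        return context, weights.squeeze(-1)

# Example: three hypothetical feature paths (appearance, motion, semantics).
attn = AdditiveAttention(feat_dim=512, dec_dim=512)
context, w = attn(torch.randn(4, 3, 512), torch.randn(4, 512))
print(context.shape, w.shape)   # torch.Size([4, 512]) torch.Size([4, 3])
```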
Proceedings of the 2nd Workshop on Multimedia for Accessible Human Computer Interfaces
DOI: 10.1145/3347319
Citations: 0