Sonification of images for the visually impaired using a multi-level approach

M. Banf, V. Blanz
DOI: 10.1145/2459236.2459264 (https://doi.org/10.1145/2459236.2459264)
Venue: International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems
Published: 2013-03-07
Citations: 27

Abstract

This paper presents a system that strives to give visually impaired persons direct perceptual access to images via an acoustic signal. The user explores the image actively on a touch screen and receives auditory feedback about the image content at the current position. The design of such a system involves two major challenges: what is the most useful and relevant image information, and how can as much information as possible be captured in an audio signal. We address both problems, and propose a general approach that combines low-level information, such as color, edges, and roughness, with mid- and high-level information obtained from Machine Learning algorithms. This includes object recognition and the classification of regions into the categories "man made" versus "natural". We argue that this multi-level approach gives users direct access to what is where in the image, yet it still exploits the potential of recent developments in Computer Vision and Machine Learning.
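
As an illustration of the low-level layer described in the abstract, the following minimal Python sketch maps the color and edge strength at a touch position to audio parameters. The mapping itself (hue to pitch, brightness to loudness, edge strength to a noise mix), the function names `local_features` and `sonify`, and the `base_hz` parameter are all assumptions made for this example; the abstract does not specify how the authors encode these features, and the mid- and high-level layers (object recognition, "man made" versus "natural" classification) are not covered here.

```python
import colorsys
import math


def local_features(image, x, y):
    """Return (hue, saturation, value, edge_strength) at pixel (x, y).

    `image` is a list of rows; each row is a list of (r, g, b) tuples
    with channels in 0..255. All outputs lie in the range 0..1.
    """
    height, width = len(image), len(image[0])
    r, g, b = image[y][x]
    hue, sat, val = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)

    def gray(px, py):
        # Clamp coordinates so border pixels reuse their nearest neighbor.
        px = min(max(px, 0), width - 1)
        py = min(max(py, 0), height - 1)
        pr, pg, pb = image[py][px]
        return (pr + pg + pb) / (3 * 255)

    # Central-difference gradient as a crude stand-in for an edge detector.
    gx = gray(x + 1, y) - gray(x - 1, y)
    gy = gray(x, y + 1) - gray(x, y - 1)
    edge = min(1.0, math.hypot(gx, gy))
    return hue, sat, val, edge


def sonify(image, x, y, base_hz=220.0):
    """Map local image features to hypothetical audio parameters."""
    hue, sat, val, edge = local_features(image, x, y)
    return {
        # Assumed mapping: hue sweeps two octaves above base_hz.
        "frequency_hz": base_hz * 2 ** (2 * hue),
        # Assumed mapping: brighter pixels sound louder.
        "amplitude": val,
        # Assumed mapping: stronger edges add more noise to the tone.
        "noise_mix": edge,
    }


if __name__ == "__main__":
    # Tiny 3x3 test image: two black columns next to one white column.
    row = [(0, 0, 0), (0, 0, 0), (255, 255, 255)]
    image = [list(row) for _ in range(3)]
    print(sonify(image, 1, 1))
```

In an interactive setting, `sonify` would be called each time the finger moves on the touch screen, and the returned parameters would drive a synthesizer. That audio back end is left out here because the abstract gives no details about it.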