Sonification of images for the visually impaired using a multi-level approach

International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems
Pub Date: 2013-03-07
DOI: 10.1145/2459236.2459264

M. Banf, V. Blanz


Citations: 27


Sonification of images for the visually impaired using a multi-level approach

This paper presents a system that strives to give visually impaired persons direct perceptual access to images via an acoustic signal. The user explores the image actively on a touch screen and receives auditory feedback about the image content at the current position. The design of such a system involves two major challenges: what is the most useful and relevant image information, and how can as much information as possible be captured in an audio signal. We address both problems, and propose a general approach that combines low-level information, such as color, edges, and roughness, with mid- and high-level information obtained from Machine Learning algorithms. This includes object recognition and the classification of regions into the categories "man made" versus "natural". We argue that this multi-level approach gives users direct access to what is where in the image, yet it still exploits the potential of recent developments in Computer Vision and Machine Learning.
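The low-level part of this idea, auditory feedback about the color under the user's finger, can be illustrated with a minimal sketch. The abstract does not say how image features are mapped to sound, so everything here is an assumption made for illustration: the names `color_to_tone` and `feedback_at`, the hue-to-pitch and brightness-to-loudness mapping, and the constants `F_MIN`, `F_MAX`, and `SAMPLE_RATE` are hypothetical and are not the authors' method.

```python
import colorsys
import math

# Hypothetical mapping constants; the abstract does not specify any of these.
F_MIN = 220.0        # Hz, tone produced for hue 0.0
F_MAX = 880.0        # Hz, upper bound approached as hue nears 1.0
SAMPLE_RATE = 8000   # samples per second


def color_to_tone(rgb):
    """Map an (r, g, b) pixel with 0-255 channels to (frequency_hz, amplitude)."""
    r, g, b = (c / 255.0 for c in rgb)
    hue, _saturation, value = colorsys.rgb_to_hsv(r, g, b)
    # Interpolate on a logarithmic scale so equal hue steps give equal pitch steps.
    frequency = F_MIN * (F_MAX / F_MIN) ** hue
    amplitude = value  # assumed mapping: brighter pixels sound louder
    return frequency, amplitude


def feedback_at(image, x, y, duration_ms=50):
    """Return sine-wave samples for the pixel touched at column x, row y."""
    frequency, amplitude = color_to_tone(image[y][x])
    n = SAMPLE_RATE * duration_ms // 1000
    return [amplitude * math.sin(2 * math.pi * frequency * t / SAMPLE_RATE)
            for t in range(n)]


# A 2x2 test image: red, green on the top row; blue, black on the bottom row.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (0, 0, 0)]]
samples = feedback_at(image, 0, 0)  # finger on the red pixel
```

In a real system the touch coordinates would come from the touch screen and the samples would be sent to an audio device. Edges, roughness, and the mid- and high-level cues mentioned in the abstract would need further sound dimensions, which this sketch leaves out.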
