Impact of visual enhancement and color conversion algorithms on remote sound recovery from silent videos

IF 1.7 | Zone 4 (Engineering and Technology) | JCR Q3 (ENGINEERING, ELECTRICAL & ELECTRONIC)
Ren-Jun Choong, Wun-She Yap, Yan Chai Hum, Khin Wee Lai, Lloyd Ling, Anthony Vodacek, Yee Kai Tee
Journal: Journal of the Society for Information Display
Publication type: Journal Article
Published: 2024-03-12
DOI: 10.1002/jsid.1275
URL: https://onlinelibrary.wiley.com/doi/10.1002/jsid.1275
Citations: 0

Abstract

The visual microphone is a technique for remote sound recovery that extracts sound information from tiny pixel-scale vibrations in a video. Although the technique has demonstrated success in sound recovery, the impact of visual enhancement and color conversion algorithms applied to the video before the sound recovery process has not been explored. Because the vibrations are so small, these preprocessing steps can play an important role, so it is important to investigate the effects they have on the recovered sound quality. This work experimented with different color-to-grayscale conversions and visual enhancement algorithms on 576 videos and found that the recovered sound quality is indeed greatly affected by the choice of algorithm. The best conversion algorithms were found to be the average of the red, green, and blue color channels and the perceptual lightness in the CIELAB color space, which improved the recovered sound quality by up to 23.22%. Furthermore, visual enhancement techniques such as gamma correction were found to corrupt vibration information, leading to a 22.47% drop in recovered sound quality in one of the tested videos. It is therefore advisable to avoid or minimize the use of visual enhancement techniques for remote sound recovery, to prevent the elimination of useful subtle vibrations.
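
To make the two best-performing grayscale conversions and gamma correction concrete, the following is a minimal Python sketch. The abstract does not state the authors' formulas or parameters, so everything here is an assumption for illustration: the function names are hypothetical, pixels are taken to be sRGB values in [0, 1], CIELAB lightness uses the standard definition with a D65 white point, and the gamma value of 2.2 is arbitrary. It is not the paper's implementation.

```python
import numpy as np

def srgb_to_linear(c):
    """Inverse sRGB transfer function for values in [0, 1] (assumed encoding)."""
    c = np.asarray(c, dtype=np.float64)
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def gray_average(rgb):
    """Grayscale as the plain mean of the R, G and B channels (last axis)."""
    return np.asarray(rgb, dtype=np.float64).mean(axis=-1)

def gray_cielab_lightness(rgb):
    """CIELAB perceptual lightness L* in [0, 100], assuming sRGB input and Yn = 1."""
    lin = srgb_to_linear(rgb)
    # Relative luminance Y from linear sRGB primaries (D65).
    y = lin @ np.array([0.2126, 0.7152, 0.0722])
    delta = 6.0 / 29.0
    # Standard CIELAB piecewise function: cube root above the threshold,
    # linear segment below it.
    f = np.where(y > delta ** 3, np.cbrt(y), y / (3.0 * delta ** 2) + 4.0 / 29.0)
    return 116.0 * f - 16.0

def gamma_correct(img, gamma):
    """Power-law enhancement out = img ** gamma for img in [0, 1]."""
    return np.clip(np.asarray(img, dtype=np.float64), 0.0, 1.0) ** gamma

# Toy frame (random values, not real video data) to show the call pattern.
frame = np.random.default_rng(0).random((4, 4, 3))
gray_a = gray_average(frame)            # one value per pixel
gray_l = gray_cielab_lightness(frame)   # one L* value per pixel

# A tiny intensity change at a dark pixel, before and after gamma = 2.2.
before = 0.101 - 0.100
after = float(gamma_correct(0.101, 2.2) - gamma_correct(0.100, 2.2))
print(gray_a.shape, gray_l.shape, before, after)
```

In this toy case the intensity difference between the two dark pixel values becomes several times smaller after the power-law mapping. That is one plausible way an enhancement step could attenuate subtle vibration signals, offered as an illustration and not as the mechanism identified in the paper.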

Source journal
Journal of the Society for Information Display (Engineering and Technology: Materials Science, Multidisciplinary)
CiteScore: 4.80
Self-citation rate: 8.70%
Articles published: 98
Review time: 3 months
Journal description: The Journal of the Society for Information Display publishes original works dealing with the theory and practice of information display. Coverage includes materials, devices, and systems; the underlying chemistry, physics, physiology, and psychology; measurement techniques and manufacturing technologies; and all aspects of the interaction between equipment and its users. Review articles are also published in all of these areas. Occasional special issues or sections consist of collections of papers on specific topical areas or collections of full-length papers based in part on oral or poster presentations given at SID-sponsored conferences.