Detecting Discrepancies Between Subtitles and Audio in Gameplay Videos With EchoTest
Authors: Ian Gauk; Cor-Paul Bezemer
Journal: IEEE Transactions on Games, vol. 17, no. 1, pp. 224-234
Published: 2024-07-30
DOI: 10.1109/TG.2024.3435799
URL: https://ieeexplore.ieee.org/document/10614845/
Impact factor: 1.7 (JCR Q3, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
The landscape of accessibility features in video games remains inconsistent, posing challenges for gamers who seek experiences tailored to their needs. Accessibility features such as subtitles are widely used by players but are difficult to test manually because of the large scope of games and the variability in how subtitles can appear. In this article, we introduce an automated approach (EchoTest) that extracts subtitles and spoken audio from a gameplay video, converts them into text, and compares them to detect discrepancies such as typos, desynchronization, and missing text. Game developers can use EchoTest to identify discrepancies between subtitles and spoken audio, enabling them to better test the accessibility of their games. In an empirical study on gameplay videos from 15 popular games, EchoTest verifies discrepancies between subtitles and audio with a precision of 98% and a recall of 89%. In addition, EchoTest performs well on a challenging generated benchmark, with a precision of 73% and a recall of 99%.
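
The comparison step described in the abstract can be illustrated with a minimal sketch. This is not EchoTest's actual implementation, which the abstract does not detail. It assumes the subtitle text and the spoken-audio transcript have already been extracted as strings (for example by an OCR tool and a speech-to-text tool, neither shown here), and it substitutes a word-level diff from Python's standard `difflib` module for whatever matching EchoTest uses. The names `normalize` and `find_discrepancies` are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase the text and drop punctuation so only wording differences remain."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_discrepancies(subtitle: str, transcript: str) -> list[tuple[str, str, str]]:
    """Return (kind, subtitle_words, spoken_words) for every mismatching span.

    kind is a difflib opcode: 'replace' (e.g. a typo), 'delete' (words shown
    in the subtitle but not spoken), or 'insert' (words spoken but missing
    from the subtitle).
    """
    sub_words = normalize(subtitle)
    spoken_words = normalize(transcript)
    matcher = SequenceMatcher(a=sub_words, b=spoken_words, autojunk=False)
    issues = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind == "equal":
            continue  # matching spans are not discrepancies
        issues.append(
            (kind, " ".join(sub_words[i1:i2]), " ".join(spoken_words[j1:j2]))
        )
    return issues


if __name__ == "__main__":
    # Hypothetical example strings, not taken from the study.
    print(find_discrepancies("We must reach the brige before dawn.",
                             "We must reach the bridge before dawn!"))
```

A word-level diff of this kind only covers typos and missing text. Detecting desynchronization would additionally require timestamps for both the subtitle frames and the transcribed speech, which this sketch leaves out.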