基于红色排斥和神经网络的视听语音识别

Australasian Computer Science Conference Pub Date : 1900-01-01 DOI:10.1145/563857.563819

T. Lewis, D. Powers

{"title":"基于红色排斥和神经网络的视听语音识别","authors":"T. Lewis, D. Powers","doi":"10.1145/563857.563819","DOIUrl":null,"url":null,"abstract":"Automatic speech recognition (ASR) performs well under restricted conditions, but performance degrades in noisy environments. Audio-Visual Speech Recognition (AVSR) combats this by incorporating a visual signal into the recognition. This paper briefly reviews the contribution of psycholinguistics to this endeavour and the recent advances in machine AVSR. An important first step in AVSR is that of feature extraction from the mouth region and a technique developed by the authors is breifly presented. This paper examines examine how useful this extraction technique in combination with several integration arhitectures is at the given task, demonstrates that vision does infact assist speech recognition when used in a linguistically guided fashion, and gives insight remaining issues.","PeriodicalId":136130,"journal":{"name":"Australasian Computer Science Conference","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"46","resultStr":"{\"title\":\"Audio-Visual Speech Recognition Using Red Exclusion and Neural Networks\",\"authors\":\"T. Lewis, D. Powers\",\"doi\":\"10.1145/563857.563819\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automatic speech recognition (ASR) performs well under restricted conditions, but performance degrades in noisy environments. Audio-Visual Speech Recognition (AVSR) combats this by incorporating a visual signal into the recognition. This paper briefly reviews the contribution of psycholinguistics to this endeavour and the recent advances in machine AVSR. An important first step in AVSR is that of feature extraction from the mouth region and a technique developed by the authors is breifly presented. This paper examines examine how useful this extraction technique in combination with several integration arhitectures is at the given task, demonstrates that vision does infact assist speech recognition when used in a linguistically guided fashion, and gives insight remaining issues.\",\"PeriodicalId\":136130,\"journal\":{\"name\":\"Australasian Computer Science Conference\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"46\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Australasian Computer Science Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/563857.563819\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Australasian Computer Science Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/563857.563819","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 46

摘要

自动语音识别(ASR)在受限条件下表现良好，但在噪声环境下性能下降。视听语音识别(AVSR)通过在识别中加入视觉信号来解决这个问题。本文简要回顾了心理语言学在这方面的贡献以及机器自动语音识别方面的最新进展。AVSR的一个重要的第一步是从口腔区域提取特征，并简要介绍了作者开发的一种技术。本文考察了这种提取技术与几种集成架构相结合在给定任务中的作用，证明了当以语言引导的方式使用时，视觉确实有助于语音识别，并给出了遗留问题的见解。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Audio-Visual Speech Recognition Using Red Exclusion and Neural Networks

Automatic speech recognition (ASR) performs well under restricted conditions, but performance degrades in noisy environments. Audio-Visual Speech Recognition (AVSR) combats this by incorporating a visual signal into the recognition. This paper briefly reviews the contribution of psycholinguistics to this endeavour and the recent advances in machine AVSR. An important first step in AVSR is that of feature extraction from the mouth region and a technique developed by the authors is breifly presented. This paper examines examine how useful this extraction technique in combination with several integration arhitectures is at the given task, demonstrates that vision does infact assist speech recognition when used in a linguistically guided fashion, and gives insight remaining issues.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Australasian Computer Science Conference

自引率

0.00%

发文量