Asynchronous integration of audio and visual sources in bi-modal automatic speech recognition
P. Deléglise, A. Rogozan, M. Alissali
1996 8th European Signal Processing Conference (EUSIPCO 1996), 1996-09-10
DOI: 10.5281/ZENODO.36298 (https://doi.org/10.5281/ZENODO.36298)
Citations: 10
Abstract
This paper presents our work on integrating visual data into automatic speech recognition systems. We aim in particular to solve two problems:
• differences in classification between the units used to model acoustic information (phonemes) and visual information (visemes);
• the anticipation and retention of visemes relative to their corresponding phonemes.
We developed and tested three systems, each addressing one or both problems and proposing a different integration strategy. A comparison of system performance shows that some of the solutions we propose give satisfactory results, and suggests that further work on others would yield additional performance improvements.