Phrase recognition using Improved Lip reading through Phase-Based Eulerian Video Magnification
Authors: Salam Nandakishor, D. Pati
Venue: 2021 National Conference on Communications (NCC), published 2021-07-27
DOI: 10.1109/NCC52529.2021.9530021 (https://doi.org/10.1109/NCC52529.2021.9530021)
Lip reading is a technique for understanding speech by visually observing lip movements. During speech, the subtle motions and temporal variations of the mouth are generally invisible to the naked human eye, mainly because of the limited range of human visual perception. This imperceptible visual information nonetheless carries useful cues. The Eulerian video magnification (EVM) technique magnifies a video to reveal such hidden information. In this work, the phase-based EVM method is used to magnify the subtle spatial and temporal information of mouth movements for a phrase recognition task. Local binary pattern histograms extracted from three orthogonal planes (XY, XT and YT), known as LBP-TOP, are used as the visual feature representing mouth movements. A support vector machine (SVM) is used to recognize the phrases. The experiments are performed on the OuluVS database. The lip-reading approach without EVM achieves 62% accuracy, whereas the phase-based EVM method achieves 70% accuracy, a gain of 8 percentage points. This shows that the proposed method extracts comparatively more robust and discriminative visual features for the phrase recognition task.
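To make the LBP-TOP feature concrete, the following is a minimal sketch of how histograms from the XY, XT and YT planes could be computed and concatenated. It is an illustration under stated assumptions, not the authors' implementation: it uses the basic 8-neighbour, radius-1 LBP operator with 256 bins per plane, no uniform-pattern mapping, and no division of the mouth region into blocks, since the abstract does not specify these settings. The names `lbp_histogram` and `lbp_top` are hypothetical, and the video is assumed to be a grayscale clip stored as nested lists indexed `video[t][y][x]`.

```python
from typing import List, Sequence

Plane = Sequence[Sequence[float]]

# Offsets (row, col) of the 8 neighbours of a pixel, in clockwise order.
# Assumption: basic radius-1 LBP; the paper's exact operator is not given.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(plane: Plane) -> List[float]:
    """Return the normalised 256-bin LBP histogram of one 2-D plane."""
    hist = [0] * 256
    rows, cols = len(plane), len(plane[0])
    # Border pixels are skipped because they lack a full 3x3 neighbourhood.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = plane[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # A neighbour at least as bright as the centre sets its bit.
                if plane[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1  # avoid division by zero on planes smaller than 3x3
    return [count / total for count in hist]


def lbp_top(video: Sequence[Plane]) -> List[float]:
    """Concatenate averaged LBP histograms from the XY, XT and YT planes.

    `video[t][y][x]` is a grayscale clip; the result has 3 * 256 = 768 values.
    """
    n_frames, height, width = len(video), len(video[0]), len(video[0][0])
    # XY planes: one per frame (spatial appearance).
    xy = [video[t] for t in range(n_frames)]
    # XT planes: one per image row y, indexed [t][x] (horizontal motion).
    xt = [[video[t][y] for t in range(n_frames)] for y in range(height)]
    # YT planes: one per image column x, indexed [t][y] (vertical motion).
    yt = [[[video[t][y][x] for y in range(height)] for t in range(n_frames)]
          for x in range(width)]

    feature: List[float] = []
    for planes in (xy, xt, yt):
        acc = [0.0] * 256
        for plane in planes:
            for i, value in enumerate(lbp_histogram(plane)):
                acc[i] += value
        # Average over all planes of this orientation, then append.
        feature.extend(a / len(planes) for a in acc)
    return feature
```

Calling `lbp_top(clip)` on a mouth-region clip yields a 768-dimensional vector that could then be passed to an SVM classifier. In this sketch the magnification step would be applied to the clip beforehand, so the same feature extractor serves both the baseline and the EVM-enhanced pipeline.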