{"title":"Attention Modeling with Temporal Shift in Sign Language Recognition","authors":"Ahmet Faruk Celimli, Ogulcan Özdemir, L. Akarun","doi":"10.1109/SIU55565.2022.9864987","DOIUrl":null,"url":null,"abstract":"Sign languages are visual languages expressed with multiple cues including facial expressions, upper-body and hand gestures. These different visual cues can be used together or at different instants to convey the message. In order to recognize sign languages, it is crucial to model what, where and when to attend. In this study, we developed a model to use different visual cues at the same time by using Temporal Shift Modules (TSMs) and attention modeling. Our experiments are conducted with BospohorusSign22k dataset. Our system has achieved 92.46% recognition accuracy and improved the performance approximately 14% compared to the baseline study with 78.85% accuracy.","PeriodicalId":115446,"journal":{"name":"2022 30th Signal Processing and Communications Applications Conference (SIU)","volume":"284 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 30th Signal Processing and Communications Applications Conference (SIU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SIU55565.2022.9864987","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Sign languages are visual languages expressed through multiple cues, including facial expressions and upper-body and hand gestures. These visual cues can be used together or at different instants to convey a message. To recognize sign languages, it is therefore crucial to model what, where, and when to attend. In this study, we developed a model that uses different visual cues at the same time by combining Temporal Shift Modules (TSMs) with attention modeling. Our experiments were conducted on the BosphorusSign22k dataset. Our system achieves 92.46% recognition accuracy, improving on the baseline study's 78.85% accuracy by approximately 14%.
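
The abstract refers to Temporal Shift Modules, which let a 2D CNN exchange information between neighboring video frames at no extra compute cost. As a rough illustration only, here is a minimal PyTorch sketch of the channel-shift operation from the original TSM paper (Lin et al., 2019); the tensor layout, `shift_div` fraction, and module name are assumptions for this sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn


class TemporalShift(nn.Module):
    """Shift a fraction of channels along the time axis (Lin et al., 2019).

    Shifting 1/shift_div of the channels one frame backward and another
    1/shift_div one frame forward mixes temporal information into a 2D CNN
    with zero additional FLOPs.
    """

    def __init__(self, shift_div: int = 8):
        super().__init__()
        self.shift_div = shift_div

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels, height, width)
        b, t, c, h, w = x.shape
        fold = c // self.shift_div
        out = torch.zeros_like(x)
        # First fold of channels: shift backward in time (frame i sees i+1).
        out[:, :-1, :fold] = x[:, 1:, :fold]
        # Second fold: shift forward in time (frame i sees i-1).
        out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]
        # Remaining channels stay in place.
        out[:, :, 2 * fold:] = x[:, :, 2 * fold:]
        return out


# Example: batch of 2 clips, 4 frames, 64 channels, 56x56 feature maps.
clip = torch.randn(2, 4, 64, 56, 56)
shifted = TemporalShift(shift_div=8)(clip)
print(shifted.shape)  # torch.Size([2, 4, 64, 56, 56])
```

In practice such a shift is inserted before the convolutions of a residual block, so the network can attend to cues arriving at different instants without the cost of full 3D convolutions.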