Video-Based Text to American Sign Language via Transitional Motion Synthesis

Yulia, Chuan-Kai Yang, Yuan-Cheng Lai
2022 12th International Conference on Advanced Computer Information Technologies (ACIT), published 2022-09-26
DOI: 10.1109/ACIT54803.2022.9913078 (https://doi.org/10.1109/ACIT54803.2022.9913078)
Citations: 0

Abstract

This paper describes a novel approach to converting text into American Sign Language (ASL) media: a video-based text-to-ASL system. Deaf and hearing-impaired people are accustomed to communicating in sign language. When confronted with spoken language, they have difficulty reading the spoken words as quickly as people without hearing loss. The availability of a public dataset, the ASL Lexicon Dataset, makes it possible to build a video-based interpreter for the deaf. After the dataset has been pre-processed, it is fed to the OpenPose library, which extracts the signer's skeleton and saves it as JSON files. The user enters glosses as text, and the system retrieves the JSON files and videos for the corresponding glosses. The whole sequence of the original video is also used as a transition pool. The frames of the requested glosses are then used together with the transition pool to construct the sequence of transition frames. Once the sequences have been obtained, we also apply one cross-faded frame in each transition to make the motion smoother. Because the algorithm depends entirely on the transition pool, its ability to produce a good transition is limited: if transition frames with logically and visually correct motions are not available, the result may not be satisfactory; otherwise, the system can generate smooth transitions.
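The abstract does not specify how transition frames are chosen or how the cross-fade is computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes the standard OpenPose per-frame JSON layout (a "people" list whose entries carry a flat "pose_keypoints_2d" array of x, y, confidence triples). The helper names (keypoints_from_openpose, pose_distance, build_transition, cross_fade), the greedy selection rule, and the 50/50 blend are all hypothetical choices made for illustration.

```python
import math


def keypoints_from_openpose(doc):
    """Return [(x, y), ...] for the first person in one OpenPose frame dict."""
    people = doc.get("people", [])
    if not people:
        return []
    flat = people[0]["pose_keypoints_2d"]  # x0, y0, c0, x1, y1, c1, ...
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 3)]


def pose_distance(a, b):
    """Mean Euclidean distance between corresponding keypoints."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def build_transition(pool, start, end, max_frames=5):
    """Greedy walk through the pool from `start` toward `end` (assumed rule).

    At each step, consider only unused pool poses that are strictly closer
    to `end` than the current pose, and take the one nearest to the current
    pose. Returns the chosen pool indices in order.
    """
    chosen = []
    current = start
    for _ in range(max_frames):
        remaining = pose_distance(current, end)
        candidates = [
            i for i, pose in enumerate(pool)
            if i not in chosen and pose_distance(pose, end) < remaining
        ]
        if not candidates:
            break  # the pool offers no frame that moves closer to the target
        best = min(candidates, key=lambda i: pose_distance(current, pool[i]))
        chosen.append(best)
        current = pool[best]
    return chosen


def cross_fade(frame_a, frame_b, alpha=0.5):
    """Blend two equally sized grayscale frames given as nested lists."""
    return [
        [round((1 - alpha) * a + alpha * b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


# Toy usage with single-keypoint poses and a one-row "image".
pool = [[(5, 0)], [(2, 0)], [(20, 0)], [(8, 0)]]
order = build_transition(pool, start=[(0, 0)], end=[(10, 0)])
blended = cross_fade([[0, 100]], [[100, 200]])
```

In this toy run the pose at (20, 0) is never selected because it does not move closer to the target. The same behavior illustrates the limitation the abstract mentions: when the pool contains no suitable intermediate poses, the walk stops early and the transition is left to the cross-fade alone.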