{"title":"基于视频的美国手语文本的过渡动作合成","authors":"Yulia, Chuan-Kai Yang, Yuan-Cheng Lai","doi":"10.1109/ACIT54803.2022.9913078","DOIUrl":null,"url":null,"abstract":"This paper describes a novel approach to provide a text to American Sign Language (ASL) media, a Video-Based Text to ASL. The hearing impaired or we called as the deaf are used to communicate with sign language. When they have to face the spoken language, they have difficulties on reading the spoken words as fast as the people that do not suffer from the hearing problem. The availability of a public dataset named ASL Lexicon Dataset offers the chance to make a video-based interpreter for the deaf. After the dataset has been pre-processed, it is fed to OpenPose library to extract the skeleton of a signer and saved as JSON files. Our system requires a user to input some glosses by texts, and then it finds the JSON files and the videos for the corresponding glosses. The whole sequence of the original video is also used as a transition pool. Later, the corresponding frames of the glosses are used together with the transition pool to construct the sequence of transition frames. After obtaining the sequences, to enhance the smoothness of the motion, we also apply one cross-faded frame in between the transition. Since this algorithm is fully dependent on the transition pool, there is some limitation on making a good transition. If the transition frames with logically and visually correct motions are not available, the result may not be not satisfactory; otherwise, this system can generate smooth transitions.","PeriodicalId":431250,"journal":{"name":"2022 12th International Conference on Advanced Computer Information Technologies (ACIT)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Video-Based Text to American Sign Language via Transitional Motion Synthesis\",\"authors\":\"Yulia, Chuan-Kai Yang, Yuan-Cheng Lai\",\"doi\":\"10.1109/ACIT54803.2022.9913078\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describes a novel approach to provide a text to American Sign Language (ASL) media, a Video-Based Text to ASL. The hearing impaired or we called as the deaf are used to communicate with sign language. When they have to face the spoken language, they have difficulties on reading the spoken words as fast as the people that do not suffer from the hearing problem. The availability of a public dataset named ASL Lexicon Dataset offers the chance to make a video-based interpreter for the deaf. After the dataset has been pre-processed, it is fed to OpenPose library to extract the skeleton of a signer and saved as JSON files. Our system requires a user to input some glosses by texts, and then it finds the JSON files and the videos for the corresponding glosses. The whole sequence of the original video is also used as a transition pool. Later, the corresponding frames of the glosses are used together with the transition pool to construct the sequence of transition frames. After obtaining the sequences, to enhance the smoothness of the motion, we also apply one cross-faded frame in between the transition. Since this algorithm is fully dependent on the transition pool, there is some limitation on making a good transition. 
If the transition frames with logically and visually correct motions are not available, the result may not be not satisfactory; otherwise, this system can generate smooth transitions.\",\"PeriodicalId\":431250,\"journal\":{\"name\":\"2022 12th International Conference on Advanced Computer Information Technologies (ACIT)\",\"volume\":\"20 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 12th International Conference on Advanced Computer Information Technologies (ACIT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ACIT54803.2022.9913078\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 12th International Conference on Advanced Computer Information Technologies (ACIT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ACIT54803.2022.9913078","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Video-Based Text to American Sign Language via Transitional Motion Synthesis
This paper describes a novel approach to converting text into American Sign Language (ASL) media: a video-based text-to-ASL system. People with hearing impairments, commonly referred to as the deaf, are used to communicating in sign language; when confronted with a spoken language, they often cannot read the written words as quickly as people without hearing problems. The availability of a public dataset, the ASL Lexicon Dataset, offers the chance to build a video-based interpreter for the deaf. After the dataset has been pre-processed, it is fed to the OpenPose library, which extracts the signer's skeleton and saves it as JSON files. Our system asks the user to enter glosses as text and then retrieves the JSON files and videos corresponding to those glosses. The full frame sequence of the original video also serves as a transition pool. The frames of the selected glosses are then combined with frames drawn from this pool to construct the transition sequences between glosses. After the sequences are obtained, we additionally insert one cross-faded frame within each transition to enhance the smoothness of the motion. Since the algorithm depends entirely on the transition pool, there are limits to how good a transition can be: if no pool frames with logically and visually correct motion are available, the result may not be satisfactory; otherwise, the system can generate smooth transitions.
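The abstract outlines a pipeline of skeleton extraction, pool-based transition selection, and a single cross-faded frame. The following is a minimal sketch, not the authors' exact implementation: it assumes OpenPose's standard "people"/"pose_keypoints_2d" JSON layout, and the pool-search criterion (nearest pose by Euclidean distance to both clip endpoints) is a hypothetical choice for illustration.

```python
# Sketch of two steps described in the abstract (assumptions noted above):
# selecting a transition frame from the pool by comparing OpenPose skeletons,
# and blending one cross-faded frame between a gloss clip and its transition.
import json
import numpy as np
import cv2


def load_pose(json_path):
    """Read one OpenPose frame file and return an (N, 2) array of x, y keypoints."""
    with open(json_path) as f:
        data = json.load(f)
    if not data["people"]:
        return None
    kp = np.array(data["people"][0]["pose_keypoints_2d"]).reshape(-1, 3)
    return kp[:, :2]  # drop the per-keypoint confidence column


def pose_distance(a, b):
    """Euclidean distance between two skeletons with the same keypoint order."""
    return float(np.linalg.norm(a - b))


def pick_transition(last_pose, next_pose, pool_poses):
    """Pick the pool frame whose skeleton best bridges two gloss clips
    (hypothetical criterion: minimal summed distance to both endpoints)."""
    costs = [pose_distance(p, last_pose) + pose_distance(p, next_pose)
             for p in pool_poses]
    return int(np.argmin(costs))


def cross_fade(frame_a, frame_b, alpha=0.5):
    """Blend two video frames to produce the single cross-faded frame
    inserted between a gloss clip and the selected transition frame."""
    return cv2.addWeighted(frame_a, 1.0 - alpha, frame_b, alpha, 0.0)
```

As the abstract notes, any criterion of this kind is only as good as the transition pool: if no pool frame lies plausibly between the two gloss poses, the selected transition will still look abrupt.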