{"title":"基于视频的美国手语文本的过渡动作合成","authors":"Yulia, Chuan-Kai Yang, Yuan-Cheng Lai","doi":"10.1109/ACIT54803.2022.9913078","DOIUrl":null,"url":null,"abstract":"This paper describes a novel approach to provide a text to American Sign Language (ASL) media, a Video-Based Text to ASL. The hearing impaired or we called as the deaf are used to communicate with sign language. When they have to face the spoken language, they have difficulties on reading the spoken words as fast as the people that do not suffer from the hearing problem. The availability of a public dataset named ASL Lexicon Dataset offers the chance to make a video-based interpreter for the deaf. After the dataset has been pre-processed, it is fed to OpenPose library to extract the skeleton of a signer and saved as JSON files. Our system requires a user to input some glosses by texts, and then it finds the JSON files and the videos for the corresponding glosses. The whole sequence of the original video is also used as a transition pool. Later, the corresponding frames of the glosses are used together with the transition pool to construct the sequence of transition frames. After obtaining the sequences, to enhance the smoothness of the motion, we also apply one cross-faded frame in between the transition. Since this algorithm is fully dependent on the transition pool, there is some limitation on making a good transition. If the transition frames with logically and visually correct motions are not available, the result may not be not satisfactory; otherwise, this system can generate smooth transitions.","PeriodicalId":431250,"journal":{"name":"2022 12th International Conference on Advanced Computer Information Technologies (ACIT)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Video-Based Text to American Sign Language via Transitional Motion Synthesis\",\"authors\":\"Yulia, Chuan-Kai Yang, Yuan-Cheng Lai\",\"doi\":\"10.1109/ACIT54803.2022.9913078\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describes a novel approach to provide a text to American Sign Language (ASL) media, a Video-Based Text to ASL. The hearing impaired or we called as the deaf are used to communicate with sign language. When they have to face the spoken language, they have difficulties on reading the spoken words as fast as the people that do not suffer from the hearing problem. The availability of a public dataset named ASL Lexicon Dataset offers the chance to make a video-based interpreter for the deaf. After the dataset has been pre-processed, it is fed to OpenPose library to extract the skeleton of a signer and saved as JSON files. Our system requires a user to input some glosses by texts, and then it finds the JSON files and the videos for the corresponding glosses. The whole sequence of the original video is also used as a transition pool. Later, the corresponding frames of the glosses are used together with the transition pool to construct the sequence of transition frames. After obtaining the sequences, to enhance the smoothness of the motion, we also apply one cross-faded frame in between the transition. Since this algorithm is fully dependent on the transition pool, there is some limitation on making a good transition. 
If the transition frames with logically and visually correct motions are not available, the result may not be not satisfactory; otherwise, this system can generate smooth transitions.\",\"PeriodicalId\":431250,\"journal\":{\"name\":\"2022 12th International Conference on Advanced Computer Information Technologies (ACIT)\",\"volume\":\"20 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 12th International Conference on Advanced Computer Information Technologies (ACIT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ACIT54803.2022.9913078\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 12th International Conference on Advanced Computer Information Technologies (ACIT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ACIT54803.2022.9913078","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Video-Based Text to American Sign Language via Transitional Motion Synthesis
This paper describes a novel approach to converting text into American Sign Language (ASL) media: a video-based text-to-ASL system. People with hearing impairments, commonly referred to as the deaf, are used to communicating in sign language; when confronted with a spoken language, they often cannot read the written words as quickly as people without hearing problems. The availability of a public dataset, the ASL Lexicon Dataset, offers the chance to build a video-based interpreter for the deaf. After the dataset has been pre-processed, it is fed to the OpenPose library, which extracts the signer's skeleton and saves it as JSON files. Our system asks the user to enter glosses as text and then retrieves the JSON files and videos corresponding to those glosses. The full frame sequence of the original video also serves as a transition pool. The frames of the selected glosses are then combined with frames drawn from this pool to construct the transition sequences between glosses. After the sequences are obtained, we additionally insert one cross-faded frame within each transition to enhance the smoothness of the motion. Since the algorithm depends entirely on the transition pool, there are limits to how good a transition can be: if no pool frames with logically and visually correct motion are available, the result may not be satisfactory; otherwise, the system can generate smooth transitions.
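The abstract outlines a pipeline of skeleton extraction, pool-based transition selection, and a single cross-faded frame. The following is a minimal sketch, not the authors' exact implementation: it assumes OpenPose's standard "people"/"pose_keypoints_2d" JSON layout, and the pool-search criterion (nearest pose by Euclidean distance to both clip endpoints) is a hypothetical choice for illustration.

```python
# Sketch of two steps described in the abstract (assumptions noted above):
# selecting a transition frame from the pool by comparing OpenPose skeletons,
# and blending one cross-faded frame between a gloss clip and its transition.
import json
import numpy as np
import cv2


def load_pose(json_path):
    """Read one OpenPose frame file and return an (N, 2) array of x, y keypoints."""
    with open(json_path) as f:
        data = json.load(f)
    if not data["people"]:
        return None
    kp = np.array(data["people"][0]["pose_keypoints_2d"]).reshape(-1, 3)
    return kp[:, :2]  # drop the per-keypoint confidence column


def pose_distance(a, b):
    """Euclidean distance between two skeletons with the same keypoint order."""
    return float(np.linalg.norm(a - b))


def pick_transition(last_pose, next_pose, pool_poses):
    """Pick the pool frame whose skeleton best bridges two gloss clips
    (hypothetical criterion: minimal summed distance to both endpoints)."""
    costs = [pose_distance(p, last_pose) + pose_distance(p, next_pose)
             for p in pool_poses]
    return int(np.argmin(costs))


def cross_fade(frame_a, frame_b, alpha=0.5):
    """Blend two video frames to produce the single cross-faded frame
    inserted between a gloss clip and the selected transition frame."""
    return cv2.addWeighted(frame_a, 1.0 - alpha, frame_b, alpha, 0.0)
```

As the abstract notes, any criterion of this kind is only as good as the transition pool: if no pool frame lies plausibly between the two gloss poses, the selected transition will still look abrupt.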