Gaits Generation from a Mimetic Word based on Sound Symbolism

Transactions of The Japanese Society for Artificial Intelligence | Pub Date: 2021-09-01 | DOI: 10.1527/TJSAI.36-5_D-KC7

H. Kato, Takatsugu Hirayama, Keisuke Doman, I. Ide, Yasutomo Kawanishi, Daisuke Deguchi, H. Murase


Abstract

The Japanese language is known to have a rich vocabulary of mimetic words, which have the property of sound symbolism: the phonemes that compose a mimetic word are strongly related to the impression of the phenomenon it describes. In particular, human gait is one of the phenomena most commonly represented by mimetic words, which express its visually dynamic state. Sound symbolism is useful for intuitively modeling the relation between gaits and mimetic words, but there has been no study on generating gaits intuitively from such words. Most previous gait generation methods set specific class labels such as “elderly” but have not considered the intuitiveness of the generation model. Thus, in this paper, we propose a framework to generate gaits from a mimetic word based on sound symbolism. This framework enables us to generate gaits from one or more mimetic words. It leads to the construction of a generation model represented in a continuous feature space, which is similar to human intuition. Concretely, we train an encoder-decoder model conditioned on a “phonetic vector”, a quantitative representation of mimetic words, with an adaptive instance normalization module inspired by style transfer. The phonetic vector is a dense description of the intuitive impression of a corresponding gait and is calculated from many mimetic words in the HOYO dataset, which includes gait motion data and corresponding mimetic word annotations. Through experiments, we confirmed the effectiveness of the proposed framework.
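As a rough illustration of the conditioning mechanism named in the abstract, the following minimal sketch shows adaptive instance normalization in its general form: each feature channel is normalized over time, then rescaled and shifted by values derived from a conditioning vector. This is not the authors' implementation. The function name `adain`, the linear maps `w_scale` and `w_shift`, the array shapes, and the random stand-in for a phonetic vector are all assumptions made for illustration only.

```python
import numpy as np

def adain(features, cond, w_scale, w_shift, eps=1e-5):
    """Adaptive instance normalization of a (channels, time) feature map.

    Each channel is normalized over the time axis, then rescaled and
    shifted by values predicted linearly from the conditioning vector.
    The linear maps are a hypothetical choice for this sketch.
    """
    mean = features.mean(axis=1, keepdims=True)
    std = features.std(axis=1, keepdims=True)
    normalized = (features - mean) / (std + eps)
    scale = w_scale @ cond  # shape: (channels,)
    shift = w_shift @ cond  # shape: (channels,)
    return scale[:, None] * normalized + shift[:, None]

# Toy data: random features and a random stand-in for a phonetic vector.
rng = np.random.default_rng(0)
channels, frames, cond_dim = 4, 30, 8
features = rng.normal(size=(channels, frames))
phonetic_vector = rng.random(cond_dim)        # hypothetical values
w_scale = rng.normal(size=(channels, cond_dim))
w_shift = rng.normal(size=(channels, cond_dim))

out = adain(features, phonetic_vector, w_scale, w_shift)
```

After this operation, the per-channel mean and spread of `out` are determined by the conditioning vector rather than by the input features, which is the sense in which the condition sets the "style" of the decoded output.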
