{"title":"基于生成对抗网络的视触觉交叉模态研究","authors":"Yaoyao Li, Huailin Zhao, Huaping Liu, Shan Lu, Y.R. Hou","doi":"10.1049/CCS2.12008","DOIUrl":null,"url":null,"abstract":"Joint Fund of Science & Technology Department of Liaoning Province and State Key Laboratory of Robotics, Grant/Award Number: 2020‐KF‐22‐06; The National Natural Science Foundation Project, Grant/Award Number: 61673238 Abstract Aiming at the research of assisted blind technology, a generative adversarial network model was proposed to complete the transformation of the mode from vision to touch. Firstly, two key representations of visual to tactile sense are identified: the texture image of the object and the audio frequency that generates vibrotactile. It is essentially a matter of generating audio from images. The authors propose a cross‐modal network framework that generates corresponding vibrotactile signals based on texture images. More importantly, the network structure is an end‐to‐end, which eliminates the traditional intermediate form of converting texture image to spectrum image, and can directly carry out the transformation from visual to tactile. A quantitative evaluation system is proposed in this study, which can evaluate the performance of the network model. The experimental results show that the network can complete the conversion of visual information to tactile signals. The proposed method is proved to be superior to the existing method of indirectly generating vibrotactile signals, and the applicability of the model is verified.","PeriodicalId":187152,"journal":{"name":"Cogn. Comput. Syst.","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Research on visual-tactile cross-modality based on generative adversarial network\",\"authors\":\"Yaoyao Li, Huailin Zhao, Huaping Liu, Shan Lu, Y.R. Hou\",\"doi\":\"10.1049/CCS2.12008\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Joint Fund of Science & Technology Department of Liaoning Province and State Key Laboratory of Robotics, Grant/Award Number: 2020‐KF‐22‐06; The National Natural Science Foundation Project, Grant/Award Number: 61673238 Abstract Aiming at the research of assisted blind technology, a generative adversarial network model was proposed to complete the transformation of the mode from vision to touch. Firstly, two key representations of visual to tactile sense are identified: the texture image of the object and the audio frequency that generates vibrotactile. It is essentially a matter of generating audio from images. The authors propose a cross‐modal network framework that generates corresponding vibrotactile signals based on texture images. More importantly, the network structure is an end‐to‐end, which eliminates the traditional intermediate form of converting texture image to spectrum image, and can directly carry out the transformation from visual to tactile. A quantitative evaluation system is proposed in this study, which can evaluate the performance of the network model. The experimental results show that the network can complete the conversion of visual information to tactile signals. The proposed method is proved to be superior to the existing method of indirectly generating vibrotactile signals, and the applicability of the model is verified.\",\"PeriodicalId\":187152,\"journal\":{\"name\":\"Cogn. Comput. 
Syst.\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cogn. Comput. Syst.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1049/CCS2.12008\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cogn. Comput. Syst.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1049/CCS2.12008","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Research on visual-tactile cross-modality based on generative adversarial network
Funding: Joint Fund of Science & Technology Department of Liaoning Province and State Key Laboratory of Robotics, Grant/Award Number: 2020-KF-22-06; The National Natural Science Foundation Project, Grant/Award Number: 61673238

Abstract

Aiming at assistive technology for the blind, a generative adversarial network (GAN) model is proposed to transform visual information into tactile feedback. First, two key representations linking vision and touch are identified: the texture image of an object and the audio signal that drives vibrotactile feedback; the task is therefore essentially one of generating audio from images. The authors propose a cross-modal network framework that generates the corresponding vibrotactile signal from a texture image. Notably, the network is end-to-end: it eliminates the traditional intermediate step of converting the texture image into a spectrogram and performs the visual-to-tactile transformation directly. A quantitative evaluation system is also proposed to assess the performance of the network model. Experimental results show that the network can convert visual information into tactile signals; the proposed method is shown to outperform the existing method of indirectly generating vibrotactile signals, and the applicability of the model is verified.
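The abstract describes an end-to-end GAN that maps a texture image directly to a vibrotactile audio waveform, with no spectrogram intermediate. The PyTorch sketch below is a minimal illustration of that idea only; the layer sizes, the 64x64 grayscale input patch, the 4096-sample waveform length, and the plain BCE adversarial loss are assumptions for illustration, not the authors' published configuration.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: a 64x64 grayscale texture patch in, a 1-D
# vibrotactile waveform of 4096 samples out. The paper does not specify
# its exact layers here; this sketch only illustrates the end-to-end
# image-to-audio mapping (no spectrogram intermediate).

class Generator(nn.Module):
    """Maps a texture image directly to a vibrotactile waveform."""
    def __init__(self, audio_len=4096):
        super().__init__()
        self.encoder = nn.Sequential(            # image -> feature vector
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, 256),
        )
        self.decoder = nn.Sequential(            # feature vector -> waveform
            nn.Linear(256, 1024), nn.ReLU(),
            nn.Linear(1024, audio_len), nn.Tanh(),  # samples in [-1, 1]
        )

    def forward(self, img):                       # img: (B, 1, 64, 64)
        return self.decoder(self.encoder(img))    # out: (B, audio_len)

class Discriminator(nn.Module):
    """Scores whether a vibrotactile waveform is recorded or generated."""
    def __init__(self, audio_len=4096):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_len, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 128), nn.LeakyReLU(0.2),
            nn.Linear(128, 1),                    # real/fake logit
        )

    def forward(self, wav):
        return self.net(wav)

# One adversarial training step with standard GAN losses.
G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

img = torch.randn(8, 1, 64, 64)   # stand-in batch of texture patches
real = torch.randn(8, 4096)       # stand-in batch of recorded waveforms

fake = G(img)
loss_d = bce(D(real), torch.ones(8, 1)) + bce(D(fake.detach()), torch.zeros(8, 1))
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

loss_g = bce(D(fake), torch.ones(8, 1))
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```

The point the sketch mirrors is that the generator's decoder emits raw waveform samples directly, so no spectrogram-inversion stage (e.g. Griffin-Lim) is needed between the image encoder and the tactile output.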