{"title":"面向多维触觉渲染的时间力触觉数据跨模态生成算法","authors":"Rui Song;Guohong Liu;Yan Zhang;Xiaoying Sun","doi":"10.1109/TMM.2025.3590907","DOIUrl":null,"url":null,"abstract":"Exploiting the correlation between multimodal data to generate tactile data has become a preferred approach to enhance tactile rendering fidelity. Nevertheless, existing studies have often overlooked the temporal dynamics of force tactile data. To fill this gap in the literature, this paper introduces a joint visual-audio approach to generate a temporal tactile data (VA2T) algorithm, focusing on the temporal and long-term dependencies of force tactile data. VA2T uses a feature extraction network to extract audio and image features and then uses an attention mechanism and decoder to fuse these features. The tactile reconstructor generates temporal friction and a normal force, with dilated causal convolution securing the temporal dependencies in the force tactile data. Simulation experiments on the LMT dataset demonstrate that compared with the transformer and audio-visual-aided haptic signal reconstruction (AVHR) algorithms, the VA2T algorithm reduces the RMSE for generated friction by 29.44% and 32.37%, respectively, and for normal forces by 23.30% and 35.43%, respectively. In addition, we developed a haptic rendering approach that combines electrovibration and mechanical vibration to render the generated friction and normal force. The subjective experimental results showed that the rendering fidelity of the data generated using the VA2T method was significantly higher than that of the data generated using the transformer and AVHR methods.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"5092-5102"},"PeriodicalIF":9.7000,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Cross-Modal Generation Algorithm for Temporal Force Tactile Data for Multidimensional Haptic Rendering\",\"authors\":\"Rui Song;Guohong Liu;Yan Zhang;Xiaoying Sun\",\"doi\":\"10.1109/TMM.2025.3590907\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Exploiting the correlation between multimodal data to generate tactile data has become a preferred approach to enhance tactile rendering fidelity. Nevertheless, existing studies have often overlooked the temporal dynamics of force tactile data. To fill this gap in the literature, this paper introduces a joint visual-audio approach to generate a temporal tactile data (VA2T) algorithm, focusing on the temporal and long-term dependencies of force tactile data. VA2T uses a feature extraction network to extract audio and image features and then uses an attention mechanism and decoder to fuse these features. The tactile reconstructor generates temporal friction and a normal force, with dilated causal convolution securing the temporal dependencies in the force tactile data. Simulation experiments on the LMT dataset demonstrate that compared with the transformer and audio-visual-aided haptic signal reconstruction (AVHR) algorithms, the VA2T algorithm reduces the RMSE for generated friction by 29.44% and 32.37%, respectively, and for normal forces by 23.30% and 35.43%, respectively. In addition, we developed a haptic rendering approach that combines electrovibration and mechanical vibration to render the generated friction and normal force. 
The subjective experimental results showed that the rendering fidelity of the data generated using the VA2T method was significantly higher than that of the data generated using the transformer and AVHR methods.\",\"PeriodicalId\":13273,\"journal\":{\"name\":\"IEEE Transactions on Multimedia\",\"volume\":\"27 \",\"pages\":\"5092-5102\"},\"PeriodicalIF\":9.7000,\"publicationDate\":\"2025-07-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Multimedia\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11086402/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11086402/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
A Cross-Modal Generation Algorithm for Temporal Force Tactile Data for Multidimensional Haptic Rendering
Exploiting the correlation between multimodal data to generate tactile data has become a preferred approach to enhance tactile rendering fidelity. Nevertheless, existing studies have often overlooked the temporal dynamics of force tactile data. To fill this gap in the literature, this paper introduces a joint visual-audio approach for generating temporal tactile data (VA2T), focusing on the temporal and long-term dependencies of force tactile data. VA2T uses a feature extraction network to extract audio and image features and then uses an attention mechanism and decoder to fuse these features. The tactile reconstructor generates temporal friction and normal force signals, with dilated causal convolutions capturing the temporal dependencies in the force tactile data. Simulation experiments on the LMT dataset demonstrate that, compared with the transformer and audio-visual-aided haptic signal reconstruction (AVHR) algorithms, the VA2T algorithm reduces the RMSE of the generated friction by 29.44% and 32.37%, respectively, and of the generated normal force by 23.30% and 35.43%, respectively. In addition, we developed a haptic rendering approach that combines electrovibration and mechanical vibration to render the generated friction and normal force. The subjective experimental results showed that the rendering fidelity of the data generated using the VA2T method was significantly higher than that of the data generated using the transformer and AVHR methods.
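
The pipeline described in the abstract (per-frame audio and image features, attention-based fusion, and a reconstructor built on dilated causal convolutions) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class names, feature dimensions, dilation schedule, and the choice of audio-queries-attend-to-image-features cross-attention are assumptions made only to show how causal (past-only) convolutions and attention fusion fit together.

    # Illustrative VA2T-style sketch; all sizes and module choices are assumptions.
    import torch
    import torch.nn as nn

    class CausalConv1d(nn.Module):
        """Dilated 1-D convolution whose output at time t depends only on inputs <= t."""
        def __init__(self, in_ch, out_ch, kernel_size, dilation):
            super().__init__()
            self.pad = (kernel_size - 1) * dilation  # left padding keeps the conv causal
            self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

        def forward(self, x):                        # x: (batch, channels, time)
            x = nn.functional.pad(x, (self.pad, 0))  # pad the past side only
            return self.conv(x)

    class VA2TSketch(nn.Module):
        """Fuses audio and image features with cross-attention, then decodes
        temporal friction and normal force traces with dilated causal convolutions."""
        def __init__(self, audio_dim=128, image_dim=512, d_model=256):
            super().__init__()
            self.audio_proj = nn.Linear(audio_dim, d_model)
            self.image_proj = nn.Linear(image_dim, d_model)
            self.fuse = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
            self.decoder = nn.Sequential(
                CausalConv1d(d_model, d_model, kernel_size=3, dilation=1), nn.ReLU(),
                CausalConv1d(d_model, d_model, kernel_size=3, dilation=2), nn.ReLU(),
                CausalConv1d(d_model, d_model, kernel_size=3, dilation=4), nn.ReLU(),
            )
            self.head = nn.Conv1d(d_model, 2, kernel_size=1)  # channel 0: friction, channel 1: normal force

        def forward(self, audio_feat, image_feat):
            # audio_feat: (batch, time, audio_dim); image_feat: (batch, time, image_dim)
            q = self.audio_proj(audio_feat)
            kv = self.image_proj(image_feat)
            fused, _ = self.fuse(q, kv, kv)          # audio queries attend to image features
            h = self.decoder(fused.transpose(1, 2))  # (batch, d_model, time)
            return self.head(h)                      # (batch, 2, time)

    # Example: 100 time steps of paired features -> friction and normal force traces
    model = VA2TSketch()
    out = model(torch.randn(2, 100, 128), torch.randn(2, 100, 512))
    print(out.shape)  # torch.Size([2, 2, 100])

The left-only padding is the point of the causal design: each generated force sample can depend on current and past audio-visual context but never on future frames, while stacking increasing dilations widens the temporal receptive field without losing that property.
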
Journal Introduction:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.