Self-Supervised Visuo-Tactile Pretraining to Locate and Follow Garment Features

Justin Kerr, Huang Huang, Albert Wilcox, Ryan Hoque, Jeffrey Ichnowski, Roberto Calandra, Ken Goldberg
{"title":"Self-Supervised Visuo-Tactile Pretraining to Locate and Follow Garment Features","authors":"J. Kerr, Huang Huang, Albert Wilcox, Ryan Hoque, Jeffrey Ichnowski, R. Calandra, Ken Goldberg","doi":"10.15607/RSS.2023.XIX.018","DOIUrl":null,"url":null,"abstract":"Humans make extensive use of vision and touch as complementary senses, with vision providing global information about the scene and touch measuring local information during manipulation without suffering from occlusions. While prior work demonstrates the efficacy of tactile sensing for precise manipulation of deformables, they typically rely on supervised, human-labeled datasets. We propose Self-Supervised Visuo-Tactile Pretraining (SSVTP), a framework for learning multi-task visuo-tactile representations in a self-supervised manner through cross-modal supervision. We design a mechanism that enables a robot to autonomously collect precisely spatially-aligned visual and tactile image pairs, then train visual and tactile encoders to embed these pairs into a shared latent space using cross-modal contrastive loss. We apply this latent space to downstream perception and control of deformable garments on flat surfaces, and evaluate the flexibility of the learned representations without fine-tuning on 5 tasks: feature classification, contact localization, anomaly detection, feature search from a visual query (e.g., garment feature localization under occlusion), and edge following along cloth edges. The pretrained representations achieve a 73-100% success rate on these 5 tasks.","PeriodicalId":248720,"journal":{"name":"Robotics: Science and Systems XIX","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Robotics: Science and Systems XIX","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15607/RSS.2023.XIX.018","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Humans make extensive use of vision and touch as complementary senses, with vision providing global information about the scene and touch measuring local information during manipulation without suffering from occlusions. While prior work demonstrates the efficacy of tactile sensing for precise manipulation of deformables, it typically relies on supervised, human-labeled datasets. We propose Self-Supervised Visuo-Tactile Pretraining (SSVTP), a framework for learning multi-task visuo-tactile representations in a self-supervised manner through cross-modal supervision. We design a mechanism that enables a robot to autonomously collect precisely spatially-aligned visual and tactile image pairs, then train visual and tactile encoders to embed these pairs into a shared latent space using a cross-modal contrastive loss. We apply this latent space to downstream perception and control of deformable garments on flat surfaces, and evaluate, without fine-tuning, the flexibility of the learned representations on five tasks: feature classification, contact localization, anomaly detection, feature search from a visual query (e.g., garment feature localization under occlusion), and edge following along cloth edges. The pretrained representations achieve a 73-100% success rate on these five tasks.
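The abstract describes training visual and tactile encoders so that spatially-aligned image pairs map to nearby points in a shared latent space via a cross-modal contrastive loss. The sketch below illustrates one common form such an objective can take, a symmetric InfoNCE-style loss in PyTorch; the function name, the temperature value, and the assumption that row i of each batch comes from the same garment location are illustrative choices, not the paper's exact implementation.

```python
# Minimal sketch of a symmetric cross-modal contrastive (InfoNCE-style) loss
# between visual and tactile embeddings. Encoder architectures, embedding
# dimension, and temperature are assumptions for illustration only.
import torch
import torch.nn.functional as F


def cross_modal_contrastive_loss(z_visual: torch.Tensor,
                                 z_tactile: torch.Tensor,
                                 temperature: float = 0.07) -> torch.Tensor:
    """Pull aligned visual/tactile pairs together, push mismatched pairs apart.

    z_visual, z_tactile: (B, D) embeddings from the two encoders, where row i
    of each tensor is assumed to come from the same physical garment location.
    """
    z_visual = F.normalize(z_visual, dim=-1)
    z_tactile = F.normalize(z_tactile, dim=-1)

    # (B, B) cosine similarities between every visual/tactile combination.
    logits = z_visual @ z_tactile.t() / temperature
    targets = torch.arange(z_visual.size(0), device=z_visual.device)

    # Symmetric objective: visual-to-tactile and tactile-to-visual retrieval.
    loss_v2t = F.cross_entropy(logits, targets)
    loss_t2v = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_v2t + loss_t2v)


if __name__ == "__main__":
    # Toy usage with random tensors standing in for encoder outputs.
    z_v = torch.randn(16, 128)
    z_t = torch.randn(16, 128)
    print(cross_modal_contrastive_loss(z_v, z_t).item())
```

With a space trained this way, the downstream feature-search task described in the abstract can plausibly be cast as nearest-neighbor retrieval: embed a visual query, then compare it by cosine similarity against tactile (or visual) embeddings gathered from the garment surface.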