Instance Tracking in 3D Scenes from Egocentric Videos
Authors: Yunhan Zhao, Haoyu Ma, Shu Kong, Charless Fowlkes
DOI: arxiv-2312.04117
Publication date: 2023-12-07
Abstract
Egocentric sensors such as AR/VR devices capture human-object interactions
and offer the potential to provide task-assistance by recalling 3D locations of
objects of interest in the surrounding environment. This capability requires
instance tracking in real-world 3D scenes from egocentric videos (IT3DEgo). We
explore this problem by first introducing a new benchmark dataset, consisting
of RGB and depth videos, per-frame camera pose, and instance-level annotations
in both 2D camera and 3D world coordinates. We present an evaluation protocol
that measures tracking performance in 3D coordinates with two settings for
enrolling instances to track: (1) single-view online enrollment where an
instance is specified on-the-fly based on the human wearer's interactions, and
(2) multi-view pre-enrollment where images of an instance to be tracked are
stored in memory ahead of time. To address IT3DEgo, we first re-purpose methods
from relevant areas, e.g., single object tracking (SOT) -- running SOT methods
to track instances in 2D frames and lifting them to 3D using camera pose and
depth. We also present a simple method that leverages pretrained segmentation
and detection models to generate proposals from RGB frames and match proposals
with enrolled instance images. Perhaps surprisingly, our extensive experiments
show that our method (with no finetuning) significantly outperforms SOT-based
approaches. We conclude by arguing that the problem of egocentric instance
tracking is made easier by leveraging camera pose and using a 3D allocentric
(world) coordinate representation.
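
The abstract mentions lifting 2D tracks to 3D using camera pose and depth. A minimal sketch of that step follows, assuming a pinhole camera model, depth measured along the optical axis, and a camera-to-world pose given as a rotation matrix and translation vector. The function names and these conventions are illustrative assumptions, not the benchmark's actual API or data format.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Back-project pixel (u, v) with depth into camera coordinates.

    Assumes a pinhole model where depth is the z-distance along the optical axis.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def camera_to_world(p_cam: Vec3,
                    rotation: Sequence[Sequence[float]],
                    translation: Sequence[float]) -> Vec3:
    """Apply an assumed camera-to-world pose: p_world = R * p_cam + t."""
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def lift_box_center(box: Sequence[float], depth: float,
                    intrinsics: Sequence[float],
                    rotation: Sequence[Sequence[float]],
                    translation: Sequence[float]) -> Vec3:
    """Lift the center of a 2D box (x_min, y_min, x_max, y_max) to a 3D world point."""
    u = 0.5 * (box[0] + box[2])
    v = 0.5 * (box[1] + box[3])
    fx, fy, cx, cy = intrinsics
    return camera_to_world(unproject_pixel(u, v, depth, fx, fy, cx, cy),
                           rotation, translation)
```

Because the output is expressed in world coordinates, a stationary object keeps the same 3D location even as the wearer moves, which is the intuition behind the allocentric representation argued for above.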
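
The abstract also describes matching proposals with enrolled instance images but does not say how the match is scored. One common choice, used here purely as an assumption, is cosine similarity between feature vectors. The sketch below takes precomputed features as plain lists; the feature extractor, the `min_score` threshold, and the function names are hypothetical and not taken from the paper.

```python
import math
from typing import Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_proposal(proposal_feats: Sequence[Sequence[float]],
                   enrolled_feats: Sequence[Sequence[float]],
                   min_score: float = 0.5) -> Tuple[Optional[int], float]:
    """Pick the proposal most similar to any enrolled view.

    Returns (index, score); index is None when no proposal reaches min_score.
    """
    best_idx: Optional[int] = None
    best_score = -1.0
    for idx, feat in enumerate(proposal_feats):
        # Score a proposal by its best similarity over all enrolled views.
        score = max((cosine_similarity(feat, ref) for ref in enrolled_feats),
                    default=-1.0)
        if score > best_score:
            best_idx, best_score = idx, score
    if best_score < min_score:
        return None, best_score
    return best_idx, best_score
```

Under this sketch, single-view online enrollment corresponds to `enrolled_feats` holding one vector, and multi-view pre-enrollment to several vectors stored ahead of time.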