2021 18th Conference on Robots and Vision (CRV): Latest Publications

Program Committee
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2022-12-01 DOI: 10.1109/SEFM.2005.36
W. Mikhael, Abdullah Eroglu, Liang Chen, G. Záruba, C.-C. Jay Kuo
{"title":"Program Committee","authors":"W. Mikhael, Abdullah Eroglu, Liang Chen, G. Záruba, C.-C. Jay Kuo","doi":"10.1109/SEFM.2005.36","DOIUrl":"https://doi.org/10.1109/SEFM.2005.36","url":null,"abstract":"Jun Ai, Beihang University, China Doo-Hwan Bae, Korea Advanced Institute of Science and Technology, Korea Mark Bentsen, Argo Data, USA Lon Chase, IEEE Reliability Society, USA Yixiang Chen, East China Normal University, China Zhenyu Chen, Nanjing University, China Byoungju Choi, Ewha Womans University, Korea William Chu, Tunghai University, Taiwan Sunita Chulani, Cisco, USA Vidroha Debroy, AT&T (USA), USA Junhua Ding, University of North Texas, USA Tadashi Dohi, Hiroshima University, Japan Jian Dong, Harbin Institute of Technology, China Wei Dong, National University of Defense Technology, China Yunwei Dong, Northwestern Polytechnical University, China Lance Fiondella, University of Massachusetts Dartmouth, USA Ruizhi Gao, Sonos Inc., USA Bing Guo, Sichuan University, China Tom Hill, The Fellows Consulting Group, USA Birgit Hofer, Graz University of Technology, Austria Chin-Yu Huang, National Tsing Hua University, Taiwan Zhao Ji, Guangdong Ocean University, China Chuan Li, Chongqing Technology and Business University, China Jenny Li, Kean University, USA Steve Li, Western New England University, USA Yihao Li, Graz University of Technology, Austria Yun Lin, Harbin Engineering University, China Shaoying Liu, Hiroshima University, Japan José Maldonado, University of São Paulo, Brazil Nick Multari, Pacific Northwest National Laboratory, USA Manuel Nuñez, Universidad Complutense de Madrid, Spain Pete Rotella, Cisco, USA Mike Siok, University of Texas at Arlington, USA Hongwei Tao, Zhengzhou University of Light Industry, China Nguyen Tien, University of Texas at Dallas, USA Tugkan Tuglular, Izmir Institute of Technology, Turkey Auri Vincenzi, Federal University of São Carlos, Brazil Jian Wang, Chinese Academy of Sciences, China Yong Wang, Anhui University of Engineering, China Ziyuan Wang, Nanjing University of Posts and Telecommunications, China Franz Wotawa, Graz University of Technology, Austria Qinggang Wu, Zhengzhou University of Light Industry, China Jianwen Xiang, Wuhan University of Technology, China Dianxing Xu, University of Missouri Kansas City, USA Han Xu, Huawei Company, China Hongji Yang, University of Leicester, UK","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126346898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
RADDet: Range-Azimuth-Doppler based Radar Object Detection for Dynamic Road Users
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00021
Ao Zhang, F. Nowruzi, R. Laganière
{"title":"RADDet: Range-Azimuth-Doppler based Radar Object Detection for Dynamic Road Users","authors":"Ao Zhang, F. Nowruzi, R. Laganière","doi":"10.1109/CRV52889.2021.00021","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00021","url":null,"abstract":"Object detection using automotive radars has not been explored with deep learning models in comparison to the camera based approaches. This can be attributed to the lack of public radar datasets. In this paper, we collect a novel radar dataset that contains radar data in the form of Range-AzimuthDoppler tensors along with the bounding boxes on the tensor for dynamic road users, category labels, and 2D bounding boxes on the Cartesian Bird-Eye-View range map. To build the dataset, we propose an instance-wise auto-annotation method. Furthermore, a novel Range-Azimuth-Doppler based multiclass object detection deep learning model is proposed. The algorithm is a one-stage anchor-based detector that generates both 3D bounding boxes and 2D bounding boxes on RangeAzimuth-Doppler and Cartesian domains, respectively. Our proposed algorithm achieves 56.3% AP with IOU of 0.3 on 3D bounding box predictions, and 51.6% with IOU of 0.5 on 2D bounding box prediction. Our dataset and the code can be found at https://github.com/ZhangAoCanada/RADDet.git.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125933773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00010
Declan McIntosh, T. Marques, A. Albu
{"title":"Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification","authors":"Declan McIntosh, T. Marques, A. Albu","doi":"10.1109/CRV52889.2021.00010","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00010","url":null,"abstract":"Chest radiographs are used for the diagnosis of multiple critical illnesses (e.g., Pneumonia, heart failure, lung cancer), for this reason, systems for the automatic or semi-automatic analysis of these data are of particular interest. An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists, ultimately allowing for better medical care of lung-, heart- and chest-related conditions. We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information that is typically lost in the down-sampling of high-resolution radiographs, a common step in computer-aided diagnostic pipelines. Our proposed approach requires only slight modifications to the input of existing state-of-the-art Convolutional Neural Networks (CNNs), making it easily applicable to existing image classification frameworks. We show that the extra high-frequency components offered by our method increased the classification performance of several CNNs in benchmarks employing the NIH Chest-8 and ImageNet-2017 datasets. Based on our results we hypothesize that providing frequency-specific coefficients allows the CNNs to specialize in the identification of structures that are particular to a frequency band, ultimately increasing classification performance, without an increase in computational load. The implementation of our work is available at github.com/DeclanMcIntosh/LeGallCuda.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131637372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
2LSPE: 2D Learnable Sinusoidal Positional Encoding using Transformer for Scene Text Recognition
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00024
Z. Raisi, Mohamed A. Naiel, Georges Younes, Steven Wardell, J. Zelek
{"title":"2LSPE: 2D Learnable Sinusoidal Positional Encoding using Transformer for Scene Text Recognition","authors":"Z. Raisi, Mohamed A. Naiel, Georges Younes, Steven Wardell, J. Zelek","doi":"10.1109/CRV52889.2021.00024","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00024","url":null,"abstract":"Positional Encoding (PE) plays a vital role in a Transformer’s ability to capture the order of sequential information, allowing it to overcome the permutation equivarience property. Recent state-of-the-art Transformer-based scene text recognition methods have leveraged the advantages of the 2D form of PE with fixed sinusoidal frequencies, also known as 2SPE, to better encode the 2D spatial dependencies of characters in a scene text image. These 2SPE-based Transformer frameworks have outperformed Recurrent Neural Networks (RNNs) based methods, mostly on recognizing text of arbitrary shapes; However, they are not tailored to the type of data and classification task at hand. In this paper, we extend a recent Learnable Sinusoidal frequencies PE (LSPE) from 1D to 2D, which we hereafter refer to as 2LSPE, and study how to adaptively choose the sinusoidal frequencies from the input training data. Moreover, we show how to apply the proposed Transformer architecture for scene text recognition. We compare our method against 11 state-of-the-art methods and show that it outperforms them in over 50% of the standard tests and are no worse than the second best performer, whereas we outperform all other methods on irregular text datasets (i.e., non horizontal or vertical layouts). Experimental results demonstrate that the proposed method offers higher word recognition accuracy (WRA) than two recent Transformer-based methods, and eleven state-of-theart RNN-based techniques on four challenging irregular-text recognition datasets, all while maintaining the highest WRA values on the regular-text datasets.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126160990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Mobile Manipulation in Unknown Environments with Differential Inverse Kinematics Control
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00017
Adam Heins, M. Jakob, Angela P. Schoellig
{"title":"Mobile Manipulation in Unknown Environments with Differential Inverse Kinematics Control","authors":"Adam Heins, M. Jakob, Angela P. Schoellig","doi":"10.1109/CRV52889.2021.00017","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00017","url":null,"abstract":"Mobile manipulators combine the large workspace of mobile robots with the interactive capabilities of manipulator arms, making them useful in a variety of domains including construction and assistive care. We propose a differential inverse kinematics whole-body control approach for position-controlled industrial mobile manipulators. Our controller is capable of task-space trajectory tracking, force regulation, obstacle and singularity avoidance, and pushing an object toward a goal location, with limited sensing and knowledge of the environment. We evaluate the proposed approach through extensive experiments on a 9 degree-of-freedom omnidirectional mobile manipulator. A video demonstrating many of the experiments can be found at http://tiny.cc/crv21-mm.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123051851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Lidar Scan Registration Robust to Extreme Motions
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00014
Simon-Pierre Deschênes, D. Baril, V. Kubelka, P. Giguère, F. Pomerleau
{"title":"Lidar Scan Registration Robust to Extreme Motions","authors":"Simon-Pierre Deschênes, D. Baril, V. Kubelka, P. Giguère, F. Pomerleau","doi":"10.1109/CRV52889.2021.00014","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00014","url":null,"abstract":"Registration algorithms, such as Iterative Closest Point (ICP), have proven effective in mobile robot localization algorithms over the last decades. However, they are susceptible to failure when a robot sustains extreme velocities and accelerations. For example, this kind of motion can happen after a collision, causing a point cloud to be heavily skewed. While point cloud de-skewing methods have been explored in the past to increase localization and mapping accuracy, these methods still rely on highly accurate odometry systems or ideal navigation conditions. In this paper, we present a method taking into account the remaining motion uncertainties of the trajectory used to de-skew a point cloud along with the environment geometry to increase the robustness of current registration algorithms. We compare our method to three other solutions in a test bench producing 3D maps with peak accelerations of 200 m/s2 and 800 rad/s2. In these extreme scenarios, we demonstrate that our method decreases the error by 9.26 % in translation and by 21.84 % in rotation. The proposed method is generic enough to be integrated to many variants of weighted ICP without adaptation and supports localization robustness in harsher terrains.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132438693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Few-Shot Learning by Integrating Spatial and Frequency Representation
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00011
Xiangyu Chen, Guanghui Wang
{"title":"Few-Shot Learning by Integrating Spatial and Frequency Representation","authors":"Xiangyu Chen, Guanghui Wang","doi":"10.1109/CRV52889.2021.00011","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00011","url":null,"abstract":"Human beings can recognize new objects with only a few labeled examples, however, few-shot learning remains a challenging problem for machine learning systems. Most previous algorithms in few-shot learning only utilize spatial information of the images. In this paper, we propose to integrate the frequency information into the learning model to boost the discrimination ability of the system. We employ Discrete Cosine Transformation (DCT) to generate the frequency representation, then, integrate the features from both the spatial domain and frequency domain for classification. The proposed strategy and its effectiveness are validated with different backbones, datasets, and algorithms. Extensive experiments demonstrate that the frequency information is complementary to the spatial representations in few-shot classification. The classification accuracy is boosted significantly by integrating features from both the spatial and frequency domains in different few-shot learning tasks.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128160441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Deep Koopman Representation for Control over Images (DKRCI)
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00029
Philippe Laferrière, Samuel Laferrière, Steven Dahdah, J. Forbes, L. Paull
{"title":"Deep Koopman Representation for Control over Images (DKRCI)","authors":"Philippe Laferrière, Samuel Laferrière, Steven Dahdah, J. Forbes, L. Paull","doi":"10.1109/CRV52889.2021.00029","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00029","url":null,"abstract":"The Koopman operator provides a means to represent nonlinear systems as infinite dimensional linear systems in a lifted state space. This enables the application of linear control techniques to nonlinear systems. However, the choice of a finite number of lifting functions, or Koopman observables, is still an unresolved problem. Deep learning techniques have recently been used to jointly learn these lifting function along with the Koopman operator. However, these methods require knowledge of the system’s state space. In this paper, we present a method to learn a Koopman representation directly from images and control inputs. We then demonstrate our deep learning architecture on a cart-pole system with external inputs.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126577241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A new geometric approach for three-view line reconstruction and motion estimation in Manhattan Scenes
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00026
Ayyappa Swamy Thatavarthy, Tanu Sharma, K. Krishna
{"title":"A new geometric approach for three view line reconstruction and motion estimation in Manhattan Scenes","authors":"Ayyappa Swamy Thatavarthy, Tanu Sharma, K. Krishna","doi":"10.1109/CRV52889.2021.00026","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00026","url":null,"abstract":"Owing to the inherent geometry, the extent of map reconstructed using line-based SfM(structure from motion) is superior to point-based SfM. However, with the existing methods, estimation of structure and motion from observed 2D line segments in images is more complex than that of points. To overcome this, we propose a simple and robust 1-parameter approach for Structure and Motion Estimation from line features in Manhattan Scenes from three views. We leverage the vanishing point directions to estimate the relative rotations as well as to fix the 3D line direction. In consequence we build a constraints matrix, which has the relative translations and 3D line depth as its null space. We then perform 1-parameter line BA using factor graph based cost function. We compare the efficacy of our method with standard line triangulation in synthetic as well as real-world scenes.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"181 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115697582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Building Height Estimation using Street-View Images, Deep-Learning, Contour Processing, and Geospatial Data
2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI: 10.1109/CRV52889.2021.00022
A. Al-Habashna
{"title":"Building Height Estimation using Street-View Images, Deep-Learning, Contour Processing, and Geospatial Data","authors":"A. Al-Habashna","doi":"10.1109/CRV52889.2021.00022","DOIUrl":"https://doi.org/10.1109/CRV52889.2021.00022","url":null,"abstract":"In the recent years, there has been an increasing interest in extracting data from street-view images. This includes various applications such as estimating the demographic makeup of neighborhoods to building instance classification. Building height is an important piece of information that can be used to enrich two-dimensional footprints of buildings, and enhance analysis on such footprints (e.g., economic analysis, urban planning). In this paper, a proposed algorithm (and its open-source implementation) for automatic estimation of building height from street-view images, using Convolutional Neural Networks (CNNs) and image processing techniques, is presented. The algorithm also utilizes geospatial data that can be obtained from different sources. The algorithm will ultimately be used to enrich the Open Database of Buildings (ODB), that has been published by Statistics Canada, as a part of the Linkable Open Data Environment (LODE). Some of the obtained results for building height estimation are presented in this paper. Furthermore, current and future improvements, some challenging cases and the scalability of the system are discussed.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123530449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2