German Conference on Pattern Recognition: Latest Publications

SF2SE3: Clustering Scene Flow into SE(3)-Motions via Proposal and Selection
German Conference on Pattern Recognition Pub Date: 2022-09-18 DOI: 10.1007/978-3-031-16788-1_14
Leonhard Sommer, Philipp Schröppel, T. Brox
Citations: 2
Diverse Video Captioning by Adaptive Spatio-temporal Attention
German Conference on Pattern Recognition Pub Date: 2022-08-19 DOI: 10.48550/arXiv.2208.09266
Zohreh Ghaderi, Leonard Salewski, H. Lensch
Abstract: To generate proper captions for videos, the inference needs to identify relevant concepts and pay attention to the spatial relationships between them as well as to the temporal development in the clip. Our end-to-end encoder-decoder video captioning framework incorporates two transformer-based architectures: an adapted transformer for joint spatio-temporal video analysis and a self-attention-based decoder for advanced text generation. Furthermore, we introduce an adaptive frame selection scheme that reduces the number of required input frames while retaining the relevant content when training both transformers. Additionally, we estimate semantic concepts relevant for video captioning by aggregating all ground-truth captions of each sample. Our approach achieves state-of-the-art results on MSVD as well as on the large-scale MSR-VTT and VATEX benchmark datasets across multiple Natural Language Generation (NLG) metrics. Additional evaluations on diversity scores highlight the expressiveness and diversity in the structure of our generated captions.
Citations: 1
Global Hierarchical Attention for 3D Point Cloud Analysis
German Conference on Pattern Recognition Pub Date: 2022-08-07 DOI: 10.48550/arXiv.2208.03791
Dan Jia, Alexander Hermans, Bastian Leibe
Abstract: We propose a new attention mechanism, called Global Hierarchical Attention (GHA), for 3D point cloud analysis. GHA approximates regular global dot-product attention via a series of coarsening and interpolation operations over multiple hierarchy levels. Its advantage is two-fold. First, it has linear complexity with respect to the number of points, enabling the processing of large point clouds. Second, GHA inherently possesses the inductive bias to focus on spatially close points while retaining global connectivity among all points. Combined with a feedforward network, GHA can be inserted into many existing network architectures. We experiment with multiple baseline networks and show that adding GHA consistently improves performance across different tasks and datasets. For semantic segmentation, GHA gives a +1.7% mIoU increase over the MinkowskiEngine baseline on ScanNet. For 3D object detection, GHA improves the CenterPoint baseline by +0.5% mAP on the nuScenes dataset, and the 3DETR baseline by +2.1% mAP25 and +1.5% mAP50 on ScanNet.
Citations: 0
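The coarsen-then-attend idea behind GHA's linear complexity can be illustrated with a minimal sketch: instead of the O(N^2) dot-product between all N points, keys and values are pooled into a small set of cluster summaries and each query attends only to those summaries. This is a hedged illustration of the general principle only; the paper's actual mechanism uses multiple hierarchy levels with interpolation, and the random cluster assignment below is a toy stand-in for real spatial coarsening.

```python
import numpy as np

def coarse_global_attention(queries, keys, values, num_clusters=32, seed=0):
    """Sketch of linear-complexity global attention via coarsening.

    Cost is O(N * num_clusters) instead of O(N^2): keys/values are
    mean-pooled per cluster and queries attend to the pooled summaries.
    """
    n, d = keys.shape
    rng = np.random.default_rng(seed)
    # Toy coarsening: random point-to-cluster assignment. A real
    # implementation would pool over a spatial grid or voxel hierarchy.
    assign = rng.integers(0, num_clusters, size=n)
    coarse_k = np.stack([keys[assign == c].mean(0) if (assign == c).any()
                         else np.zeros(d) for c in range(num_clusters)])
    coarse_v = np.stack([values[assign == c].mean(0) if (assign == c).any()
                         else np.zeros(d) for c in range(num_clusters)])
    logits = queries @ coarse_k.T / np.sqrt(d)           # (N, num_clusters)
    weights = np.exp(logits - logits.max(1, keepdims=True))
    weights /= weights.sum(1, keepdims=True)             # softmax over clusters
    return weights @ coarse_v                            # (N, d)
```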
Augmentation Learning for Semi-Supervised Classification
German Conference on Pattern Recognition Pub Date: 2022-08-03 DOI: 10.48550/arXiv.2208.01956
Tim Frommknecht, Pedro Alves Zipf, Quanfu Fan, Nina Shvetsova, Hilde Kuehne
Abstract: Recently, a number of new Semi-Supervised Learning methods have emerged. While accuracy on ImageNet and similar datasets has increased over time, performance on tasks beyond the classification of natural images remains largely unexplored. Most Semi-Supervised Learning methods rely on a carefully hand-designed data augmentation pipeline that does not transfer to images from other domains. In this work, we propose a Semi-Supervised Learning method that automatically selects the most effective data augmentation policy for a particular dataset. We build upon the FixMatch method and extend it with meta-learning of augmentations. The augmentation policy is learned in an additional training stage before the classification training, using bi-level optimization to optimize the policy and maximize accuracy. We evaluate our approach on two domain-specific datasets containing satellite images and hand-drawn sketches, and obtain state-of-the-art results. We further investigate in an ablation the different parameters relevant for learning augmentation policies and show how policy learning can be used to adapt augmentations to datasets beyond ImageNet.
Citations: 0
ArtFID: Quantitative Evaluation of Neural Style Transfer
German Conference on Pattern Recognition Pub Date: 2022-07-25 DOI: 10.48550/arXiv.2207.12280
Matthias Wright, B. Ommer
Abstract: The field of neural style transfer has experienced a surge of research exploring different avenues, ranging from optimization-based approaches and feed-forward models to meta-learning methods. The developed techniques have not only advanced the field of style transfer but have also led to breakthroughs in other areas of computer vision, such as visual synthesis at large. However, whereas quantitative evaluation and benchmarking have become pillars of computer vision research, reproducible, quantitative assessment of style transfer models is still lacking. Even compared to other fields of visual synthesis, where widely used metrics exist, the quantitative evaluation of style transfer lags behind. To support the automatic comparison of different style transfer approaches and to study their respective strengths and weaknesses, the field would greatly benefit from a quantitative measure of stylization performance. Therefore, we propose a method to complement the currently mostly qualitative evaluation schemes. We provide extensive evaluations and a large-scale user study to show that the proposed metric strongly coincides with human judgment.
Citations: 10
A Bhattacharyya Coefficient-Based Framework for Noise Model-Aware Random Walker Image Segmentation
German Conference on Pattern Recognition Pub Date: 2022-06-02 DOI: 10.48550/arXiv.2206.00947
Dominik Drees, Florian Eilers, Ang Bian, Xiaoyi Jiang
Abstract: One well-established method for interactive image segmentation is the random walker algorithm. Considerable research on this family of segmentation methods has been conducted in recent years, with numerous applications. These methods commonly use a simple Gaussian weight function that depends on a parameter which strongly influences segmentation performance. In this work, we propose a general framework for deriving weight functions based on probabilistic modeling. The framework can be concretized to cope with virtually any well-defined noise model, eliminating the critical parameter and thus avoiding time-consuming parameter search. We derive specific weight functions for common noise types and show their superior performance on synthetic data as well as on different biomedical image data (MRI images from the NYU fastMRI dataset, and larvae images acquired with the FIM technique). Our framework can also be used in multiple other applications, e.g., the graph cut algorithm and its extensions.
Citations: 1
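The "simple Gaussian weight function" that this paper seeks to replace is the classic edge weighting of the random walker algorithm, w_ij = exp(-beta * (g_i - g_j)^2), whose free parameter beta must normally be tuned per dataset. A minimal sketch of that baseline (not the paper's noise-model-aware weights):

```python
import numpy as np

def gaussian_edge_weights(g_i, g_j, beta=90.0):
    """Classic random walker edge weights: w_ij = exp(-beta * (g_i - g_j)^2).

    `beta` is exactly the critical free parameter that the probabilistic
    framework above is designed to eliminate; identical intensities give
    weight 1.0, and weights decay toward 0 as the intensity gap grows.
    """
    g_i = np.asarray(g_i, dtype=float)
    g_j = np.asarray(g_j, dtype=float)
    return np.exp(-beta * (g_i - g_j) ** 2)
```

Because the decay rate is controlled entirely by beta, a value tuned for one noise level can under- or over-segment at another, which motivates deriving weights from an explicit noise model instead.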
Localized Vision-Language Matching for Open-vocabulary Object Detection
German Conference on Pattern Recognition Pub Date: 2022-05-12 DOI: 10.48550/arXiv.2205.06160
M. A. Bravo, Sudhanshu Mittal, T. Brox
Abstract: In this work, we propose an open-vocabulary object detection method that, based on image-caption pairs, learns to detect novel object classes along with a given set of known classes. It is a two-stage training approach that first uses a location-guided image-caption matching technique to learn class labels for both novel and known classes in a weakly-supervised manner, and then specializes the model for the object detection task using known-class annotations. We show that a simple language model fits better than a large contextualized language model for detecting novel objects. Moreover, we introduce a consistency-regularization technique to better exploit image-caption pair information. Our method compares favorably to existing open-vocabulary detection approaches while being data-efficient. Source code is available at https://github.com/lmb-freiburg/locov.
Citations: 13
Interpretable Prediction of Pulmonary Hypertension in Newborns using Echocardiograms
German Conference on Pattern Recognition Pub Date: 2022-03-24 DOI: 10.48550/arXiv.2203.13038
H. Ragnarsdóttir, Laura Manduchi, H. Michel, F. Laumer, S. Wellmann, Ece Ozkan, Julia-Franziska Vogt
Abstract: Pulmonary hypertension (PH) in newborns and infants is a complex condition associated with several pulmonary, cardiac, and systemic diseases contributing to morbidity and mortality. Accurate and early detection of PH is therefore crucial for successful management. With echocardiography, the primary diagnostic tool in pediatrics, human assessment is both time-consuming and expertise-demanding, raising the need for an automated approach. In this work, we present an interpretable multi-view video-based deep learning approach to predict PH for a cohort of 194 newborns using echocardiograms. We use spatio-temporal convolutional architectures to predict PH from each view and aggregate the per-view predictions via majority voting. To the best of our knowledge, this is the first work on automated assessment of PH in newborns using echocardiograms. Our results show a mean F1-score of 0.84 for severity prediction and 0.92 for binary detection using 10-fold cross-validation. We complement our predictions with saliency maps and show that the learned model focuses on clinically relevant cardiac structures, motivating its use in clinical practice.
Citations: 0
Auto-Compressing Subset Pruning for Semantic Image Segmentation
German Conference on Pattern Recognition Pub Date: 2022-01-26 DOI: 10.1007/978-3-031-16788-1_2
Konstantin Ditschuneit, J. Otterbach
Citations: 2
Optimizing Edge Detection for Image Segmentation with Multicut Penalties
German Conference on Pattern Recognition Pub Date: 2021-12-10 DOI: 10.1007/978-3-031-16788-1_12
Steffen Jung, Sebastian Ziegler, Amirhossein Kardoost, M. Keuper
Citations: 1