Predicting Signed Distance Functions for Visual Instance Segmentation

Emil Brissman, Joakim Johnander, M. Felsberg
DOI: 10.1109/SAIS53221.2021.9484039
Published in: 2021 Swedish Artificial Intelligence Society Workshop (SAIS), 2021-06-14
Citations: 1

Abstract

Visual instance segmentation is a challenging problem and becomes even more difficult if the objects of interest vary unconstrained in shape. Some objects are well described by a rectangle; however, this is not always the case. Consider, for instance, long, slender objects such as ropes. Anchor-based approaches classify predefined bounding boxes as either negative or positive and can thus handle only a limited set of shapes. Defining anchor boxes that fit well to all possible shapes leads to an infeasible number of prior boxes. We explore a different approach and propose to train a neural network to compute distance maps along different directions. At each pixel, the network is trained to predict the distance to the closest object contour in a given direction. By pooling the distance maps, we obtain an approximation to the signed distance function (SDF). The SDF may then be thresholded in order to obtain a foreground-background segmentation. We compare this segmentation to foreground segmentations obtained from the state-of-the-art instance segmentation method YOLACT. On the COCO dataset, our segmentation yields a higher performance in terms of foreground intersection over union (IoU). However, while the distance maps contain information on the individual instances, it is not straightforward to map them to the full instance segmentation. We still believe that this idea is a promising research direction for instance segmentation, as it better captures the different shapes found in the real world.
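The directional-distance, pooling, and thresholding steps can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes min-pooling of unsigned directional distances with the sign taken from a known binary mask (as when constructing training targets), and it uses a brute-force contour search over four axis-aligned directions for clarity; the paper leaves the exact pooling operator and direction set to its method section.

```python
import numpy as np

def directional_distance(mask, direction, max_steps=64):
    """For every pixel, the distance (in steps) along `direction` to the
    nearest point where the mask value flips, i.e. an object contour.
    Image borders are treated as contours. Brute force, for illustration."""
    h, w = mask.shape
    dy, dx = direction
    dist = np.full((h, w), float(max_steps))
    for y in range(h):
        for x in range(w):
            for s in range(1, max_steps):
                ny, nx = y + s * dy, x + s * dx
                if not (0 <= ny < h and 0 <= nx < w) or mask[ny, nx] != mask[y, x]:
                    dist[y, x] = s
                    break
    return dist

def approximate_sdf(mask, directions):
    """Min-pool the unsigned directional distances, then attach a sign:
    positive inside the object, negative outside (one common SDF convention)."""
    maps = np.stack([directional_distance(mask, d) for d in directions])
    unsigned = maps.min(axis=0)
    return np.where(mask, unsigned, -unsigned)

# Toy example: a 3x3 square inside an 8x8 image.
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 2:5] = True
directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # 4 axis-aligned directions
sdf = approximate_sdf(mask, directions)
segmentation = sdf > 0  # thresholding the SDF at zero recovers foreground/background
```

In the paper's setting the directional maps come from the network rather than a ground-truth mask, so the sign must also be inferred from the predictions; that is precisely where recovering individual instances from the pooled maps becomes non-trivial, as the abstract notes.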