Improved mask RCNN and cosine similarity using RGBD segmentation for Occlusion handling in Multi Object Tracking

Siti Hadiyan Pratiwi, Putri Shaniya, G. Jati, W. Jatmiko
{"title":"Improved mask RCNN and cosine similarity using RGBD segmentation for Occlusion handling in Multi Object Tracking","authors":"Siti Hadiyan Pratiwi, Putri Shaniya, G. Jati, W. Jatmiko","doi":"10.21609/jiki.v16i1.1073","DOIUrl":null,"url":null,"abstract":"In this study, additional depth images were used to enrich the information in each image pixel. Segmentation, by its nature capable to process image up to pixel level. So, it can detect up to the smallest part of the object, even when it’s overlapped with another object. By using segmentation, the main goal is to be able to maintain the tracking process longer when the object starts to be occluded until it is severely occluded right before it is completely disappeared. Object tracking based on object detection was developed by modifying the Mask R-CNN architecture to process RGBD images. The detection results feature extracted using HOG, and each of them got compared to the target objects. The comparison was using cosine similarity calculation, and the maximum value of the detected object would update the target object for the next frame. The evaluation of the model was using mAP calculation. Mask R-CNN RGBD late fusion had a higher value by 5% than Mask R-CNN RGB. It was 68,234% and 63,668%, respectively. Meanwhile, the tracking evaluation uses the traditional method of calculating the id switching during the tracking process. Out of 295 frames, the original Mask R-CNN method had ten switching ID times. On the other hand, the proposed method Mask R-CNN RGBD had much better tracking results with switching ids close to 0. Keywords—Occlusion, RGBD, Mask R-CNN, Late fusion, Cosine similarity","PeriodicalId":31392,"journal":{"name":"Jurnal Ilmu Komputer dan Informasi","volume":"39 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Jurnal Ilmu Komputer dan Informasi","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21609/jiki.v16i1.1073","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In this study, additional depth images were used to enrich the information in each image pixel. Segmentation, by its nature, can process an image down to the pixel level, so it can detect even the smallest part of an object, including when it overlaps with another object. The main goal of using segmentation is to maintain tracking for longer once an object starts to be occluded, up to the point where it is severely occluded just before it disappears completely. Object tracking based on object detection was developed by modifying the Mask R-CNN architecture to process RGBD images. Features of the detection results were extracted using HOG, and each was compared to the target object. The comparison used cosine similarity, and the detected object with the maximum similarity was used to update the target object for the next frame. The model was evaluated using mAP: Mask R-CNN RGBD with late fusion scored about 5% higher than Mask R-CNN RGB, at 68.234% versus 63.668%, respectively. Meanwhile, tracking was evaluated with the traditional method of counting ID switches during the tracking process. Over 295 frames, the original Mask R-CNN method produced ten ID switches, whereas the proposed Mask R-CNN RGBD method gave much better tracking results, with ID switches close to 0.

Keywords: Occlusion, RGBD, Mask R-CNN, Late fusion, Cosine similarity
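A minimal sketch of the matching step described in the abstract, assuming scikit-image's HOG implementation and a plain NumPy cosine similarity; the crop size, HOG parameters, and helper names (hog_descriptor, match_target) are illustrative assumptions rather than the paper's exact configuration:

```python
# Sketch: extract HOG features from each detected object and keep the
# detection with the highest cosine similarity to the current target.
# HOG parameters and crop size are assumptions, not the paper's settings.
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog
from skimage.transform import resize

def hog_descriptor(crop, size=(128, 64)):
    """Resize an RGB detection crop, convert to grayscale, and compute HOG."""
    gray = rgb2gray(resize(crop, size, anti_aliasing=True))
    return hog(gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def match_target(target_feature, detection_crops):
    """Return (best index, best feature, best similarity) over all detections.

    The best-matching feature would then update the target's appearance
    model for the next frame, as described in the abstract.
    """
    features = [hog_descriptor(c) for c in detection_crops]
    similarities = [cosine_similarity(target_feature, f) for f in features]
    best = int(np.argmax(similarities))
    return best, features[best], similarities[best]
```

In this sketch, tracking continuity is kept by always carrying the best-matching detection's features forward; the ID-switch count reported in the abstract then measures how often that assignment jumps to a different object between consecutive frames.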