Improved mask RCNN and cosine similarity using RGBD segmentation for Occlusion handling in Multi Object Tracking

Siti Hadiyan Pratiwi, Putri Shaniya, G. Jati, W. Jatmiko
{"title":"Improved mask RCNN and cosine similarity using RGBD segmentation for Occlusion handling in Multi Object Tracking","authors":"Siti Hadiyan Pratiwi, Putri Shaniya, G. Jati, W. Jatmiko","doi":"10.21609/jiki.v16i1.1073","DOIUrl":null,"url":null,"abstract":"In this study, additional depth images were used to enrich the information in each image pixel. Segmentation, by its nature capable to process image up to pixel level. So, it can detect up to the smallest part of the object, even when it’s overlapped with another object. By using segmentation, the main goal is to be able to maintain the tracking process longer when the object starts to be occluded until it is severely occluded right before it is completely disappeared. Object tracking based on object detection was developed by modifying the Mask R-CNN architecture to process RGBD images. The detection results feature extracted using HOG, and each of them got compared to the target objects. The comparison was using cosine similarity calculation, and the maximum value of the detected object would update the target object for the next frame. The evaluation of the model was using mAP calculation. Mask R-CNN RGBD late fusion had a higher value by 5% than Mask R-CNN RGB. It was 68,234% and 63,668%, respectively. Meanwhile, the tracking evaluation uses the traditional method of calculating the id switching during the tracking process. Out of 295 frames, the original Mask R-CNN method had ten switching ID times. On the other hand, the proposed method Mask R-CNN RGBD had much better tracking results with switching ids close to 0. Keywords—Occlusion, RGBD, Mask R-CNN, Late fusion, Cosine similarity","PeriodicalId":31392,"journal":{"name":"Jurnal Ilmu Komputer dan Informasi","volume":"39 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Jurnal Ilmu Komputer dan Informasi","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21609/jiki.v16i1.1073","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In this study, additional depth images were used to enrich the information in each image pixel. Segmentation, by its nature, can process an image down to the pixel level, so it can detect even the smallest part of an object, including when it overlaps with another object. The main goal of using segmentation is to maintain tracking for longer once an object starts to be occluded, up to the point where it is severely occluded just before it disappears completely. Object tracking based on object detection was developed by modifying the Mask R-CNN architecture to process RGBD images. Features of the detection results were extracted using HOG, and each was compared to the target object. The comparison used cosine similarity, and the detected object with the maximum similarity was used to update the target object for the next frame. The model was evaluated using mAP: Mask R-CNN RGBD with late fusion scored about 5% higher than Mask R-CNN RGB, at 68.234% versus 63.668%, respectively. Meanwhile, tracking was evaluated with the traditional method of counting ID switches during the tracking process. Over 295 frames, the original Mask R-CNN method produced ten ID switches, whereas the proposed Mask R-CNN RGBD method gave much better tracking results, with ID switches close to 0.

Keywords: Occlusion, RGBD, Mask R-CNN, Late fusion, Cosine similarity
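A minimal sketch of the matching step described in the abstract, assuming scikit-image's HOG implementation and a plain NumPy cosine similarity; the crop size, HOG parameters, and helper names (hog_descriptor, match_target) are illustrative assumptions rather than the paper's exact configuration:

```python
# Sketch: extract HOG features from each detected object and keep the
# detection with the highest cosine similarity to the current target.
# HOG parameters and crop size are assumptions, not the paper's settings.
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog
from skimage.transform import resize

def hog_descriptor(crop, size=(128, 64)):
    """Resize an RGB detection crop, convert to grayscale, and compute HOG."""
    gray = rgb2gray(resize(crop, size, anti_aliasing=True))
    return hog(gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def match_target(target_feature, detection_crops):
    """Return (best index, best feature, best similarity) over all detections.

    The best-matching feature would then update the target's appearance
    model for the next frame, as described in the abstract.
    """
    features = [hog_descriptor(c) for c in detection_crops]
    similarities = [cosine_similarity(target_feature, f) for f in features]
    best = int(np.argmax(similarities))
    return best, features[best], similarities[best]
```

In this sketch, tracking continuity is kept by always carrying the best-matching detection's features forward; the ID-switch count reported in the abstract then measures how often that assignment jumps to a different object between consecutive frames.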