Multi-Human Parsing Machines

Jianshu Li, Jian Zhao, Yunpeng Chen, S. Roy, Shuicheng Yan, Jiashi Feng, T. Sim
DOI: 10.1145/3240508.3240515
Published in: Proceedings of the 26th ACM International Conference on Multimedia (ACM MM 2018)
Publication date: 2018-10-15
Citations: 15

Abstract

Human parsing is an important task in human-centric analysis. Despite the remarkable progress in single-human parsing, the more realistic case of multi-human parsing remains challenging in terms of both data and models. Compared with the considerable number of available single-human parsing datasets, datasets for multi-human parsing are very limited in number, mainly due to the huge annotation effort required. Beyond the data challenge, persons in real-world scenes are often entangled with each other due to close interaction and body occlusion, making it difficult to distinguish body parts belonging to different person instances. In this paper, we propose the Multi-Human Parsing Machines (MHPM) system, which contains an MHP Montage model and an MHP Solver, to address both challenges in multi-human parsing. Specifically, the MHP Montage model in MHPM generates realistic images containing multiple persons together with their parsing labels. It intelligently composes single persons onto background scene images while maintaining the structural relations between persons and the scene. The generated images can be used to train better multi-human parsing algorithms. On the other hand, the MHP Solver in MHPM tackles the bottleneck of distinguishing multiple entangled persons in close interaction. It employs a Group-Individual Push and Pull (GIPP) loss function, which can effectively separate persons with close interaction. We experimentally show that the proposed MHPM achieves state-of-the-art performance on the multi-human parsing benchmark and on the person individualization benchmark, which evaluates the separation of closely entangled person instances.
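The Montage idea described above — pasting a segmented single person, together with its label map, onto a background scene — can be sketched in a few lines. This is a deliberately naive illustration: the paper's model chooses placement using structural information between persons and the scene, whereas here the position (`top`, `left`) is supplied explicitly, and the whole person receives a single hypothetical label id rather than full part labels.

```python
import numpy as np

def montage(background, bg_labels, person_rgb, person_mask, person_label_id, top, left):
    """Paste one segmented person (and a matching label map) onto a scene.

    background: (H, W, 3) float image; bg_labels: (H, W) int label map.
    person_rgb: (h, w, 3) person cutout; person_mask: (h, w) bool foreground mask.
    Returns a new composited image and its updated label map.
    """
    out = background.copy()
    labels = bg_labels.copy()
    h, w = person_mask.shape

    # Copy foreground pixels of the person into the target window.
    region = out[top:top + h, left:left + w]
    region[person_mask] = person_rgb[person_mask]

    # Write the corresponding parsing labels at the same locations.
    lab_region = labels[top:top + h, left:left + w]
    lab_region[person_mask] = person_label_id
    return out, labels
```

Because the image and its label map are edited in lockstep, every generated training image comes with pixel-accurate annotations for free — which is exactly why such composition sidesteps the annotation-cost bottleneck.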
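The abstract does not give the GIPP formulation, but its push-and-pull intuition can be illustrated with a generic instance-embedding loss: pixel embeddings are pulled toward the mean of their own person instance, while the means of different instances are pushed apart by a margin. The function below is a hedged sketch of that family of losses, not the paper's exact GIPP; the margins `margin_pull` and `margin_push` are hypothetical parameters.

```python
import numpy as np

def push_pull_loss(embeddings, labels, margin_pull=0.5, margin_push=1.5):
    """Illustrative push-pull instance-embedding loss.

    embeddings: (N, D) per-pixel embeddings; labels: (N,) person-instance ids.
    Pull: penalize pixels farther than margin_pull from their instance mean.
    Push: penalize instance means closer than margin_push to each other.
    """
    ids = np.unique(labels)
    means = np.stack([embeddings[labels == i].mean(axis=0) for i in ids])

    # Pull term: keep each instance's pixels tight around its mean.
    pull = 0.0
    for k, i in enumerate(ids):
        d = np.linalg.norm(embeddings[labels == i] - means[k], axis=1)
        pull += np.mean(np.maximum(d - margin_pull, 0.0) ** 2)
    pull /= len(ids)

    # Push term: hinge on pairwise distances between instance means.
    push, pairs = 0.0, 0
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            d = np.linalg.norm(means[a] - means[b])
            push += np.maximum(margin_push - d, 0.0) ** 2
            pairs += 1
    if pairs:
        push /= pairs
    return pull + push
```

Under this loss, two well-separated persons incur no penalty, while entangled persons whose embeddings overlap are pushed apart — the behavior the abstract attributes to GIPP for closely interacting instances.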