CPI-Parser: Integrating Causal Properties Into Multiple Human Parsing
Xuanhan Wang; Xiaojia Chen; Lianli Gao; Jingkuan Song; Heng Tao Shen
IEEE Transactions on Image Processing, vol. 33, pp. 5771-5782, published 2024-10-03. DOI: 10.1109/TIP.2024.3469579 (https://ieeexplore.ieee.org/document/10704987/)
Abstract
Existing methods for multiple human parsing (MHP) apply deep models to learn instance-level representations that segment each person into non-overlapping body parts. However, the learned representations often contain spurious correlations that degrade generalization, leaving models vulnerable to contextual variations in images (e.g., unseen image styles or external interventions). To tackle this, we present a causal-property-integrated parsing model, termed CPI-Parser, which is driven by fundamental causal principles involving two causal properties for human parsing: causal diversity and causal invariance. Specifically, we assume that an image is generated by a mix of causal factors (the characteristics of body parts) and non-causal factors (external contexts), where only the former decide the essence of human parsing. Since causal and non-causal factors are unobservable, the proposed CPI-Parser is required to separate, from an image, the key factors that satisfy the causal properties. In this way, the parser relies on causal factors grounded in relevant evidence rather than non-causal factors arising from spurious correlations, thus alleviating model degradation and improving parsing ability. Notably, CPI-Parser is designed in a flexible way and can be integrated into any existing MHP framework. Extensive experiments on three widely used benchmarks demonstrate the effectiveness and generalizability of our method. Code and models are released at https://github.com/HAG-uestc/CPI-Parser for research purposes.
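To make the two causal properties concrete, the sketch below shows one possible way to express them as auxiliary training losses on top of an arbitrary feature backbone. This is not the authors' implementation (see their repository above for that); all module and loss names here are hypothetical, and the specific loss forms (L2 agreement across context-perturbed views for invariance, cosine decorrelation for diversity) are illustrative assumptions only.

```python
# Minimal PyTorch sketch (NOT the authors' code): one way to express the
# "causal invariance" and "causal diversity" properties from the abstract
# as auxiliary losses. Module/loss names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalFactorSplitter(nn.Module):
    """Projects backbone features into a 'causal' and a 'non-causal' embedding."""

    def __init__(self, in_dim: int, factor_dim: int = 256):
        super().__init__()
        self.causal_head = nn.Linear(in_dim, factor_dim)      # body-part evidence
        self.non_causal_head = nn.Linear(in_dim, factor_dim)  # external context

    def forward(self, feats: torch.Tensor):
        return self.causal_head(feats), self.non_causal_head(feats)


def causal_invariance_loss(causal_a: torch.Tensor, causal_b: torch.Tensor) -> torch.Tensor:
    """Invariance: causal factors from two context-perturbed views of the same
    person should agree (here, L2 distance between normalized embeddings)."""
    return F.mse_loss(F.normalize(causal_a, dim=-1), F.normalize(causal_b, dim=-1))


def causal_diversity_loss(causal: torch.Tensor, non_causal: torch.Tensor) -> torch.Tensor:
    """Diversity: causal and non-causal factors should carry distinct
    information (here, penalize their absolute cosine similarity)."""
    cos = F.cosine_similarity(causal, non_causal, dim=-1)
    return cos.abs().mean()


if __name__ == "__main__":
    splitter = CausalFactorSplitter(in_dim=2048)
    feats_view1 = torch.randn(8, 2048)  # features of an original image
    feats_view2 = torch.randn(8, 2048)  # features of a style/context-perturbed view
    c1, nc1 = splitter(feats_view1)
    c2, _ = splitter(feats_view2)
    loss = causal_invariance_loss(c1, c2) + 0.1 * causal_diversity_loss(c1, nc1)
    loss.backward()
    print(float(loss))
```

Under this reading, the invariance term pushes the causal branch to ignore external context, while the diversity term keeps the two branches from collapsing onto the same information; how the paper actually instantiates and weights these properties is described in the full text.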