Title: Multi-scale information transport generative adversarial network for human pose transfer
Authors: Jinsong Zhang, Yu-Kun Lai, Jian Ma, Kun Li
Journal: Displays, volume 84, Article 102786 (impact factor 3.7; JCR Q1, Computer Science, Hardware & Architecture)
Publication date: 2024-06-25
DOI: 10.1016/j.displa.2024.102786
URL: https://www.sciencedirect.com/science/article/pii/S0141938224001501
Citations: 0
Abstract
Human pose transfer, a challenging image generation task, aims to transfer a source image from one pose to another. Existing methods often struggle to preserve details in visible regions or predict reasonable pixels for invisible regions due to inaccurate correspondences. In this paper, we design a novel multi-scale information transport generative adversarial network, composed of Information Transport (IT) blocks, to establish and refine the correspondences progressively. Specifically, we compute a transport matrix to warp the source image features by integrating an optimal transport solver into our proposed IT block, and use IT blocks to refine the correspondences at different resolutions to preserve rich details of the source image features. The experimental results and applications demonstrate the effectiveness of our proposed method. We further present an image-specific optimization using only a single image. The code is available for research purposes at https://github.com/Zhangjinso/OT-POSE.
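The idea of warping source features with a transport matrix can be illustrated with a minimal sketch. The abstract does not say which optimal transport solver the IT block uses, so the example below assumes entropy-regularised Sinkhorn iterations, a cosine-distance cost, and uniform marginals. The function names `sinkhorn_transport` and `warp_features`, the feature shapes, and the parameters `eps` and `n_iters` are all hypothetical and are not taken from the authors' code. It works at a single scale only, whereas the paper refines correspondences at several resolutions.

```python
import numpy as np

def sinkhorn_transport(cost, eps=0.5, n_iters=200):
    """Entropy-regularised OT with uniform marginals (assumed solver).

    cost: (n, m) array of pairwise costs between n target and m source positions.
    Returns an (n, m) transport matrix whose entries sum to 1.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # uniform mass on target positions
    b = np.full(m, 1.0 / m)      # uniform mass on source positions
    K = np.exp(-cost / eps)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows towards marginal a
        v = b / (K.T @ u)        # rescale columns towards marginal b
    return u[:, None] * K * v[None, :]

def warp_features(target_feat, source_feat, eps=0.5):
    """Warp source features (m, c) onto the n target positions.

    The cost is the cosine distance between target and source features
    (an assumption made for this sketch).
    """
    t = target_feat / np.linalg.norm(target_feat, axis=1, keepdims=True)
    s = source_feat / np.linalg.norm(source_feat, axis=1, keepdims=True)
    cost = 1.0 - t @ s.T
    T = sinkhorn_transport(cost, eps)
    weights = T / T.sum(axis=1, keepdims=True)   # each row becomes a convex combination
    return weights @ source_feat, T

# Toy data: 6 target positions, 5 source positions, 8 feature channels.
rng = np.random.default_rng(0)
target = rng.normal(size=(6, 8))
source = rng.normal(size=(5, 8))
warped, T = warp_features(target, source)
```

Each row of `warped` is a weighted average of the source features, with weights given by the corresponding row of the transport matrix. A multi-scale variant would repeat this step on feature maps of increasing resolution.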
About the journal:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.