{"title":"内部视频绘图的外观一致性和动作一致性学习","authors":"Ruixin Liu, Yuesheng Zhu, GuiBo Luo","doi":"10.1049/cit2.12405","DOIUrl":null,"url":null,"abstract":"<p>Internal learning-based video inpainting methods have shown promising results by exploiting the intrinsic properties of the video to fill in the missing region without external dataset supervision. However, existing internal learning-based video inpainting methods would produce inconsistent structures or blurry textures due to the insufficient utilisation of motion priors within the video sequence. In this paper, the authors propose a new internal learning-based video inpainting model called appearance consistency and motion coherence network (ACMC-Net), which can not only learn the recurrence of appearance prior but can also capture motion coherence prior to improve the quality of the inpainting results. In ACMC-Net, a transformer-based appearance network is developed to capture global context information within the video frame for representing appearance consistency accurately. Additionally, a novel motion coherence learning scheme is proposed to learn the motion prior in a video sequence effectively. Finally, the learnt internal appearance consistency and motion coherence are implicitly propagated to the missing regions to achieve inpainting well. Extensive experiments conducted on the DAVIS dataset show that the proposed model obtains the superior performance in terms of quantitative measurements and produces more visually plausible results compared with the state-of-the-art methods.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 3","pages":"827-841"},"PeriodicalIF":7.3000,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12405","citationCount":"0","resultStr":"{\"title\":\"Appearance consistency and motion coherence learning for internal video inpainting\",\"authors\":\"Ruixin Liu, Yuesheng Zhu, GuiBo Luo\",\"doi\":\"10.1049/cit2.12405\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Internal learning-based video inpainting methods have shown promising results by exploiting the intrinsic properties of the video to fill in the missing region without external dataset supervision. However, existing internal learning-based video inpainting methods would produce inconsistent structures or blurry textures due to the insufficient utilisation of motion priors within the video sequence. In this paper, the authors propose a new internal learning-based video inpainting model called appearance consistency and motion coherence network (ACMC-Net), which can not only learn the recurrence of appearance prior but can also capture motion coherence prior to improve the quality of the inpainting results. In ACMC-Net, a transformer-based appearance network is developed to capture global context information within the video frame for representing appearance consistency accurately. Additionally, a novel motion coherence learning scheme is proposed to learn the motion prior in a video sequence effectively. Finally, the learnt internal appearance consistency and motion coherence are implicitly propagated to the missing regions to achieve inpainting well. 
Extensive experiments conducted on the DAVIS dataset show that the proposed model obtains the superior performance in terms of quantitative measurements and produces more visually plausible results compared with the state-of-the-art methods.</p>\",\"PeriodicalId\":46211,\"journal\":{\"name\":\"CAAI Transactions on Intelligence Technology\",\"volume\":\"10 3\",\"pages\":\"827-841\"},\"PeriodicalIF\":7.3000,\"publicationDate\":\"2025-02-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12405\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"CAAI Transactions on Intelligence Technology\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1049/cit2.12405\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"CAAI Transactions on Intelligence Technology","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/cit2.12405","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Appearance consistency and motion coherence learning for internal video inpainting
Internal learning-based video inpainting methods have shown promising results by exploiting the intrinsic properties of a video to fill in missing regions without external dataset supervision. However, existing internal learning-based methods can produce inconsistent structures or blurry textures because they make insufficient use of the motion priors within the video sequence. In this paper, the authors propose a new internal learning-based video inpainting model, the appearance consistency and motion coherence network (ACMC-Net), which not only learns the recurrence of the appearance prior but also captures the motion coherence prior, improving the quality of the inpainting results. In ACMC-Net, a transformer-based appearance network is developed to capture global context information within each video frame so that appearance consistency is represented accurately. Additionally, a novel motion coherence learning scheme is proposed to learn the motion prior in a video sequence effectively. Finally, the learnt internal appearance consistency and motion coherence are implicitly propagated to the missing regions to produce high-quality inpainting. Extensive experiments on the DAVIS dataset show that the proposed model achieves superior performance in quantitative measurements and produces more visually plausible results than state-of-the-art methods.
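To make the internal-learning setup described in the abstract concrete, the following is a minimal, hypothetical sketch: a small transformer-based appearance branch reconstructs masked frames of the test video itself (no external training data), and a warping-based motion coherence term encourages agreement between neighbouring frames. All module names, sizes, losses, and the placeholder flow field are illustrative assumptions, not the authors' ACMC-Net implementation.

# Hypothetical sketch of an internal-learning video inpainting loop (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F


class AppearanceTransformer(nn.Module):
    """Toy transformer over non-overlapping patches of a single frame."""

    def __init__(self, patch=8, dim=128, depth=2, heads=4):
        super().__init__()
        self.patch = patch
        self.embed = nn.Linear(3 * patch * patch, dim)
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 2, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, 3 * patch * patch)

    def forward(self, frame):                      # frame: (B, 3, H, W)
        B, C, H, W = frame.shape
        p = self.patch
        tokens = F.unfold(frame, p, stride=p)      # (B, 3*p*p, N) patch tokens
        tokens = tokens.transpose(1, 2)            # (B, N, 3*p*p)
        out = self.head(self.encoder(self.embed(tokens)))
        out = out.transpose(1, 2)                  # (B, 3*p*p, N)
        return F.fold(out, (H, W), p, stride=p)    # reconstructed frame


def warp(frame, flow):
    """Backward-warp a batch of frames with a dense flow field (B, 2, H, W)."""
    B, _, H, W = frame.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(frame.device)   # (2, H, W)
    coords = grid.unsqueeze(0) + flow
    coords[:, 0] = 2 * coords[:, 0] / (W - 1) - 1                  # normalise x
    coords[:, 1] = 2 * coords[:, 1] / (H - 1) - 1                  # normalise y
    return F.grid_sample(frame, coords.permute(0, 2, 3, 1), align_corners=True)


# Internal learning: optimise on the test video itself, with no external data.
net = AppearanceTransformer()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

frames = torch.rand(5, 3, 64, 64)                   # toy video clip
masks = (torch.rand(5, 1, 64, 64) > 0.2).float()    # 1 = known pixel, 0 = hole
flows = torch.zeros(4, 2, 64, 64)                   # placeholder flows t -> t+1

for step in range(100):
    pred = net(frames)
    # Appearance consistency: reconstruct only the known (unmasked) pixels.
    loss_app = F.l1_loss(pred * masks, frames * masks)
    # Motion coherence: the prediction at t, warped towards t+1, should agree
    # with the prediction at t+1 (a common warping-based proxy).
    warped = warp(pred[:-1], flows)
    loss_mot = F.l1_loss(warped, pred[1:])
    loss = loss_app + 0.1 * loss_mot
    opt.zero_grad()
    loss.backward()
    opt.step()

In the actual model, the motion prior is learnt from the video sequence by the proposed motion coherence learning scheme; the zero-valued placeholder flow above only marks where a warping-based coherence term would enter the objective.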
Journal introduction:
CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. It is a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI), providing research that is openly accessible to read and share worldwide.