{"title":"将图像去毛刺解耦为两部分:去焦点模糊的层次模型","authors":"Pengwei Liang;Junjun Jiang;Xianming Liu;Jiayi Ma","doi":"10.1109/TCI.2024.3443732","DOIUrl":null,"url":null,"abstract":"Defocus deblurring, especially when facing spatially varying blur due to scene depth, remains a challenging problem. While recent advancements in network architectures have predominantly addressed high-frequency details, the importance of scene understanding for deblurring remains paramount. A crucial aspect of this understanding is \n<italic>contextual information</i>\n, which captures vital high-level semantic cues essential for grasping the context and object outlines. Recognizing and effectively capitalizing on these cues can lead to substantial improvements in image recovery. With this foundation, we propose a novel method that integrates spatial details and contextual information, offering significant advancements in defocus deblurring. Consequently, we introduce a novel hierarchical model, built upon the capabilities of the Vision Transformer (ViT). This model seamlessly encodes both spatial details and contextual information, yielding a robust solution. In particular, our approach decouples the complex deblurring task into two distinct subtasks. The first is handled by a primary feature encoder that transforms blurred images into detailed representations. The second involves a contextual encoder that produces abstract and sharp representations from the primary ones. The combined outputs from these encoders are then merged by a decoder to reproduce the sharp target image. Our evaluation across multiple defocus deblurring datasets demonstrates that the proposed method achieves compelling performance.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1207-1220"},"PeriodicalIF":4.2000,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Decoupling Image Deblurring Into Twofold: A Hierarchical Model for Defocus Deblurring\",\"authors\":\"Pengwei Liang;Junjun Jiang;Xianming Liu;Jiayi Ma\",\"doi\":\"10.1109/TCI.2024.3443732\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Defocus deblurring, especially when facing spatially varying blur due to scene depth, remains a challenging problem. While recent advancements in network architectures have predominantly addressed high-frequency details, the importance of scene understanding for deblurring remains paramount. A crucial aspect of this understanding is \\n<italic>contextual information</i>\\n, which captures vital high-level semantic cues essential for grasping the context and object outlines. Recognizing and effectively capitalizing on these cues can lead to substantial improvements in image recovery. With this foundation, we propose a novel method that integrates spatial details and contextual information, offering significant advancements in defocus deblurring. Consequently, we introduce a novel hierarchical model, built upon the capabilities of the Vision Transformer (ViT). This model seamlessly encodes both spatial details and contextual information, yielding a robust solution. In particular, our approach decouples the complex deblurring task into two distinct subtasks. The first is handled by a primary feature encoder that transforms blurred images into detailed representations. The second involves a contextual encoder that produces abstract and sharp representations from the primary ones. 
The combined outputs from these encoders are then merged by a decoder to reproduce the sharp target image. Our evaluation across multiple defocus deblurring datasets demonstrates that the proposed method achieves compelling performance.\",\"PeriodicalId\":56022,\"journal\":{\"name\":\"IEEE Transactions on Computational Imaging\",\"volume\":\"10 \",\"pages\":\"1207-1220\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2024-08-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computational Imaging\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10637737/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computational Imaging","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10637737/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Decoupling Image Deblurring Into Twofold: A Hierarchical Model for Defocus Deblurring
Abstract: Defocus deblurring remains a challenging problem, especially in the presence of spatially varying blur caused by scene depth. While recent advances in network architectures have focused predominantly on high-frequency details, scene understanding remains paramount for deblurring. A crucial aspect of this understanding is contextual information, which captures the high-level semantic cues needed to grasp scene context and object outlines. Recognizing and effectively exploiting these cues can lead to substantial improvements in image recovery. On this foundation, we propose a novel method that integrates spatial details and contextual information, offering significant advances in defocus deblurring. Specifically, we introduce a hierarchical model built on the capabilities of the Vision Transformer (ViT); it encodes both spatial details and contextual information, yielding a robust solution. Our approach decouples the complex deblurring task into two distinct subtasks: the first is handled by a primary feature encoder that transforms blurred images into detailed representations, and the second by a contextual encoder that produces abstract, sharp representations from the primary ones. A decoder then merges the outputs of the two encoders to reproduce the sharp target image. Our evaluation across multiple defocus deblurring datasets demonstrates that the proposed method achieves compelling performance.
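To make the two-subtask decoupling concrete, below is a minimal PyTorch sketch of such a hierarchy. Everything in it (the module names, channel widths, patch size, and residual decoder) is an illustrative assumption based only on the abstract, not the authors' actual implementation, which is described in the full paper.

```python
# Hypothetical sketch: a primary (detail) encoder, a ViT-style contextual
# encoder fed by it, and a decoder that fuses both. Sizes are assumptions.
import torch
import torch.nn as nn


class PrimaryEncoder(nn.Module):
    """Maps a blurred image to a spatially detailed feature map."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class ContextualEncoder(nn.Module):
    """Produces abstract, context-level features from the primary ones
    via ViT-style self-attention over downsampled tokens."""

    def __init__(self, channels: int = 64, patch: int = 8, depth: int = 2):
        super().__init__()
        self.down = nn.Conv2d(channels, channels, patch, stride=patch)  # tokenize
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=4, dim_feedforward=4 * channels,
            batch_first=True,
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        self.up = nn.Upsample(scale_factor=patch, mode="bilinear",
                              align_corners=False)

    def forward(self, feat):
        tokens = self.down(feat)                 # (B, C, H/p, W/p)
        b, c, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)  # (B, HW/p^2, C)
        seq = self.transformer(seq)
        ctx = seq.transpose(1, 2).reshape(b, c, h, w)
        return self.up(ctx)                      # back to feature resolution


class HierarchicalDeblurNet(nn.Module):
    """Fuses detailed and contextual features and decodes a sharp image."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.primary = PrimaryEncoder(channels)
        self.context = ContextualEncoder(channels)
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, blurred):
        detail = self.primary(blurred)
        context = self.context(detail)   # second subtask feeds on the first
        fused = torch.cat([detail, context], dim=1)
        return blurred + self.decoder(fused)  # residual sharp-image prediction


if __name__ == "__main__":
    net = HierarchicalDeblurNet()
    out = net(torch.randn(1, 3, 128, 128))  # H, W divisible by the patch size
    print(out.shape)                        # torch.Size([1, 3, 128, 128])
```

The design choice this sketch illustrates is the division of labor the abstract describes: the convolutional primary encoder preserves fine spatial detail, while the contextual encoder applies self-attention over coarse tokens to capture scene-level semantic cues cheaply; the decoder merges the two streams to reconstruct the sharp image.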
Journal Introduction:
The IEEE Transactions on Computational Imaging will publish articles where computation plays an integral role in the image formation process. Papers will cover all areas of computational imaging ranging from fundamental theoretical methods to the latest innovative computational imaging system designs. Topics of interest will include advanced algorithms and mathematical techniques, model-based data inversion, methods for image and signal recovery from sparse and incomplete data, techniques for non-traditional sensing of image data, methods for dynamic information acquisition and extraction from imaging sensors, software and hardware for efficient computation in imaging systems, and highly novel imaging system design.