Nanjun Yuan, Fan Yang, Yuefeng Zhang, Luxia Ai, Wenbing Tao
{"title":"Learning hierarchical image feature for efficient image rectification","authors":"Nanjun Yuan, Fan Yang, Yuefeng Zhang, Luxia Ai, Wenbing Tao","doi":"10.1016/j.neucom.2025.130646","DOIUrl":null,"url":null,"abstract":"<div><div>Image stitching methods often use single-homography or multi-homography estimation for alignment, resulting in images with undesirable irregular boundaries. To address this, cropping and image inpainting are the common operations but discard image regions or introduce content that differs from reality. Recently, deep learning-based methods improve the content fidelity of the rectified images, while suffering from distortion, artifacts, and discontinuous deformations between adjacent image regions. In this work, we propose an efficient network based on the transformer (Rectformer) for image rectification. Specifically, we propose the Global and Local Features (GLF) module, which consists of the Hybrid Self-Attention module and Dynamic Convolution module to capture hierarchical image features. We further introduce two auxiliary losses for better image rectification, bidirectional contextual (BC) loss and deformation consistency (DC) loss. The bidirectional contextual loss encourages the model to preserve image local structure information. The loss of deformation consistency improves the network’s geometric recovery and generalization capabilities through a self-supervised learning strategy. 
Finally, extensive experiments demonstrate that our method outperforms the existing state-of-the-art methods for rotation correction and rectangling.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"648 ","pages":"Article 130646"},"PeriodicalIF":5.5000,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225013189","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Image stitching methods often use single-homography or multi-homography estimation for alignment, resulting in images with undesirable irregular boundaries. To address this, cropping and image inpainting are the common remedies, but they either discard image regions or introduce content that differs from reality. Recently, deep learning-based methods have improved the content fidelity of rectified images, yet they still suffer from distortion, artifacts, and discontinuous deformations between adjacent image regions. In this work, we propose an efficient transformer-based network (Rectformer) for image rectification. Specifically, we propose the Global and Local Features (GLF) module, which consists of a Hybrid Self-Attention module and a Dynamic Convolution module to capture hierarchical image features. We further introduce two auxiliary losses for better image rectification: a bidirectional contextual (BC) loss and a deformation consistency (DC) loss. The bidirectional contextual loss encourages the model to preserve local structure information in the image. The deformation consistency loss improves the network's geometric recovery and generalization capabilities through a self-supervised learning strategy. Finally, extensive experiments demonstrate that our method outperforms existing state-of-the-art methods for rotation correction and rectangling.
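The abstract does not give the exact form of the deformation consistency loss, but a self-supervised strategy of this kind is typically realized by generating a synthetic warp with known parameters and penalizing the gap between the predicted and generated deformation fields. The sketch below illustrates that idea with a hypothetical L1 formulation; the function name, the flow-field representation, and the loss form are all assumptions, not the paper's definition.

```python
import numpy as np

def deformation_consistency_loss(pred_flow, synth_flow):
    """Hypothetical DC-style loss: mean absolute error between a
    predicted deformation field and the synthetically generated
    ground-truth field (self-supervised, since the warp is known)."""
    return float(np.mean(np.abs(pred_flow - synth_flow)))

# Toy example: a known synthetic (dx, dy) flow field on a 4x4 grid
# and a prediction that is off by a constant offset of 0.1.
rng = np.random.default_rng(0)
synth = rng.normal(size=(2, 4, 4))  # 2 channels: x- and y-displacements
pred = synth + 0.1                  # imperfect prediction
loss = deformation_consistency_loss(pred, synth)  # -> 0.1
```

Because the supervision signal is constructed from the known warp rather than from labels, such a term can be computed on unlabeled images, which is one plausible reason the abstract credits it with improving generalization.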
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.