Yang Zheng, Qing Li, Jiangyun Li, Zhenghao Xi, Jie Liu
IET Image Processing, vol. 19, no. 1. Published 2025-04-21. DOI: 10.1049/ipr2.70071. https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70071
Enhancing Semantic Information Representation in Multi-View Geo-Localization through Dual-Branch Network with Feature Consistency Enhancement and Multi-Level Feature Mining
Metric learning is fundamental to multi-view geo-localization, as it aims to establish a distance metric that minimizes the feature space distance between similar data points while maximizing the separation between dissimilar ones. However, in Siamese networks employed for metric learning, individual branches may exhibit discrepancies in their interpretation of semantic information from input data, resulting in semantically inconsistent feature representations. To address this issue, a method is designed to enhance significant region consistency within multi-view spaces by integrating feature consistency enhancement (FCE) and multi-level feature mining (MLFM) techniques into a dual-branch network. The FCE method emphasizes critical components of the input data, ensuring feature consistency between the two branches. Additionally, the MLFM mechanism facilitates feature integration across multiple levels, thereby enabling a more comprehensive extraction of semantic information. This approach enhances semantic understanding and promotes feature consistency across branches. The proposed method achieves AP values of 82.38% for drone-to-satellite and 77.36% for satellite-to-drone image matching. Notably, the method maintains computational efficiency without significantly affecting inference time. Additionally, improvements are observed in R@1, R@5 and R@10 metrics. The experimental results show that integrating FCE and MLFM into the dual-branch network improves semantic representation and outperforms existing methods.
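The abstract reports results as R@1, R@5, R@10 and AP. As a rough illustration of how such retrieval metrics are commonly computed for cross-view matching, here is a minimal sketch in Python with NumPy. It is not the authors' evaluation code: the function names, the use of cosine similarity, and the non-interpolated form of average precision are all assumptions made for this example, and benchmark toolkits may compute AP differently (for instance, as an interpolated area under the precision-recall curve).

```python
import numpy as np


def rank_gallery(query_feats, gallery_feats):
    """Rank gallery items for each query by descending cosine similarity.

    Hypothetical helper: returns an array of shape (num_queries, num_gallery)
    holding gallery indices, best match first.
    """
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T
    return np.argsort(-sims, axis=1, kind="stable")


def recall_at_k(ranking, query_labels, gallery_labels, k):
    """Fraction of queries with at least one correct match in the top k."""
    topk_labels = gallery_labels[ranking[:, :k]]
    hits = (topk_labels == query_labels[:, None]).any(axis=1)
    return float(hits.mean())


def mean_average_precision(ranking, query_labels, gallery_labels):
    """Mean of per-query, non-interpolated average precision (an assumption)."""
    aps = []
    for row, label in zip(ranking, query_labels):
        matches = gallery_labels[row] == label
        if not matches.any():
            continue  # skip queries that have no true match in the gallery
        hit_ranks = np.flatnonzero(matches) + 1  # 1-based ranks of true matches
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        aps.append(precisions.mean())
    return float(np.mean(aps)) if aps else 0.0


if __name__ == "__main__":
    # Toy 2-D features standing in for drone (query) and satellite (gallery)
    # embeddings; labels are location IDs. All values are made up.
    query_feats = np.array([[1.0, 0.0], [0.6, 0.4]])
    query_labels = np.array([0, 1])
    gallery_feats = np.array([[0.9, 0.1], [0.1, 0.9], [0.7, 0.3]])
    gallery_labels = np.array([0, 1, 0])

    ranking = rank_gallery(query_feats, gallery_feats)
    for k in (1, 3):
        print(f"R@{k}:", recall_at_k(ranking, query_labels, gallery_labels, k))
    print("mAP:", mean_average_precision(ranking, query_labels, gallery_labels))
```

Swapping the roles of the two feature sets gives the satellite-to-drone direction. The two directions generally yield different scores, which is consistent with the abstract reporting separate values for each.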
About the journal:
The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications.
Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.
Current Special Issue Call for Papers:
Evolutionary Computation for Image Processing - https://digital-library.theiet.org/files/IET_IPR_CFP_EC.pdf
AI-Powered 3D Vision - https://digital-library.theiet.org/files/IET_IPR_CFP_AIPV.pdf
Multidisciplinary advancement of Imaging Technologies: From Medical Diagnostics and Genomics to Cognitive Machine Vision, and Artificial Intelligence - https://digital-library.theiet.org/files/IET_IPR_CFP_IST.pdf
Deep Learning for 3D Reconstruction - https://digital-library.theiet.org/files/IET_IPR_CFP_DLR.pdf