An efficient deep template matching and in-plane pose estimation method via template-aware dynamic convolution

Impact factor: 7.5 | Zone 1: Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Ke Jia, Ji Zhou, Hanxin Li, Zhigan Zhou, Haojie Chu, Xiaojie Li
Journal: Expert Systems with Applications, Volume 298, Article 129813
Publication date: 2025-09-23
Publication type: Journal Article
DOI: 10.1016/j.eswa.2025.129813
URL: https://www.sciencedirect.com/science/article/pii/S0957417425034281
Citations: 0

Abstract


In industrial inspection and component alignment tasks, template matching requires efficient estimation of a target’s position and geometric state (rotation and scaling) under complex backgrounds to support precise downstream operations. Traditional methods rely on exhaustive enumeration of angles and scales, leading to low efficiency under compound transformations. Meanwhile, most deep learning-based approaches only estimate similarity scores without explicitly modeling geometric pose, making them inadequate for real-world deployment. To overcome these limitations, we propose a lightweight end-to-end framework that reformulates template matching as joint localization and geometric regression, outputting the center coordinates, rotation angle, and independent horizontal and vertical scales. A Template-Aware Dynamic Convolution Module (TDCM) dynamically injects template features at inference to guide generalizable matching. The compact network integrates depthwise separable convolutions and pixel shuffle for efficient matching. To enable geometric-annotation-free training, we introduce a rotation-shear-based augmentation strategy with structure-aware pseudo labels. A lightweight refinement module further improves angle and scale precision via local optimization. Experiments show our 3.07M model achieves high precision and approximately 14 ms inference under compound transformations. It also demonstrates strong robustness in small-template and multi-object scenarios, making it highly suitable for deployment in real-time industrial applications. The code is available at: https://github.com/ZhouJ6610/PoseMatch-TDCM.
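
To make the five regressed quantities concrete, the following minimal sketch shows one way a center, a rotation angle, and independent horizontal and vertical scales can be turned into the four corners of the matched region. It is an illustration only and is not taken from the paper or its repository: the function name `pose_to_corners`, its parameters, and the convention (scale along the template's own axes, then rotate about the template center, then translate) are all assumptions.

```python
import math


def pose_to_corners(cx, cy, angle_deg, sx, sy, tmpl_w, tmpl_h):
    """Map the four template corners into image coordinates.

    Hypothetical helper. Assumed convention (not confirmed by the paper):
    1. scale the template by (sx, sy) along its own axes,
    2. rotate with the standard matrix [[cos, -sin], [sin, cos]],
    3. translate the template center to (cx, cy).
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = tmpl_w / 2.0, tmpl_h / 2.0

    # Corner offsets from the template center, in template coordinates.
    offsets = [(-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h)]

    corners = []
    for dx, dy in offsets:
        u, v = sx * dx, sy * dy            # independent x/y scaling
        x = cx + cos_t * u - sin_t * v     # rotation, then translation
        y = cy + sin_t * u + cos_t * v
        corners.append((x, y))
    return corners


if __name__ == "__main__":
    for corner in pose_to_corners(100.0, 50.0, 0.0, 2.0, 1.0, 10.0, 20.0):
        print(corner)
```

Under this assumed convention, a 10 x 20 template centered at (100, 50) with zero rotation and scales (2, 1) maps to the corners (90, 40), (110, 40), (110, 60), and (90, 60). A different composition order or angle sign would give different corners, so the convention used by the released code should be checked before reusing this sketch.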
Source journal
Expert Systems with Applications (Engineering and Technology - Engineering: Electrical and Electronic)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.