IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society - Latest Articles

Consistency-Queried Transformer for Audio-Visual Segmentation
Ying Lv; Zhi Liu; Xiaojun Chang
DOI: 10.1109/TIP.2025.3563076 | Published: 2025-04-28 | Vol. 34, pp. 2616-2627
Abstract: Audio-visual segmentation (AVS) aims to segment objects in audio-visual content. The effective interaction between audio and visual features has garnered significant attention in the multimodal domain. Despite significant advancements, most existing AVS methods are hampered by multimodal inconsistencies. These inconsistencies primarily manifest as a mismatch between audio and visual information guided by audio cues, wherein visual features often dominate the audio modality. To address this issue, we propose the Consistency-Queried Transformer (CQFormer), a novel framework for AVS tasks that leverages the transformer architecture. This framework features a Consistency Query Generator (CQG) and a Query-Aligned Matching (QAM) module. A Noise Contrastive Estimation (NCE) loss enhances modality matching and consistency by minimizing the distributional differences between audio and visual features, facilitating effective fusion and interaction between these features. Additionally, introducing the consistency query during the decoding stage strengthens consistency constraints and object-level semantic information, further improving the accuracy and stability of audio-visual segmentation. Extensive experiments on popular audio-visual segmentation benchmarks demonstrate that the proposed CQFormer achieves state-of-the-art performance.
Citations: 0
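The abstract does not spell out the NCE objective, so the following is only a minimal sketch of a symmetric InfoNCE-style contrastive loss that pulls paired audio and visual embeddings together; the function name `audio_visual_nce`, the temperature value, and the assumption that both features are already projected to a common dimension are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def audio_visual_nce(audio_feat, visual_feat, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired audio/visual embeddings.

    audio_feat, visual_feat: (B, D) tensors; row i of each tensor is assumed to
    come from the same clip (the positive pair), all other rows act as negatives.
    """
    a = F.normalize(audio_feat, dim=-1)
    v = F.normalize(visual_feat, dim=-1)
    logits = a @ v.t() / temperature          # (B, B) cosine-similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    # Contrast audio against all visual features and vice versa.
    loss_a2v = F.cross_entropy(logits, targets)
    loss_v2a = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_a2v + loss_v2a)
```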
Novel Computational Photography for Soft-Focus Effect in Automatic Post Production
Hao-Yu Tsai; Morris C.-H. Tsai; Scott C.-H. Huang; Hsiao-Chun Wu
DOI: 10.1109/TIP.2025.3562071 | Published: 2025-04-23 | Vol. 34, pp. 2560-2574
Abstract: The well-known soft-focus effect, which relies on either special optical filters or manual post-production techniques, has long been an intriguing and powerful tool in photography. Nonetheless, to the best of our knowledge, how to impose the soft-focus effect automatically using image-processing (computational photography) algorithms alone has never been addressed in the literature. In this work, we make the first attempt to design an automatic, optical-filter-free approach that creates the soft-focus effects desired by individual users. Our approach first investigates a physical optical filter, the Kenko Black Mist No. 5, and estimates the corresponding kernel matrix (i.e., the system impulse response matrix) using our proposed irradiance-domain kernel-matrix estimation framework. Furthermore, we demonstrate that it is not feasible to find a kernel matrix that precisely characterizes the soft-focus effect using only a pixel-value-domain image (a regular photo) in post production. To overcome this problem, we establish a novel pixel-value-to-pseudo-irradiance map such that a pseudo-irradiance-domain image can be obtained directly from any pixel-value-domain image. Finally, the soft-focus effect is created by the two-dimensional convolution of the pseudo-irradiance-domain image with the estimated kernel. To evaluate the proposed automatic scheme, we compare its results with those of the physical optical filter in terms of the DCT-KLD (Kullback-Leibler divergence of the discrete cosine transform) and the conventional PSNR (peak signal-to-noise ratio). Experiments show that the proposed scheme achieves very small DCT-KLDs and very large PSNRs against the ground truth, namely the results from the physical optical filter.
Citations: 0
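As a rough illustration of the pipeline described above (pixel values mapped to a pseudo-irradiance domain, convolved with an estimated kernel, then mapped back), here is a minimal sketch. The power-law map and the Gaussian kernel are placeholders only: the paper's actual pixel-value-to-pseudo-irradiance map and the kernel estimated from the Kenko Black Mist No. 5 filter are its contributions and are not reproduced here.

```python
import numpy as np
from scipy.signal import fftconvolve

def gaussian_kernel(size=31, sigma=6.0):
    """Placeholder blur kernel; the paper estimates its kernel from the
    physical filter rather than assuming a Gaussian shape."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def soft_focus(image, kernel, gamma=2.2):
    """Convolve in a pseudo-irradiance domain, then map back to pixel values.

    image: float array in [0, 1], shape (H, W) or (H, W, 3).
    The power-law map below is only a stand-in for the paper's learned
    pixel-value-to-pseudo-irradiance map.
    """
    irradiance = np.power(image, gamma)                  # pixel -> pseudo-irradiance
    if irradiance.ndim == 3:
        blurred = np.stack(
            [fftconvolve(irradiance[..., c], kernel, mode="same") for c in range(3)],
            axis=-1)
    else:
        blurred = fftconvolve(irradiance, kernel, mode="same")
    return np.clip(np.power(blurred, 1.0 / gamma), 0.0, 1.0)  # back to pixel values
```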
Disentangled Noisy Correspondence Learning
Zhuohang Dang; Minnan Luo; Jihong Wang; Chengyou Jia; Haochen Han; Herun Wan; Guang Dai; Xiaojun Chang; Jingdong Wang
DOI: 10.1109/TIP.2025.3559457 | Published: 2025-04-21 | Vol. 34, pp. 2602-2615
Abstract: Cross-modal retrieval is crucial for understanding latent correspondences across modalities. However, existing methods implicitly assume well-matched training data, which is impractical as real-world data inevitably involves imperfect alignments, i.e., noisy correspondences. Although some works explore similarity-based strategies to address such noise, they suffer from sub-optimal similarity predictions influenced by modality-exclusive information (MEI), e.g., background noise in images and abstract definitions in texts. This issue arises because MEI is not shared across modalities, so aligning it during training can markedly mislead similarity predictions. Moreover, although intuitive, directly applying previous cross-modal disentanglement methods suffers from limited noise tolerance and disentanglement efficacy. Inspired by the robustness of information bottlenecks against noise, we introduce DisNCL, a novel information-theoretic framework for feature Disentanglement in Noisy Correspondence Learning, to adaptively balance the extraction of modality-invariant information (MII) and MEI with certifiably optimal cross-modal disentanglement efficacy. DisNCL then enhances similarity predictions in the modality-invariant subspace, thereby greatly boosting similarity-based alleviation strategies for noisy correspondences. Furthermore, DisNCL introduces soft matching targets to model the noisy many-to-many relationships inherent in multi-modal inputs for noise-robust and accurate cross-modal alignment. Extensive experiments confirm DisNCL's efficacy, with a 2% average recall improvement. Mutual information estimation and visualization results show that DisNCL learns meaningful MII/MEI subspaces, validating our theoretical analyses.
Citations: 0
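DisNCL's disentanglement is more involved than a plain information bottleneck, but the underlying idea it appeals to, trading compression against task-relevant information, can be sketched with a minimal variational IB head. The class name `VIBHead` and the loss combination in the final comment are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class VIBHead(nn.Module):
    """Minimal variational information bottleneck head: encode a feature into a
    Gaussian latent, penalize its KL divergence to a standard normal (compression),
    and let a downstream loss keep task-relevant (modality-invariant) information."""
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.mu = nn.Linear(in_dim, z_dim)
        self.logvar = nn.Linear(in_dim, z_dim)

    def forward(self, x):
        mu, logvar = self.mu(x), self.logvar(x)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(-1).mean()
        return z, kl

# Hypothetical usage: combine a cross-modal alignment loss on the latents with the
# KL terms, e.g. total_loss = alignment_loss(z_img, z_txt) + beta * (kl_img + kl_txt)
```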
Unsupervised Range-Nullspace Learning Prior for Multispectral Images Reconstruction
Yurong Chen; Yaonan Wang; Hui Zhang
DOI: 10.1109/TIP.2025.3560430 | Published: 2025-04-18 | Vol. 34, pp. 2513-2528
Abstract: Snapshot Spectral Imaging (SSI) techniques, with the ability to capture both spectral and spatial information in a single exposure, have been found useful in a wide range of applications. SSI systems generally operate within the ‘encoding-decoding’ framework, leveraging the synergism of optical hardware and reconstruction algorithms. Typically, reconstructing desired spectral images from SSI measurements is an ill-posed and challenging problem. Existing studies utilize either model-based or deep learning-based methods, but both have their drawbacks. Model-based algorithms suffer from high computational costs, while supervised learning-based methods rely on large paired training data. In this paper, we propose a novel Unsupervised range-Nullspace learning (UnNull) prior for spectral image reconstruction. UnNull explicitly models the data via subspace decomposition, offering enhanced interpretability and generalization ability. Specifically, UnNull considers that the spectral images can be decomposed into the range and null subspaces. The features projected onto the range subspace are mainly low-frequency information, while features in the nullspace represent high-frequency information. Comprehensive multispectral demosaicing and reconstruction experiments demonstrate the superior performance of our proposed algorithm.
Citations: 0
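The range/null-space split referred to above is commonly written with the pseudo-inverse identity x = A⁺Ax + (I − A⁺A)x, where A is the forward operator: the first term is fixed by the measurement y = Ax, the second is invisible to A. A minimal numerical sketch with a small dense operator follows; real SSI forward operators are large and structured, so this is only for intuition, and the function name is illustrative.

```python
import numpy as np

def range_nullspace_split(x, A):
    """Split a signal x into range-space and null-space components w.r.t. a
    linear forward operator A, using x = A^+ A x + (I - A^+ A) x."""
    A_pinv = np.linalg.pinv(A)
    x_range = A_pinv @ (A @ x)      # component determined by the measurement y = A x
    x_null = x - x_range            # component invisible to A (A @ x_null ~ 0)
    return x_range, x_null

# Example with a random 3x8 operator and an 8-dimensional signal
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))
x = rng.standard_normal(8)
xr, xn = range_nullspace_split(x, A)
assert np.allclose(A @ xn, 0.0, atol=1e-8)
```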
NP-Hand: Novel Perspective Hand Image Synthesis Guided by Normals
Binghui Zuo; Wenqian Sun; Zimeng Zhao; Xiaohan Yuan; Yangang Wang
DOI: 10.1109/TIP.2025.3560241 | Published: 2025-04-18 | Vol. 34, pp. 2435-2449
Abstract: Synthesizing multi-view images that are geometrically consistent with a given single-view image has been one of the hot issues in AIGC in recent years. Existing methods have achieved impressive performance on objects with symmetry or rigidity, but they are inappropriate for the human hand, because an image-captured human hand has more diverse poses and less distinctive textures. In this paper, we propose NP-Hand, a framework that elegantly combines a diffusion model and a generative adversarial network: the multi-step diffusion is trained to synthesize low-resolution novel perspectives, while the single-step generator is exploited to further enhance synthesis quality. To maintain consistency between inputs and synthesis, we introduce normal maps into NP-Hand to guide the whole synthesis process. Comprehensive evaluations demonstrate that the proposed framework is superior to existing state-of-the-art models and more suitable for synthesizing hand images with faithful structures and realistic appearance details. The code will be released on our website.
Citations: 0
Lightweight RGB-D Salient Object Detection From a Speed-Accuracy Tradeoff Perspective
Songsong Duan; Xi Yang; Nannan Wang; Xinbo Gao
DOI: 10.1109/TIP.2025.3560488 | Published: 2025-04-18 | Vol. 34, pp. 2529-2543
Abstract: Current RGB-D methods usually leverage large-scale backbones to improve accuracy but sacrifice efficiency, while several existing lightweight methods struggle to achieve high-precision performance. To balance efficiency and performance, we propose a Speed-Accuracy Tradeoff Network (SATNet) for lightweight RGB-D SOD that addresses three fundamental aspects: depth quality, modality fusion, and feature representation. Concerning depth quality, we introduce the Depth Anything Model to generate high-quality depth maps, which effectively alleviates the multi-modal gaps in current datasets. For modality fusion, we propose a Decoupled Attention Module (DAM) to explore the consistency within and between modalities. Here, the multi-modal features are decoupled into dual-view feature vectors to project the discriminable information of the feature maps. For feature representation, we develop a Dual Information Representation Module (DIRM) with a bi-directional inverted framework to enlarge the limited feature space generated by lightweight backbones. DIRM models texture features and saliency features to enrich the feature space, and employs two-way prediction heads to optimize its parameters through bi-directional backpropagation. Finally, we design a Dual Feature Aggregation Module (DFAM) in the decoder to aggregate texture and saliency features. Extensive experiments on five public RGB-D SOD datasets show that the proposed SATNet surpasses state-of-the-art (SOTA) CNN-based heavyweight models while remaining lightweight, with 5.2M parameters and 415 FPS. The code is available at https://github.com/duan-song/SATNet
Citations: 0
NeRF-Det++: Incorporating Semantic Cues and Perspective-Aware Depth Supervision for Indoor Multi-View 3D Detection
Chenxi Huang; Yuenan Hou; Weicai Ye; Di Huang; Xiaoshui Huang; Binbin Lin; Deng Cai
DOI: 10.1109/TIP.2025.3560240 | Published: 2025-04-17 | Vol. 34, pp. 2575-2587
Abstract: NeRF-Det has achieved impressive performance in indoor multi-view 3D detection by innovatively utilizing NeRF to enhance representation learning. Despite its notable performance, we uncover three decisive shortcomings in its current design: semantic ambiguity, inappropriate sampling, and insufficient utilization of depth supervision. To combat these problems, we present three corresponding solutions. 1) Semantic Enhancement: we project the freely available 3D segmentation annotations onto the 2D plane and leverage the corresponding 2D semantic maps as supervision signals, significantly enhancing the semantic awareness of multi-view detectors. 2) Perspective-Aware Sampling: instead of employing a uniform sampling strategy, we put forward a perspective-aware sampling policy that samples densely near the camera and sparsely in the distance, collecting valuable geometric clues more effectively. 3) Ordinal Residual Depth Supervision: instead of directly regressing depth values, which are difficult to optimize, we divide the depth range of each scene into a fixed number of ordinal bins and reformulate depth prediction as the classification of depth bins combined with the regression of residual depth values, thereby benefiting the depth learning process. The resulting algorithm, NeRF-Det++, exhibits appealing performance on the ScanNetV2 and ARKITScenes datasets. Notably, on ScanNetV2, NeRF-Det++ outperforms the competitive NeRF-Det by +1.9% in mAP@0.25 and +3.5% in mAP@0.50. The code will be publicly available at https://github.com/mrsempress/NeRF-Detplusplus
Citations: 0
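A hedged sketch of how ordinal residual depth supervision might be implemented: depth is quantized into uniform bins, a classification loss supervises the bin index, and a regression loss supervises the residual from the bin center. The bin spacing, the depth range, the equal loss weighting, and supervising only the ground-truth bin's residual are assumptions for illustration, not details from the paper.

```python
import torch
import torch.nn.functional as F

def ordinal_residual_depth_loss(bin_logits, residual_pred, depth_gt,
                                d_min=0.2, d_max=8.0, num_bins=64):
    """Hypothetical loss combining depth-bin classification with residual regression.

    bin_logits:    (N, num_bins) scores over uniformly spaced depth bins
    residual_pred: (N, num_bins) predicted residual (meters) for each bin
    depth_gt:      (N,) ground-truth depth values, assumed inside [d_min, d_max]
    """
    bin_width = (d_max - d_min) / num_bins
    gt_bin = ((depth_gt - d_min) / bin_width).clamp(0, num_bins - 1).long()
    bin_centers = d_min + (gt_bin.float() + 0.5) * bin_width
    gt_residual = depth_gt - bin_centers

    cls_loss = F.cross_entropy(bin_logits, gt_bin)
    # Supervise only the residual predicted for the ground-truth bin.
    pred_residual = residual_pred.gather(1, gt_bin.unsqueeze(1)).squeeze(1)
    reg_loss = F.smooth_l1_loss(pred_residual, gt_residual)
    return cls_loss + reg_loss
```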
Eigenpose: Occlusion-Robust 3D Human Mesh Reconstruction
Mi-Gyeong Gwon; Gi-Mun Um; Won-Sik Cheong; Wonjun Kim
DOI: 10.1109/TIP.2025.3559788 | Published: 2025-04-16 | Vol. 34, pp. 2379-2391
Abstract: A new approach for occlusion-robust 3D human mesh reconstruction from a single image is introduced in this paper. Since occlusion has emerged as a major problem in this field, there have been meaningful efforts to deal with various types of occlusions (e.g., person-to-person occlusion, person-to-object occlusion, self-occlusion, etc.). Although many recent studies have shown remarkable progress, previous regression-based methods still have limitations in handling occlusion problems due to the lack of appearance information. To address this problem, we propose a novel method for human mesh reconstruction based on pose-relevant subspace analysis. Specifically, we first generate a set of eigenvectors, so-called eigenposes, by conducting the singular value decomposition (SVD) of a pose matrix that contains diverse poses sampled from the training set. These eigenposes are then linearly combined to construct a target body pose according to fusing coefficients, which are learned through the proposed network. Such a global combination of principal body postures (i.e., eigenposes) greatly helps to cope with partial ambiguities caused by occlusions. Furthermore, we also propose a joint injection module that efficiently incorporates the spatial information of visible joints into the encoded feature during the estimation of the fusing coefficients. Experimental results on benchmark datasets demonstrate the ability of the proposed method to robustly reconstruct the human mesh under various occlusions occurring in real-world scenarios. The code and model are publicly available at https://github.com/DCVL-3D/Eigenpose_release
Citations: 0
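A minimal sketch of the eigenpose idea: take the SVD of a matrix of training poses to obtain a basis of principal pose directions, then express a target pose as a coefficient-weighted combination of that basis. Whether the paper centers the pose matrix, how many eigenposes it keeps, and the 72-dimensional axis-angle layout used below are assumptions for illustration; in the paper the coefficients are regressed by a network from the input image rather than sampled.

```python
import numpy as np

def build_eigenposes(pose_matrix, k=32):
    """Compute pose basis vectors ('eigenposes') from a matrix of training poses.

    pose_matrix: (N, P) array, each row a flattened body pose (e.g. joint rotations).
    Returns the mean pose and the k leading right-singular vectors.
    """
    mean_pose = pose_matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(pose_matrix - mean_pose, full_matrices=False)
    return mean_pose, vt[:k]            # shapes (P,), (k, P)

def compose_pose(mean_pose, eigenposes, coeffs):
    """Reconstruct a pose as the mean plus a coefficient-weighted sum of eigenposes."""
    return mean_pose + coeffs @ eigenposes

# Example with random stand-in data (500 poses, 24 joints x 3 axis-angle values)
rng = np.random.default_rng(0)
poses = rng.standard_normal((500, 72))
mean_pose, basis = build_eigenposes(poses, k=16)
new_pose = compose_pose(mean_pose, basis, rng.standard_normal(16))
```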
MuseumMaker: Continual Style Customization Without Catastrophic Forgetting
Chenxi Liu; Gan Sun; Wenqi Liang; Jiahua Dong; Can Qin; Yang Cong
DOI: 10.1109/TIP.2025.3553024 | Published: 2025-04-16 | Vol. 34, pp. 2499-2512
Abstract: Pre-trained large text-to-image (T2I) models with an appropriate text prompt have attracted growing interest in customized image generation. However, the catastrophic forgetting issue makes it hard to continually synthesize new user-provided styles while retaining satisfying results for previously learned styles. In this paper, we propose MuseumMaker, a method that enables the synthesis of images following a set of customized styles in a never-ending manner, gradually accumulating these creative artistic works as a museum. When facing a new customization style, we develop a style distillation loss module to extract and learn the style of the training data for the new image generation task. It minimizes the learning biases caused by the content of new training images and addresses the catastrophic overfitting issue induced by few-shot images. To deal with the catastrophic forgetting issue among past learned styles, we devise a dual regularization for a shared-LoRA module to optimize the direction of the model update, regularizing the diffusion model from both the weight and feature aspects. Meanwhile, to further preserve historical knowledge from past styles and address the limited representability of LoRA, we design a task-wise token learning module in which a unique token embedding is learned to denote each new style. As new user-provided styles arrive, our MuseumMaker can capture the nuances of the new styles while maintaining the details of learned styles. Experimental results on diverse style datasets validate the effectiveness of the proposed MuseumMaker method, showcasing its robustness and versatility across various scenarios.
Citations: 0
Accurate and Robust Three-Intersection-Chord-Invariant Ellipse Detection
Guan Xu; Yunkun Wang; Fang Chen; Hui Shen; Xiaotao Li
DOI: 10.1109/TIP.2025.3559409 | Published: 2025-04-15 | Vol. 34, pp. 2392-2407
Abstract: Ellipse detection is of great significance in the fields of image processing and computer vision. Accurate, stable, and direct ellipse detection in real-world images has always been a key issue. Therefore, an ellipse detection method is proposed on the basis of a constructed three-intersection-chord invariant. First, for inflexion point detection, a PCA minimum bounding box that considers the distribution characteristics of edge points is studied to achieve more refined line-segment screening. Second, a multi-scale inflexion point detection method is proposed to effectively avoid over-segmentation of small arc segments, ensuring more reasonable and reliable arc segment combinations. Then, the 20 precisely classified arc segment combinations are refined into 4 combinations, and many non-homologous arc segment combinations can be quickly removed by the constructed midpoint distance constraint and quadrant constraint, reducing incorrect combinations. Moreover, to accurately reflect the strict geometric constraints that ellipses impose on arc segment combinations, a three-intersection-chord-invariant model of ellipses is established with a strong constraint on the relative distances among five constraint points, from which a more robust initial ellipse set of homologous arc segment combinations is obtained. Finally, ellipse validation and clustering are performed on the initial set of ellipses to obtain high-precision ellipses. The accuracy of the proposed ellipse detection method is experimentally validated on 6 publicly available datasets and 2 established wheel rim datasets.
Citations: 0
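A rough sketch of a PCA-aligned (oriented) bounding box for a set of 2D edge points, in the spirit of the "PCA minimum bounding box" mentioned above; the paper's exact line-segment screening criteria built on top of this box are not reproduced here, and the function name is illustrative.

```python
import numpy as np

def pca_bounding_box(points):
    """Oriented (PCA-aligned) bounding box of a set of 2D edge points.

    points: (N, 2) array. Returns the four box corners in image coordinates.
    """
    mean = points.mean(axis=0)
    centered = points - mean
    # Principal axes from the covariance of the point cloud.
    _, vecs = np.linalg.eigh(np.cov(centered.T))
    local = centered @ vecs                    # coordinates in the PCA frame
    mins, maxs = local.min(axis=0), local.max(axis=0)
    corners_local = np.array([[mins[0], mins[1]], [mins[0], maxs[1]],
                              [maxs[0], maxs[1]], [maxs[0], mins[1]]])
    return corners_local @ vecs.T + mean       # rotate back to image coordinates
```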