International Journal of Computer Vision: Latest Publications

DI-Retinex: Digital-Imaging Retinex Model for Low-Light Image Enhancement
International Journal of Computer Vision, Pub Date: 2025-09-06, DOI: 10.1007/s11263-025-02542-z
Shangquan Sun, Wenqi Ren, Jingyang Peng, Fenglong Song, Xiaochun Cao
{"title":"DI-Retinex: Digital-Imaging Retinex Model for Low-Light Image Enhancement","authors":"Shangquan Sun, Wenqi Ren, Jingyang Peng, Fenglong Song, Xiaochun Cao","doi":"10.1007/s11263-025-02542-z","DOIUrl":"https://doi.org/10.1007/s11263-025-02542-z","url":null,"abstract":"<p>Many existing methods for low-light image enhancement (LLIE) based on Retinex model ignore important factors that affect the validity of this model in digital imaging, such as noise, quantization error, non-linearity, and dynamic range overflow. In this paper, we propose a new expression called Digital-Imaging Retinex model (DI-Retinex) through theoretical and experimental analysis of Retinex model in digital imaging. Our new expression includes an offset term in the enhancement model, which allows for pixel-wise brightness contrast adjustment with a non-linear mapping function. In addition, to solve the low-light enhancement problem in an unsupervised manner, we propose an image-adaptive masked degradation loss in Gamma space. We also design a variance suppression loss for regulating the additional offset term. Extensive experiments show that our proposed method outperforms all existing unsupervised methods in terms of visual quality, model size, and speed. Our algorithm can also assist downstream face detectors in low-light, as it shows the most performance gain after the low-light enhancement compared to other methods. We have released our code and model weights on https://github.com/sunshangquan/Di-Retinex.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"42 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145007129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language-guided Recursive Spatiotemporal Graph Modeling for Video Summarization
International Journal of Computer Vision, Pub Date: 2025-09-04, DOI: 10.1007/s11263-025-02577-2
Jungin Park, Jiyoung Lee, Kwanghoon Sohn
{"title":"Language-guided Recursive Spatiotemporal Graph Modeling for Video Summarization","authors":"Jungin Park, Jiyoung Lee, Kwanghoon Sohn","doi":"10.1007/s11263-025-02577-2","DOIUrl":"https://doi.org/10.1007/s11263-025-02577-2","url":null,"abstract":"<p>Video summarization aims to select keyframes that are visually diverse and can represent the whole story of a given video. Previous approaches have focused on global interlinkability between frames in a video by temporal modeling. However, fine-grained visual entities, such as objects, are also highly related to the main content of the video. Moreover, language-guided video summarization, which has recently been studied, requires a comprehensive linguistic understanding of complex real-world videos. To consider how all the objects are semantically related to each other, this paper regards video summarization as a language-guided spatiotemporal graph modeling problem. We present recursive spatiotemporal graph networks, called <i>VideoGraph</i>, which formulate the objects and frames as nodes of the spatial and temporal graphs, respectively. The nodes in each graph are connected and aggregated with graph edges, representing the semantic relationships between the nodes. To prevent the edges from being configured with visual similarity, we incorporate language queries derived from the video into the graph node representations, enabling them to contain semantic knowledge. In addition, we adopt a recursive strategy to refine initial graphs and correctly classify each frame node as a keyframe. In our experiments, VideoGraph achieves state-of-the-art performance on several benchmarks for generic and query-focused video summarization in both supervised and unsupervised manners. The code is available at https://github.com/park-jungin/videograph.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"8 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144995790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parameterized Low-Rank Regularizer for High-dimensional Visual Data
International Journal of Computer Vision, Pub Date: 2025-09-04, DOI: 10.1007/s11263-025-02569-2
Shuang Xu, Zixiang Zhao, Xiangyong Cao, Jiangjun Peng, Xi-Le Zhao, Deyu Meng, Yulun Zhang, Radu Timofte, Luc Van Gool
{"title":"Parameterized Low-Rank Regularizer for High-dimensional Visual Data","authors":"Shuang Xu, Zixiang Zhao, Xiangyong Cao, Jiangjun Peng, Xi-Le Zhao, Deyu Meng, Yulun Zhang, Radu Timofte, Luc Van Gool","doi":"10.1007/s11263-025-02569-2","DOIUrl":"https://doi.org/10.1007/s11263-025-02569-2","url":null,"abstract":"<p>Factorization models and nuclear norms, two prominent methods for characterizing the low-rank prior, encounter challenges in accurately retrieving low-rank data under severe degradation and lack generalization capabilities. To mitigate these limitations, we propose a Parameterized Low-Rank Regularizer (PLRR), which models low-rank visual data through matrix factorization by utilizing neural networks to parameterize the factor matrices, whose feasible domains are essentially constrained. This approach can be interpreted as imposing an automatically learned penalty on factor matrices. More significantly, the knowledge encoded in network parameters enhances generalization. As a versatile low-rank modeling tool, PLRR exhibits superior performance in various inverse problems, including video foreground extraction, hyperspectral image (HSI) denoising, HSI inpainting, multi-temporal multispectral image (MSI) decloud, and MSI guided blind HSI super-resolution. More significantly, PLRR demonstrates robust generalization capabilities for images with diverse degradations, temporal variations, and scene contexts.\u0000</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"22 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144995789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EdgeSAM: Prompt-In-the-Loop Distillation for SAM
International Journal of Computer Vision, Pub Date: 2025-09-02, DOI: 10.1007/s11263-025-02562-9
Chong Zhou, Xiangtai Li, Chen Change Loy, Bo Dai
{"title":"EdgeSAM: Prompt-In-the-Loop Distillation for SAM","authors":"Chong Zhou, Xiangtai Li, Chen Change Loy, Bo Dai","doi":"10.1007/s11263-025-02562-9","DOIUrl":"https://doi.org/10.1007/s11263-025-02562-9","url":null,"abstract":"<p>This paper presents EdgeSAM, an accelerated variant of the Segment Anything Model (SAM), optimized for efficient execution on edge devices with minimal compromise in performance. Our approach involves distilling the original ViT-based SAM image encoder into a purely CNN-based architecture, better suited for edge devices. We carefully benchmark various distillation strategies and demonstrate that task-agnostic encoder distillation fails to capture the full knowledge embodied in SAM.To overcome this bottleneck, we include both the prompt encoder and mask decoder in the distillation process, with box and point prompts in the loop, so that the distilled model can accurately capture the intricate dynamics between user input and mask generation. To mitigate dataset bias issues stemming from point prompt distillation, we incorporate a lightweight module within the encoder.As a result, EdgeSAM achieves a 37-fold speed increase compared to the original SAM, and it also outperforms MobileSAM/EfficientSAM, being over 7 times as fast when deployed on edge devices while enhancing the mIoUs on COCO and LVIS by 2.3/1.5 and 3.1/1.6, respectively. It is also the first SAM variant that can run at over 30 FPS on an iPhone 14. Code and demo are available here https://mmlab-ntu.github.io/project/edgesam/.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"28 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144930142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Curvature Learning for Generalization of Hyperbolic Neural Networks
International Journal of Computer Vision, Pub Date: 2025-09-01, DOI: 10.1007/s11263-025-02567-4
Xiaomeng Fan, Yuwei Wu, Zhi Gao, Mehrtash Harandi, Yunde Jia
{"title":"Curvature Learning for Generalization of Hyperbolic Neural Networks","authors":"Xiaomeng Fan, Yuwei Wu, Zhi Gao, Mehrtash Harandi, Yunde Jia","doi":"10.1007/s11263-025-02567-4","DOIUrl":"https://doi.org/10.1007/s11263-025-02567-4","url":null,"abstract":"<p>Hyperbolic neural networks (HNNs) have demonstrated notable efficacy in representing real-world data with hierarchical structures via exploiting the geometric properties of hyperbolic spaces characterized by negative curvatures. Curvature plays a crucial role in optimizing HNNs. Inappropriate curvatures may cause HNNs to converge to suboptimal parameters, degrading overall performance. So far, the theoretical foundation of the effect of curvatures on HNNs has not been developed. In this paper, we derive a PAC-Bayesian generalization bound of HNNs, highlighting the role of curvatures in the generalization of HNNs via their effect on the smoothness of the loss landscape. Driven by the derived bound, we propose a sharpness-aware curvature learning method to smooth the loss landscape, thereby improving the generalization of HNNs. In our method, we design a scope sharpness measure for curvatures, which is minimized through a bi-level optimization process. Then, we introduce an implicit differentiation algorithm that efficiently solves the bi-level optimization by approximating gradients of curvatures. We present the approximation error and convergence analyses of the proposed method, showing that the approximation error is upper-bounded, and the proposed method can converge by bounding gradients of HNNs. Experiments on four settings: classification, learning from long-tailed data, learning from noisy data, and few-shot learning show that our method can improve the performance of HNNs.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"16 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144930140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predictive Display for Teleoperation Based on Vector Fields Using Lidar-Camera Fusion
International Journal of Computer Vision, Pub Date: 2025-08-31, DOI: 10.1007/s11263-025-02550-z
Gaurav Sharma, Jeff Calder, Rajesh Rajamani
{"title":"Predictive Display for Teleoperation Based on Vector Fields Using Lidar-Camera Fusion","authors":"Gaurav Sharma, Jeff Calder, Rajesh Rajamani","doi":"10.1007/s11263-025-02550-z","DOIUrl":"https://doi.org/10.1007/s11263-025-02550-z","url":null,"abstract":"<p>Teleoperation can enable human intervention to help handle instances of failure in autonomy thus allowing for much safer deployment of autonomous vehicle technology. Successful teleoperation requires recreating the environment around the remote vehicle using camera data received over wireless communication channels. This paper develops a new predictive display system to tackle the significant time delays encountered in receiving camera data over wireless networks. First, a new high gain observer is developed for estimating the position and orientation of the ego vehicle. The novel observer is shown to perform accurate state estimation using only GNSS and gyroscope sensor readings. A vector field method which fuses the delayed camera and Lidar data is then presented. This method uses sparse 3D points obtained from Lidar and transforms them using the state estimates from the high gain observer to generate a sparse vector field for the camera image. Polynomial based interpolation is then performed to obtain the vector field for the complete image which is then remapped to synthesize images for accurate predictive display. The method is evaluated on real-world experimental data from the nuScenes and KITTI datasets. The performance of the high gain observer is also evaluated and compared with that of the EKF. The synthesized images using the vector field based predictive display are compared with ground truth images using various image metrics and offer vastly improved performance compared to delayed images.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"162 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144930139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Instruction Tuning towards General-Purpose Multimodal Large Language Model: A Survey
International Journal of Computer Vision, Pub Date: 2025-08-30, DOI: 10.1007/s11263-025-02572-7
Jiaxing Huang, Jingyi Zhang, Kai Jiang, Han Qiu, Xiaoqin Zhang, Ling Shao, Shijian Lu, Dacheng Tao
{"title":"Visual Instruction Tuning towards General-Purpose Multimodal Large Language Model: A Survey","authors":"Jiaxing Huang, Jingyi Zhang, Kai Jiang, Han Qiu, Xiaoqin Zhang, Ling Shao, Shijian Lu, Dacheng Tao","doi":"10.1007/s11263-025-02572-7","DOIUrl":"https://doi.org/10.1007/s11263-025-02572-7","url":null,"abstract":"<p>Traditional computer vision generally solves each single task independently by a specialist model with the task instruction implicitly considered and designed in the model architecture. This simply leads to two constraints in: (1) task-specific models where each model is trained for one specific task, hindering its scalability and synergy across diverse tasks; (2) pre-defined and fixed model interfaces that have limited interactivity and adaptability in following user’s task instructions. Visual Instruction Tuning (VIT), which learns from a wide range of vision tasks as described by natural language instructions, has recently been intensively studied to mitigate the constraints of specialist models. It fine-tunes a large vision model with natural language as general task instructions, aiming for a general-purpose multimodal large language model (MLLM) that can follow various language instructions and potentially solve various user-specified vision tasks. This work aims to provide a systematic and comprehensive review of visual instruction tuning that covers six key aspects including: (1) the background of vision task paradigm and its development towards VIT; (2) the foundations of VIT including commonly used network architectures, visual instruction tuning frameworks and objectives, as well as evaluation setups and tasks; (3) widely adopted benchmarks in visual instruction tuning and evaluations; (4) a thorough review of existing VIT techniques as categorized by both vision tasks and method designs, highlighting their major contributions, strengths, as well as constraints; (5) comparison and discussion of VIT methods over various instruction-following benchmarks; (6) challenges, possible research directions and research topics in the future visual instruction tuning study. A project associated with this work has been created at [link].</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"24 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144930138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SMPL-IKS: A Mixed Analytical-Neural Inverse Kinematics Solver for 3D Human Mesh Recovery
International Journal of Computer Vision, Pub Date: 2025-08-30, DOI: 10.1007/s11263-025-02574-5
Zijian Zhang, Muqing Wu, Honghao Qi, Tianyi Ma, Min Zhao
{"title":"SMPL-IKS: A Mixed Analytical-Neural Inverse Kinematics Solver for 3D Human Mesh Recovery","authors":"Zijian Zhang, Muqing Wu, Honghao Qi, Tianyi Ma, Min Zhao","doi":"10.1007/s11263-025-02574-5","DOIUrl":"https://doi.org/10.1007/s11263-025-02574-5","url":null,"abstract":"<p>We present SMPL-IKS, a mixed analytical-neural inverse kinematics solver that operates on the well-known Skinned Multi-Person Linear model (SMPL) to recover human mesh from 3D skeleton. The key challenges in the task are threefold: (1) Shape Mismatching, (2) Error Accumulation, and (3) Rotation Ambiguity. Unlike previous methods that rely on costly vertex up-sampling or iterative optimization, SMPL-IKS directly regresses the SMPL parameters (<i>i.e.</i>, shape and pose parameters) in a clean and efficient way. Specifically, we propose to infer <i>skeleton-to-mesh</i> via three explicit mappings viz. <i>Shape Inverse (SI)</i>, <i>Inverse kinematics (IK)</i>, and <i>Pose Refinement (PR)</i>. SI maps bone length to shape parameters, IK maps bone direction to pose parameters, and PR addresses errors accumulated along the kinematic tree. SMPL-IKS is general and thus extensible to MANO or SMPL-H models. Extensive experiments are conducted on various benchmarks for body-only, hand-only, and body-hand scenarios. Our model surpasses state-of-the-art methods by a large margin while being much more efficient. Data and code are available at https://github.com/Z-Z-J/SMPL-IKS.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"31 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144930259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Flexible Camera Calibration using a Collimator System
International Journal of Computer Vision, Pub Date: 2025-08-29, DOI: 10.1007/s11263-025-02576-3
Shunkun Liang, Banglei Guan, Zhenbao Yu, Dongcai Tan, Pengju Sun, Zibin Liu, Qifeng Yu, Yang Shang
{"title":"Flexible Camera Calibration using a Collimator System","authors":"Shunkun Liang, Banglei Guan, Zhenbao Yu, Dongcai Tan, Pengju Sun, Zibin Liu, Qifeng Yu, Yang Shang","doi":"10.1007/s11263-025-02576-3","DOIUrl":"https://doi.org/10.1007/s11263-025-02576-3","url":null,"abstract":"<p>Camera calibration is a crucial step in photogrammetry and 3D vision applications. This paper introduces a novel camera calibration method using a designed collimator system. Our collimator system provides a reliable and controllable calibration environment for the camera. Exploiting the unique optical geometry property of our collimator system, we introduce an angle invariance constraint and further prove that the relative motion between the calibration target and camera conforms to a spherical motion model. This constraint reduces the original 6DOF relative motion between target and camera to a 3DOF pure rotation motion. Using spherical motion constraint, a closed-form linear solver for multiple images and a minimal solver for two images are proposed for camera calibration. Furthermore, we propose a single collimator image calibration algorithm based on the angle invariance constraint. This algorithm eliminates the requirement for camera motion, providing a novel solution for flexible and fast calibration. The performance of our method is evaluated in both synthetic and real-world experiments, which verify the feasibility of calibration using the collimator system and demonstrate that our method is superior to existing baseline methods. Demo code is available at https://github.com/LiangSK98/CollimatorCalibration.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"32 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144930137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Depth from Coupled Optical Differentiation
International Journal of Computer Vision, Pub Date: 2025-08-29, DOI: 10.1007/s11263-025-02534-z
Junjie Luo, Yuxuan Liu, Emma Alexander, Qi Guo
{"title":"Depth from Coupled Optical Differentiation","authors":"Junjie Luo, Yuxuan Liu, Emma Alexander, Qi Guo","doi":"10.1007/s11263-025-02534-z","DOIUrl":"https://doi.org/10.1007/s11263-025-02534-z","url":null,"abstract":"<p>We propose depth from coupled optical differentiation, a low-computation passive-lighting 3D sensing mechanism. It is based on our discovery that per-pixel object distance can be rigorously determined by a coupled pair of optical derivatives of a defocused image using a simple, closed-form relationship. Unlike previous depth-from-defocus (DfD) methods that leverage higher-order spatial derivatives of the image to estimate scene depths, the proposed mechanism’s use of only first-order optical derivatives makes it significantly more robust to noise. Furthermore, unlike many previous DfD algorithms with requirements on aperture code, this relationship is proved to be universal to a broad range of aperture codes. We build the first 3D sensor based on depth from coupled optical differentiation. Its optical assembly includes a deformable lens and a motorized iris, which enables dynamic adjustments to the optical power and aperture radius. The sensor captures two pairs of images: one pair with a differential change of optical power and the other with a differential change of aperture scale. From the four images, a depth and confidence map can be generated with only 36 floating point operations per output pixel (FLOPOP), more than ten times lower than the previous lowest passive-lighting depth sensing solution to our knowledge. Additionally, the depth map generated by the proposed sensor demonstrates more than twice the working range of previous DfD methods while using significantly lower computation.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"116 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2025-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144930210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}