Machine Vision and Applications: Latest Articles

Fourier feature network for 3D vessel reconstruction from biplane angiograms
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-08-01 · DOI: 10.1007/s00138-024-01585-5
Sean Wu, Naoki Kaneko, David S. Liebeskind, Fabien Scalzo

Abstract: 3D reconstruction of biplane cerebral angiograms remains a challenging, unsolved research problem due to the loss of depth information and the unknown pixelwise correlation between the input images. The occlusions arising from only two views complicate the reconstruction of fine vessel details and the simultaneous handling of inherent missing information. In this paper, we take an incremental step toward solving this problem by reconstructing the corresponding 2D slice of the cerebral angiogram from biplane 1D image data. We developed a coordinate-based neural network that encodes the 1D image data along with a deterministic Fourier feature mapping of a given input point, resulting in a slice reconstruction that is more spatially accurate. Using only one 1D row of biplane image data, our Fourier feature network reconstructed the corresponding volume slices with a peak signal-to-noise ratio (PSNR) of 26.32 ± 0.36, a structural similarity index measure (SSIM) of 61.38 ± 1.79, a mean squared error (MSE) of 0.0023 ± 0.0002, and a mean absolute error (MAE) of 0.0364 ± 0.0029. Our research has implications for future work aimed at improving backprojection-based reconstruction by first examining individual slices from 1D information as a prerequisite.

Citations: 0
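The abstract does not spell out its deterministic Fourier feature mapping, but the standard formulation (as popularized by Tancik et al.) maps a coordinate v to [cos(2πBv), sin(2πBv)] for a frequency matrix B. A minimal NumPy sketch, with the axis-aligned power-of-two frequency matrix chosen purely for illustration, not taken from the paper:

```python
import numpy as np

def fourier_feature_mapping(v, B):
    """Map coordinates v of shape (n, d) to Fourier features of shape (n, 2m).

    B is an (m, d) frequency matrix; a *deterministic* mapping fixes B
    (e.g. a grid of frequencies) instead of sampling it at random.
    """
    proj = 2.0 * np.pi * v @ B.T                              # (n, m)
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

# Illustrative deterministic B: frequencies 1, 2, 4, 8 along each input axis.
d = 2
freqs = 2.0 ** np.arange(4)
B = np.concatenate([f * np.eye(d) for f in freqs])            # (8, 2)
```

Feeding such features (rather than raw coordinates) into a coordinate-based network lets it fit higher-frequency detail, which is presumably what makes the slice reconstruction "more spatially accurate".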
Semi-supervised metric learning incorporating weighted triplet constraint and Riemannian manifold optimization for classification
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-26 · DOI: 10.1007/s00138-024-01581-9
Yizhe Xia, Hongjuan Zhang

Abstract: Metric learning focuses on finding similarities between data and aims to enlarge the distance between samples with different labels. This work proposes a semi-supervised metric learning method based on the point-to-class structure of the labeled data, which is computationally less expensive than using a point-to-point structure. Specifically, the point-to-class structure is formulated as a new triplet constraint, which narrows the distance between inner-class data and enlarges the distance between inter-class data simultaneously. Moreover, to measure dissimilarity between different classes, weights are introduced into the triplet constraint, forming the weighted triplet constraint. Two kinds of regularizers, such as a spatial regularizer, are then incorporated into the model to mitigate overfitting and preserve the topological structure of the data. Furthermore, a Riemannian gradient descent algorithm is adopted to solve the proposed model, since it fully exploits the geometric structure of Riemannian manifolds, and the proposed model can be regarded as a generalization, on a Riemannian manifold, of the unconstrained optimization problem in Euclidean space. With this solution strategy, the variables are constrained to a specific Riemannian manifold in each step of the iterative solution process, enabling efficient and accurate model resolution. Finally, we conduct classification experiments on various data sets and compare the classification performance to state-of-the-art methods. The experimental results demonstrate that our proposed method performs better in classification, especially for hyperspectral image data.

Citations: 0
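The paper's exact constraint is not given in the abstract; a generic point-to-class triplet term illustrates the idea: compare a sample against its own class centroid and another class's centroid under a Mahalanobis metric M, with a weight on the inter-class term. All names and the hinge form below are assumptions for illustration:

```python
import numpy as np

def weighted_point_to_class_triplet(x, mu_same, mu_diff, M, w=1.0, margin=1.0):
    """Hinge triplet term comparing sample x to class centroids under a
    Mahalanobis metric M (symmetric PSD). w weights the inter-class term;
    the loss is zero once the weighted inter-class distance exceeds the
    intra-class distance by the margin."""
    d_pos = (x - mu_same) @ M @ (x - mu_same)   # distance to own class
    d_neg = (x - mu_diff) @ M @ (x - mu_diff)   # distance to other class
    return max(0.0, margin + d_pos - w * d_neg)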
Weakly supervised collaborative localization learning method for sewer pipe defect detection
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-25 · DOI: 10.1007/s00138-024-01587-3
Yang Yang, Shangqin Yang, Qi Zhao, Honghui Cao, Xinjie Peng

Abstract: Long-term corrosion and external disturbances can lead to defects in sewer pipes, which threaten important parts of urban infrastructure. Automatic defect detection based on closed-circuit television (CCTV) has gradually matured with supervised deep learning. However, sewer pipe defects vary in type and size, and relying on human inspection to detect them is time-consuming and subjective. A few-shot, accurate, and automatic method for sewer pipe defect detection with localization and fine-grained classification is therefore needed. This study constructs a few-shot image-level dataset of 15 categories from the sewer dataset ML-Sewer and presents a collaborative localization network based on weakly supervised learning to automatically classify and detect defects. Specifically, an attention refinement module (ARM) is designed to obtain classification results and high-level semantic features. Furthermore, considering the correlation between target regions and the extraction of target edge information, we designed a collaborative localization module (CLM) consisting of two branches. To ensure that the network focuses on the complete target area, the study applies an image iteration module (IIM). Finally, the results of the two branches in the CLM are fused to acquire the target localization. The experimental results show that the proposed model performs favorably in detecting sewer pipe defects. It reaches a classification accuracy of 69.76% and a localization accuracy of 65.32%, higher than other weakly supervised detection models in sewer pipe defect detection.

Citations: 0
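The ARM/CLM/IIM modules are specific to this paper, but the core weakly supervised step they build on, deriving a location from an image-level label, is commonly done with a class activation map: weight the final convolutional feature maps by the classifier weights of the target class, then threshold. A generic sketch (not the authors' architecture):

```python
import numpy as np

def class_activation_map(features, class_weights):
    """features: (C, H, W) conv feature maps; class_weights: (C,) weights of
    the target class in the final linear layer. Returns an (H, W) activation
    map normalized to [0, 1]."""
    cam = np.tensordot(class_weights, features, axes=1)   # weighted sum -> (H, W)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

def localize(cam, thresh=0.5):
    """Bounding box (row_min, row_max, col_min, col_max) of activations
    at or above thresh."""
    rows, cols = np.where(cam >= thresh)
    return rows.min(), rows.max(), cols.min(), cols.max()
```

Fusing two such branches, as the CLM does, is one way to recover both the salient interior and the edges of a defect that a single map tends to miss.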
An insect vision-inspired neuromorphic vision systems in low-light obstacle avoidance for intelligent vehicles
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-25 · DOI: 10.1007/s00138-024-01582-8
Haiyang Wang, Songwei Wang, Longlong Qian

Abstract: The Lobular Giant Motion Detector (LGMD) is a neuron in the insect visual system that has been extensively studied, especially in locusts. This neuron is highly sensitive to rapidly approaching objects, allowing insects to react quickly to avoid potential threats such as approaching predators or obstacles. In the realm of intelligent vehicles, conventional RGB cameras lack performance in extreme light conditions or during high-speed movement. Inspired by biological mechanisms, we have developed a novel neuromorphic dynamic vision sensor (DVS)-driven LGMD spiking neural network (SNN) model. SNNs, distinguished by their bio-inspired spiking dynamics, offer a unique advantage in processing time-varying visual data, particularly in scenarios where rapid response and energy efficiency are paramount. Our model incorporates two distinct types of Leaky Integrate-and-Fire (LIF) neuron models and synapse models, which have been instrumental in reducing network latency and enhancing the system's reaction speed. To address the challenge of noise in event streams, we implemented denoising techniques to ensure the integrity of the input data. Integrating the proposed methods, the model was ultimately deployed on an intelligent vehicle to conduct real-time obstacle avoidance tests on looming objects in simulated real scenarios. The experimental results show that the model compensates for the limitations of traditional RGB cameras in detecting looming targets in the dark, and can detect looming targets and perform effective obstacle avoidance in complex and diverse dark environments.

Citations: 0
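The paper's two LIF variants are not specified in the abstract, but the basic leaky integrate-and-fire dynamics they extend are standard: the membrane potential leaks toward rest, integrates input current, and emits a spike plus a reset when it crosses threshold. A minimal discrete-time sketch (parameter values are illustrative, not from the paper):

```python
import numpy as np

def lif_simulate(input_current, tau=10.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron.

    Euler-discretized dynamics: dv/dt = (-v + I) / tau; when v crosses
    v_thresh the neuron spikes and v resets. Returns (spike train, v trace).
    """
    v, spikes, trace = v_reset, [], []
    for i_t in input_current:
        v += dt * (-v + i_t) / tau     # leak toward 0, integrate input
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
        trace.append(v)
    return np.array(spikes), np.array(trace)
```

In an LGMD-style network, the DVS event stream drives such neurons, and a rising output spike rate signals a looming object.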
React: recognize every action everywhere all at once
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-20 · DOI: 10.1007/s00138-024-01561-z
Naga V. S. Raviteja Chappa, Pha Nguyen, Page Daniel Dobbs, Khoa Luu

Abstract: In the realm of computer vision, Group Activity Recognition (GAR) plays a vital role, finding applications in sports video analysis, surveillance, and social scene understanding. This paper introduces Recognize Every Action Everywhere All At Once (REACT), a novel architecture designed to model complex contextual relationships within videos. REACT leverages advanced transformer-based models for encoding intricate contextual relationships, enhancing understanding of group dynamics. Integrated vision-language encoding facilitates efficient capture of spatiotemporal interactions and multi-modal information, enabling comprehensive scene understanding. The model's precise action localization refines the joint understanding of text and video data, enabling precise bounding-box retrieval and enhancing semantic links between textual descriptions and visual reality. Actor-specific fusion strikes a balance between actor-specific details and contextual information, improving model specificity and robustness in recognizing group activities. Experimental results demonstrate REACT's superiority over state-of-the-art GAR approaches, achieving higher accuracy in recognizing and understanding group activities across diverse datasets. This work significantly advances group activity recognition, offering a robust framework for nuanced scene comprehension.

Citations: 0
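The transformer-based encoding the abstract refers to rests on scaled dot-product attention, which mixes information across actors and time steps according to learned similarity. A self-contained single-head sketch of that primitive (REACT's actual layers are not described in the abstract):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V, with a numerically
    stable row-wise softmax. Q: (n, d), K: (m, d), V: (m, dv)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # (n, m) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)            # rows sum to 1
    return w @ V                                  # convex mix of value rows
```

Each output row is a weighted average of the value vectors, so a query token (e.g. one actor at one frame) aggregates context from every other token in the sequence.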
Virtual home staging and relighting from a single panorama under natural illumination
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-11 · DOI: 10.1007/s00138-024-01559-7
Guanzhou Ji, Azadeh O. Sawyer, Srinivasa G. Narasimhan

Abstract: Virtual staging techniques can digitally showcase a variety of real-world scenes. However, relighting indoor scenes from a single image is challenging due to unknown scene geometry, material properties, and outdoor spatially-varying lighting. In this study, we use the High Dynamic Range (HDR) technique to capture an indoor panorama and its paired outdoor hemispherical photograph, and we develop a novel inverse rendering approach for scene relighting and editing. Our method consists of four key components: (1) panoramic furniture detection and removal, (2) automatic floor layout design, (3) global rendering with the scene geometry, new furniture objects, and the real-time outdoor photograph, and (4) virtual staging with a new camera position, outdoor illumination, scene texture, and electric lighting. The results demonstrate that a single indoor panorama can be used to generate high-quality virtual scenes under new environmental conditions. Additionally, we contribute a new calibrated HDR (Cali-HDR) dataset that consists of 137 paired indoor and outdoor photographs. An animation of the virtual rendered scenes is available online.

Citations: 0
Evaluation of data augmentation techniques on subjective tasks
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-11 · DOI: 10.1007/s00138-024-01574-8
Luis Gonzalez-Naharro, M. Julia Flores, Jesus Martínez-Gómez, Jose M. Puerta

Abstract: Data augmentation is widely applied in various computer vision problems to artificially increase the size of a dataset by transforming the original data. These techniques are employed on small datasets to prevent overfitting, and also in problems where labelling is difficult. Nevertheless, data augmentation assumes that transformations preserve ground-truth labels, which is not true for subjective problems such as aesthetic quality assessment, in which image transformations can alter the aesthetic-quality ground truth. In this work, we study how data augmentation affects subjective problems. We train a series of models, varying the probability of augmenting images and the intensity of such augmentations. We train models on AVA for quality prediction, on Photozilla for photo style prediction, and on subjective and objective labels of CelebA. Results show that subjective tasks obtain worse results than objective tasks with traditional augmentation techniques, and this worsening depends on the specific type of subjectivity.

Citations: 0
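The two knobs the study varies, augmentation probability and intensity, can be made concrete with a minimal pipeline; the specific transforms below (horizontal flip and brightness jitter) are common examples, not necessarily the ones the authors evaluated:

```python
import numpy as np

def augment(image, rng, p=0.5, intensity=0.2):
    """Apply each augmentation independently with probability p.

    `intensity` scales the brightness jitter. image: float array in [0, 1].
    A flip preserves objective labels (e.g. face attributes) but may alter a
    subjective one (e.g. aesthetic score tied to composition).
    """
    out = image.copy()
    if rng.random() < p:
        out = out[:, ::-1]                                        # horizontal flip
    if rng.random() < p:
        out = np.clip(out + rng.uniform(-intensity, intensity), 0.0, 1.0)
    return out
```

Sweeping `p` and `intensity` over a grid and retraining, as the paper describes, isolates how much of the degradation on subjective labels is due to label-destroying transforms.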
Continual learning approaches to hand–eye calibration in robots
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-10 · DOI: 10.1007/s00138-024-01572-w
Ozan Bahadir, Jan Paul Siebert, Gerardo Aragon-Camarasa

Abstract: This study addresses the problem of hand–eye calibration in robotic systems by developing Continual Learning (CL)-based approaches. Traditionally, robots require explicit models to transfer knowledge from camera observations to their hands or base. However, this poses limitations, as the hand–eye calibration parameters are typically valid only for the current camera configuration. We therefore propose a flexible and autonomous hand–eye calibration system that can adapt to changes in camera pose over time. Three CL-based approaches are introduced: the naive CL approach, the reservoir rehearsal approach, and a hybrid approach combining reservoir sampling with new-data evaluation. The naive CL approach suffers from catastrophic forgetting, while the reservoir rehearsal approach mitigates this issue by sampling uniformly from past data. The hybrid approach further enhances performance by incorporating reservoir sampling and assessing new data for novelty. Experiments conducted in simulated and real-world environments demonstrate that the CL-based approaches, except for the naive approach, achieve competitive performance compared to traditional batch learning-based methods. This suggests that treating hand–eye calibration as a time-sequence problem enables the extension of the learned space without complete retraining. The adaptability of the CL-based approaches facilitates accommodating changes in camera pose, leading to an improved hand–eye calibration system.

Citations: 0
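The reservoir rehearsal approach builds on classic reservoir sampling (Vitter's Algorithm R), which maintains a uniform sample of a stream without storing it all, exactly what "sampling uniformly from past data" requires. A sketch of the update step (the paper's integration with training is not shown here):

```python
import random

def reservoir_update(reservoir, item, n_seen, k, rng=random):
    """One step of Algorithm R: keep a uniform sample of size k over a stream.

    n_seen is the 1-based index of `item` in the stream. After processing n
    items, every item has probability k/n of being in the reservoir.
    """
    if len(reservoir) < k:
        reservoir.append(item)          # fill phase
    else:
        j = rng.randrange(n_seen)       # uniform in [0, n_seen)
        if j < k:
            reservoir[j] = item         # replace with probability k/n_seen
    return reservoir
```

During continual training, each new calibration sample updates the reservoir, and rehearsal batches mix new data with reservoir contents to counter catastrophic forgetting.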
MDUNet: deep-prior unrolling network with multi-parameter data integration for low-dose computed tomography reconstruction
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-09 · DOI: 10.1007/s00138-024-01568-6
Temitope Emmanuel Komolafe, Nizhuan Wang, Yuchi Tian, Adegbola Oyedotun Adeniji, Liang Zhou

Abstract: The goal of this study is to reconstruct a high-quality computed tomography (CT) image from a low-dose acquisition using an unrolling deep learning-based reconstruction network with less computational complexity and a more generalized model. We propose MDUNet, a multi-parameter deep-prior unrolling network that employs cascaded convolutional and deconvolutional blocks to unroll model-based iterative reconstruction within a finite number of iterations through data-driven training. Furthermore, the embedded data-consistency constraint in MDUNet ensures that the input low-dose images and the low-dose sinograms are consistent, and incorporates the physical imaging geometry. Additionally, multi-parameter training was employed to enhance the model's generalization during the training process. Experimental results on the AAPM Low-Dose CT datasets show that the proposed MDUNet significantly outperforms other state-of-the-art (SOTA) methods quantitatively and qualitatively. The cascaded blocks also reduce the computational complexity with fewer training parameters and generalize well across different datasets. In addition, the proposed MDUNet is validated on 8 different organs of interest, recovering more detailed structures and generating high-quality images. The experimental results demonstrate that the proposed MDUNet yields favorable improvement over competing methods in terms of visual quality, quantitative performance, and computational efficiency. MDUNet improves image quality at reduced computational cost with good generalization, effectively lowering the radiation dose and reducing scanning time, which makes it favorable for future clinical deployment.

Citations: 0
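Unrolling model-based iterative reconstruction generally means alternating a data-consistency gradient step on ||Ax − y||² with a learned prior block. As an illustration only, the sketch below substitutes soft-thresholding for the trained CNN prior and uses a toy system matrix; none of this reflects MDUNet's actual blocks:

```python
import numpy as np

def unrolled_reconstruct(A, y, n_iters=50, step=0.1, shrink=0.01):
    """x_{k+1} = prior(x_k - step * A^T (A x_k - y)).

    A: forward model (e.g. CT projection matrix), y: measurements.
    Soft-thresholding stands in for the learned prior block; in an unrolled
    network each iteration has its own trainable parameters.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        x = x - step * A.T @ (A @ x - y)                        # data consistency
        x = np.sign(x) * np.maximum(np.abs(x) - shrink, 0.0)    # prior (proxy)
    return x
```

Fixing the number of iterations and training the per-iteration blocks end-to-end is what turns this loop into a finite-depth network.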
A framework of specialized knowledge distillation for Siamese tracker on challenging attributes
IF 3.3 · CAS Tier 4, Computer Science
Machine Vision and Applications · Pub Date: 2024-07-09 · DOI: 10.1007/s00138-024-01578-4
Yiding Li, Atsushi Shimada, Tsubasa Minematsu, Cheng Tang

Abstract: In recent years, Siamese network-based trackers have achieved significant improvements in real-time tracking. Despite their success, performance bottlenecks caused by unavoidably complex scenarios in target-tracking tasks are becoming increasingly non-negligible. For example, occlusion and fast motion are factors that can easily cause tracking failures and are labeled in many high-quality tracking databases as challenging attributes. In addition, Siamese trackers tend to suffer from high memory costs, which restricts their applicability to mobile devices with tight memory budgets. To address these issues, we propose a Specialized teachers Distilled Siamese Tracker (SDST) framework to learn a student tracker that is small, fast, and has enhanced performance on challenging attributes. SDST introduces two types of teachers for multi-teacher distillation: a general teacher and specialized teachers. The former imparts basic knowledge to the student; the latter transfer specialized knowledge that helps improve performance on challenging attributes. For the student to efficiently capture critical knowledge from the two types of teachers, SDST is equipped with a carefully designed multi-teacher knowledge distillation model comprising two processes: general teacher-student knowledge transfer and specialized teachers-student knowledge transfer. Extensive empirical evaluations on several popular Siamese trackers demonstrated the generality and effectiveness of our framework. Moreover, results on Large-scale Single Object Tracking (LaSOT) show that the proposed method achieves a significant improvement of more than 2–4% on the most challenging attributes. SDST also maintained high overall performance while achieving compression rates of up to 8x and framerates of 252 FPS, and obtained outstanding accuracy on all challenging attributes.

Citations: 0
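The standard distillation objective underlying such multi-teacher schemes is the KL divergence between temperature-softened teacher and student distributions (Hinton et al.); a weighted sum over several teachers is one natural way to combine a general teacher with specialized ones. The weighting scheme below is an assumption for illustration, not SDST's actual formulation:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over a 1-D logit vector."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

def multi_teacher_loss(student_logits, teacher_logits_list, weights, T=4.0):
    """Weighted combination of per-teacher distillation terms, e.g. one
    general teacher plus specialized teachers for challenging attributes."""
    return sum(w * distillation_loss(student_logits, t, T)
               for w, t in zip(weights, teacher_logits_list))
```

Raising the weight of a specialized teacher on frames labeled with its attribute (occlusion, fast motion, ...) is one plausible way to steer the student's capacity toward those cases.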