Visual Computing for Industry Biomedicine and Art: Latest Articles

Dual modality prompt learning for visual question-grounded answering in robotic surgery
IF 2.8, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-04-22, DOI: 10.1186/s42492-024-00160-z
Yue Zhang, Wanshu Fan, Peixi Peng, Xin Yang, Dongsheng Zhou, Xiaopeng Wei
Abstract: With recent advancements in robotic surgery, notable strides have been made in visual question answering (VQA). Existing VQA systems typically generate textual answers but fail to indicate where the relevant content lies within the image, which restricts the interpretative capacity of VQA models and their ability to explore specific image regions. To address this issue, this study proposes a grounded VQA model for robotic surgery that can localize a specific region during answer prediction. Drawing inspiration from prompt learning in language models, a dual-modality prompt model was developed to enable precise multimodal information interactions. Specifically, two complementary prompters integrate visual and textual prompts into the model's encoding process: a visual complementary prompter merges visual prompt knowledge with visual features to guide accurate localization, while a textual complementary prompter aligns visual information with textual prompt knowledge and textual information, guiding the model toward a more accurate answer. Additionally, a multiple-iterative-fusion strategy was adopted for comprehensive answer reasoning, ensuring high-quality generation of both textual and grounded answers. Experimental results validate the effectiveness of the model, demonstrating its superiority over existing methods on the EndoVis-18 and EndoVis-17 datasets.
Citations: 0
Automated analysis of pectoralis major thickness in pec-fly exercises: evolving from manual measurement to deep learning techniques
IF 2.8, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-04-16, DOI: 10.1186/s42492-024-00159-6
Shangyu Cai, Yongsheng Lin, Haoxin Chen, Zihao Huang, Yongjin Zhou, Yongping Zheng
Abstract: This study addresses a limitation of prior research on pectoralis major (PMaj) thickness changes during the pectoralis fly exercise using a wearable ultrasound imaging setup. Previous studies relied on manual measurement and subjective evaluation, which limits widespread automated application. We therefore employed a deep learning model for image segmentation and automated measurement, and studied the additional quantitative information it could provide. Our results revealed increased PMaj thickness changes in the coronal plane within the probe detection region when real-time ultrasound imaging (RUSI) visual biofeedback was incorporated, regardless of load intensity (50% or 80% of one-repetition maximum). Participants also showed more uniform PMaj thickness changes in response to enhanced RUSI biofeedback. Notably, RUSI biofeedback reduced the differences in PMaj thickness changes between load intensities, suggesting altered muscle activation strategies. We identified the optimal measurement location for maximal PMaj thickness close to the rib end and emphasize the lightweight applicability of our model for fitness training and muscle assessment. Further studies can refine load intensities, investigate diverse parameters, and employ different network models to enhance accuracy. This study contributes to our understanding of muscle physiology and exercise training.
Citations: 0
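The automated measurement described in the abstract ultimately reads a thickness value out of a segmentation mask. A minimal, hypothetical sketch of that final step (the mask format, pixel spacing, and column-wise thickness definition are assumptions for illustration, not the authors' pipeline):

```python
def thickness_profile(mask, pixel_spacing_mm=0.1):
    # mask: 2D list of 0/1 values (rows x cols) from a segmentation model.
    # Thickness in a column = number of muscle pixels times the pixel spacing.
    cols = len(mask[0])
    profile = []
    for c in range(cols):
        count = sum(row[c] for row in mask)
        profile.append(count * pixel_spacing_mm)
    return profile


def max_thickness(mask, pixel_spacing_mm=0.1):
    # Column index and value of the maximal thickness, analogous to
    # picking the optimal measurement site along the muscle.
    profile = thickness_profile(mask, pixel_spacing_mm)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best, profile[best]
```

Tracking the profile across exercise repetitions would then give the thickness-change curves the study analyzes.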
Three-dimensional reconstruction of industrial parts from a single image
IF 3.2, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-03-27, DOI: 10.1186/s42492-024-00158-7
Zhenxing Xu, Aizeng Wang, Fei Hou, Gang Zhao
Abstract: This study proposes an image-based three-dimensional (3D) vector reconstruction of industrial parts that can generate non-uniform rational B-spline (NURBS) surfaces with high fidelity and flexibility. The contributions are threefold: first, a dataset of two-dimensional images is constructed for typical industrial parts, including hexagonal head bolts, cylindrical gears, shoulder rings, hexagonal nuts, and cylindrical roller bearings; second, a deep learning algorithm is developed for parameter extraction of 3D industrial parts, determining the final 3D parameters and pose of the reconstructed model using two new networks, CAD-ClassNet and CAD-ReconNet; and finally, a 3D vector shape reconstruction of mechanical parts generates NURBS surfaces from the obtained shape parameters. The final reconstructed models show that the proposed approach is highly accurate, efficient, and practical.
(Open access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11329437/pdf/)
Citations: 0
PlaqueNet: deep learning enabled coronary artery plaque segmentation from coronary computed tomography angiography
IF 3.2, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-03-22, DOI: 10.1186/s42492-024-00157-8
Linyuan Wang, Xiaofeng Zhang, Congyu Tian, Shu Chen, Yongzhi Deng, Xiangyun Liao, Qiong Wang, Weixin Si
Abstract: Cardiovascular disease, primarily caused by atherosclerotic plaque formation, is a significant health concern, and early detection of these plaques is crucial for targeted therapies and for reducing the risk of cardiovascular events. This study presents PlaqueNet, a solution for segmenting coronary artery plaques from coronary computed tomography angiography (CCTA) images. For feature extraction, an advanced residual net module integrates a depthwise residual optimization module into the network branches, enhancing feature extraction, avoiding information loss, and addressing gradient issues during training. To improve segmentation accuracy, a depthwise atrous spatial pyramid pooling module based on bicubic efficient channel attention (DASPP-BICECA) is introduced: the BICECA component amplifies local feature sensitivity, whereas the DASPP component expands the network's information-gathering scope, together yielding higher segmentation accuracy. Additionally, BINet, a module for joint network loss evaluation, is proposed; it optimizes the segmentation model without affecting the segmentation results and, combined with the DASPP-BICECA module, enhances overall efficiency. The proposed CCTA segmentation algorithm outperformed the three comparative algorithms, achieving an intersection over union (IoU) of 87.37%, Dice of 93.26%, accuracy of 93.12%, mean IoU of 93.68%, mean Dice of 96.63%, and mean pixel accuracy of 96.55%.
(Open access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11349722/pdf/)
Citations: 0
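The segmentation metrics quoted above (IoU, Dice, pixel accuracy) have standard definitions that are easy to state for binary masks. A small reference sketch (the flat 0/1-list mask format is an assumption chosen for brevity):

```python
def iou(pred, gt):
    # Intersection over Union for two binary masks (flat lists of 0/1)
    inter = sum(p & g for p, g in zip(pred, gt))
    union = sum(p | g for p, g in zip(pred, gt))
    return inter / union if union else 1.0


def dice(pred, gt):
    # Dice coefficient: 2|A ∩ B| / (|A| + |B|)
    inter = sum(p & g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt)
    return 2 * inter / total if total else 1.0


def pixel_accuracy(pred, gt):
    # Fraction of pixels where prediction and ground truth agree
    return sum(p == g for p, g in zip(pred, gt)) / len(pred)
```

Note that Dice is always at least as large as IoU for the same masks, consistent with the 93.26% vs 87.37% figures reported above.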
Flipover outperforms dropout in deep learning
IF 2.8, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-02-22, DOI: 10.1186/s42492-024-00153-y
Yuxuan Liang, Chuang Niu, Pingkun Yan, Ge Wang
Abstract: Flipover, an enhanced dropout technique, is introduced to improve the robustness of artificial neural networks. In contrast to dropout, which randomly removes certain neurons and their connections, flipover randomly selects neurons and reverts their outputs using a negative multiplier during training. This approach offers stronger regularization than conventional dropout, refining model performance by (1) mitigating overfitting, matching or even exceeding the efficacy of dropout; (2) amplifying robustness to noise; and (3) enhancing resilience against adversarial attacks. Extensive experiments across various neural networks affirm the effectiveness of flipover in deep learning.
Citations: 0
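The contrast the abstract draws between dropout and flipover can be sketched in a few lines: dropout zeroes a random subset of activations, while flipover negates them. This is an illustrative toy, not the authors' implementation; the inverted scaling in dropout and the fixed -1 multiplier are assumptions:

```python
import random


def dropout(x, p, training=True):
    # Conventional (inverted) dropout: zero each activation with
    # probability p and scale survivors by 1/(1-p) so the expected
    # value is unchanged. Identity at inference time.
    if not training or p == 0:
        return x
    return [0.0 if random.random() < p else v / (1 - p) for v in x]


def flipover(x, p, multiplier=-1.0, training=True):
    # Flipover, as described in the abstract: instead of removing a
    # selected neuron's output, revert it with a negative multiplier.
    if not training or p == 0:
        return x
    return [multiplier * v if random.random() < p else v for v in x]
```

Because a flipped activation still carries (sign-reversed) signal rather than being silenced, it plausibly perturbs the network more aggressively than zeroing, which matches the paper's claim of stronger regularization.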
Correction: Multi-task approach based on combined CNN-transformer for efficient segmentation and classification of breast tumors in ultrasound images
IF 2.8, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-02-09, DOI: 10.1186/s42492-024-00156-9
Jaouad Tagnamas, Hiba Ramadan, Ali Yahyaouy, Hamid Tairi
(Correction notice; no abstract. Open access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10858012/pdf/)
Citations: 0
Convolutional neural network based data interpretable framework for Alzheimer's treatment planning
IF 3.2, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-02-01, DOI: 10.1186/s42492-024-00154-x
Sazia Parvin, Sonia Farhana Nimmy, Md Sarwar Kamal
Abstract: Alzheimer's disease (AD) is a neurological disorder that predominantly affects the brain. Its prevalence is expected to grow rapidly in the coming years, while diagnostic techniques progress only slowly. Various machine learning (ML) and artificial intelligence (AI) algorithms have been employed to detect AD using single-modality data; recent developments in ML, however, enable these methods to draw on multiple data sources and input modalities for AD prediction. In this study, we developed a framework that uses multimodal data (tabular data, magnetic resonance imaging (MRI) images, and genetic information) to classify AD. In the pre-processing phase, we generated a knowledge graph from the tabular data and MRI images, employing graph neural networks for knowledge-graph creation and a region-based convolutional neural network for image-to-knowledge-graph generation. We also integrated various explainable AI (XAI) techniques to interpret the predictions derived from the multimodal data: layer-wise relevance propagation explains the layer-wise outcomes on the MRI images, and submodular-pick local interpretable model-agnostic explanations interpret the decision-making process on the tabular data. Because genetic expression values play a crucial role in AD analysis, we used a graphical gene tree to identify genes associated with the disease. Moreover, a dashboard was designed to display the XAI outcomes, enabling experts and medical professionals to easily comprehend the prediction results.
(Open access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10830981/pdf/)
Citations: 0
Multi-task approach based on combined CNN-transformer for efficient segmentation and classification of breast tumors in ultrasound images
IF 3.2, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-01-26, DOI: 10.1186/s42492-024-00155-w
Jaouad Tagnamas, Hiba Ramadan, Ali Yahyaouy, Hamid Tairi
Abstract: Accurate segmentation of breast ultrasound (BUS) images is crucial for the early diagnosis and treatment of breast cancer, yet segmenting lesions in BUS images remains challenging because convolutional neural networks (CNNs) struggle to capture long-range dependencies and global context. Recently, ConvNeXts have emerged as a promising CNN architecture, while transformers have demonstrated outstanding performance across diverse computer vision tasks, including medical image analysis. In this paper, we propose CS-Net, a novel breast lesion segmentation network that combines the strengths of ConvNeXt and Swin Transformer models to enhance the U-Net architecture; it operates on BUS images end-to-end. To address the limitations of CNNs, we design a hybrid encoder that incorporates modified ConvNeXt convolutions and Swin Transformer blocks, and we add a Coordinate Attention Module to better capture spatial and channel attention in the feature maps. We further design an Encoder-Decoder Features Fusion Module that fuses low-level encoder features with high-level semantic decoder features during image reconstruction. Experimental results demonstrate the superiority of our network over state-of-the-art image segmentation methods for BUS lesion segmentation.
(Open access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10811315/pdf/)
Citations: 0
CT-based radiomics: predicting early outcomes after percutaneous transluminal renal angioplasty in patients with severe atherosclerotic renal artery stenosis
IF 2.8, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2024-01-12, DOI: 10.1186/s42492-023-00152-5
Jia Fu, Mengjie Fang, Zhiyong Lin, Jianxing Qiu, Min Yang, Jie Tian, Di Dong, Yinghua Zou
Abstract: This study comprehensively evaluated non-contrast computed tomography (CT)-based radiomics for predicting early outcomes in patients with severe atherosclerotic renal artery stenosis (ARAS) after percutaneous transluminal renal angioplasty (PTRA). A total of 52 patients were retrospectively recruited, and their clinical characteristics and pretreatment CT images were collected. During a median follow-up of 3.7 months, 18 patients were confirmed to have benefited from treatment, defined as a 20% improvement from baseline in the estimated glomerular filtration rate. A deep learning network trained via self-supervised learning was used to enhance the imaging phenotype characteristics. Radiomics features, comprising 116 handcrafted and 78 deep learning features, were extracted from the affected renal and perirenal adipose regions; more features from the latter correlated with early outcomes in univariate analysis, as visualized in radiomics heatmaps and volcano plots. After feature selection with consensus clustering and the least absolute shrinkage and selection operator (LASSO), five machine learning models were evaluated. Logistic regression yielded the highest leave-one-out cross-validation accuracy, 0.780 (95%CI: 0.660-0.880), for the renal signature, while a support vector machine achieved 0.865 (95%CI: 0.769-0.942) for the perirenal adipose signature. SHapley Additive exPlanations was used to interpret the prediction mechanism, identifying a histogram feature and a deep learning feature as the most influential factors for the renal and perirenal adipose signatures, respectively. Multivariate analysis revealed that both signatures served as independent predictive factors; combined, they achieved an area under the receiver operating characteristic curve of 0.888 (95%CI: 0.784-0.992), indicating that the imaging phenotypes from the two regions complement each other. In conclusion, non-contrast CT-based radiomics can predict the early outcomes of PTRA and thus help identify patients with ARAS suitable for this treatment, with perirenal adipose tissue providing added predictive value.
(Open access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10784441/pdf/)
Citations: 0
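Leave-one-out cross-validation, used above to estimate accuracy on the 52-patient cohort, follows a simple recipe: train on n-1 samples, test on the held-out one, and repeat for every sample. A generic sketch with a stand-in nearest-centroid classifier (the study used LASSO-selected features with logistic regression and SVM models; the classifier here is purely illustrative):

```python
def nearest_centroid_fit(X, y):
    # Stand-in classifier: the per-class mean of the feature vectors
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda lab: dist2(centroids[lab], x))


def loo_accuracy(X, y):
    # Leave-one-out CV: each sample is held out exactly once
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = nearest_centroid_fit(X_train, y_train)
        correct += nearest_centroid_predict(model, X[i]) == y[i]
    return correct / len(X)
```

Leave-one-out is a natural choice for a cohort of this size, since it uses nearly all 52 patients for training in every fold at the cost of n model fits.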
Adaptive feature extraction method for capsule endoscopy images
IF 2.8, CAS Q4, Computer Science
Visual Computing for Industry Biomedicine and Art, Pub Date: 2023-12-11, DOI: 10.1186/s42492-023-00151-6
Dingchang Wu, Yinghui Wang, Haomiao Ma, Lingyu Ai, Jinlong Yang, Shaojie Zhang, Wei Li
Abstract: The traditional feature-extraction method of oriented FAST and rotated BRIEF (ORB) detects image features using a fixed threshold; moreover, ORB descriptors do not distinguish features well in capsule endoscopy images. Therefore, a new feature detector with a new threshold-setting method, called adaptive threshold FAST and FREAK in capsule endoscopy images (AFFCEI), is proposed. The method first constructs an image pyramid and then calculates per-pixel thresholds from the gray-value contrast of all pixels in each pixel's local neighborhood, achieving adaptive feature extraction in every pyramid layer. The features are then described with the FREAK descriptor, which improves the discrimination of features extracted from stomach images. Finally, refined matching is obtained by applying the grid-based motion statistics algorithm to the Hamming-distance matching result, and remaining mismatches are rejected using the RANSAC algorithm. Compared with ASIFT, which previously had the best performance, the average running time of AFFCEI was 4/5 that of ASIFT, and the average matching score improved by 5% when tracking features in a moving capsule endoscope.
Citations: 0
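The core of AFFCEI's threshold selection, deriving a per-pixel FAST threshold from the gray-value contrast of the pixel's local neighborhood, might look like the following sketch. The choice of contrast statistic (mean absolute deviation) and the scale factor are assumptions; the abstract does not give the paper's exact formula:

```python
def adaptive_fast_threshold(gray, x, y, radius=3, scale=0.5):
    # gray: 2D list of intensities. The threshold at (x, y) is a
    # fraction of the mean absolute deviation from the local mean,
    # a simple measure of gray-value contrast in the neighborhood.
    h, w = len(gray), len(gray[0])
    vals = [gray[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    mean = sum(vals) / len(vals)
    mad = sum(abs(v - mean) for v in vals) / len(vals)
    return scale * mad
```

A low-contrast region (common in smooth gastric mucosa) thus gets a low threshold and still yields corners, whereas a fixed ORB threshold would suppress them; this is the behavior the abstract attributes to the adaptive scheme.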