Deep Learning Techniques for Hand Vein Biometrics: A Comprehensive Review
Mustapha Hemis, Hamza Kheddar, Sami Bourouis, Nasir Saleem
arXiv:2409.07128 (2024-09-11)

Abstract: Biometric authentication has garnered significant attention as a secure and efficient method of identity verification. Among the various modalities, hand vein biometrics, including finger vein, palm vein, and dorsal hand vein recognition, offer unique advantages: high accuracy, low susceptibility to forgery, and non-intrusiveness. The vein patterns within the hand are highly complex and distinct for each individual, making them an ideal biometric identifier. Hand vein recognition is also contactless, enhancing user convenience and hygiene compared to modalities such as fingerprint or iris recognition. Furthermore, the veins lie inside the body, rendering them less susceptible to damage or alteration and thus enhancing the security and reliability of the biometric system. Together, these factors make hand vein biometrics a highly effective and secure method of identity verification. This review covers the latest advancements in deep learning techniques applied to finger vein, palm vein, and dorsal hand vein recognition. It presents the essential fundamentals of hand vein biometrics, summarizes publicly available datasets, and discusses the state-of-the-art metrics used to evaluate the three modalities. It then provides a comprehensive overview of proposed approaches for finger, palm, dorsal, and multimodal vein recognition, offering insights into the best performance achieved, data augmentation techniques, and effective transfer learning methods, along with the associated pretrained deep learning models. Finally, the review addresses open research challenges and outlines future directions, encouraging researchers to improve existing methods and propose innovative techniques.

RICAU-Net: Residual-block Inspired Coordinate Attention U-Net for Segmentation of Small and Sparse Calcium Lesions in Cardiac CT
Doyoung Park, Jinsoo Kim, Qi Chang, Shuang Leng, Liang Zhong, Lohendran Baskaran
arXiv:2409.06993 (2024-09-11)

Abstract: The Agatston score, the sum of the calcification in the four main coronary arteries, has been widely used in the diagnosis of coronary artery disease (CAD). However, many studies have emphasized the importance of the vessel-specific Agatston score, as calcification in a specific vessel is significantly correlated with the occurrence of coronary heart disease (CHD). In this paper, we propose the Residual-block Inspired Coordinate Attention U-Net (RICAU-Net), which incorporates coordinate attention in two distinct ways and uses a customized combo loss function for lesion-specific coronary artery calcium (CAC) segmentation. The approach targets the severe class imbalance associated with small and sparse lesions, particularly CAC in the left main coronary artery (LM), which is generally small and, owing to its anatomy, the scarcest class in the dataset. The proposed method was compared with six other methods using Dice score, precision, and recall. Our approach achieved the highest per-lesion Dice scores for all four lesions, most markedly for CAC in the LM. Ablation studies demonstrated that both the positional information from coordinate attention and the customized loss function are significant for segmenting small, sparse lesions under high class imbalance.

CWT-Net: Super-resolution of Histopathology Images Using a Cross-scale Wavelet-based Transformer
Feiyang Jia, Zhineng Chen, Ziying Song, Lin Liu, Caiyan Jia
arXiv:2409.07092 (2024-09-11)

Abstract: Super-resolution (SR) aims to enhance the quality of low-resolution images and has been widely applied in medical imaging. We find that the design principles of most existing methods are shaped by SR tasks on real-world images and do not account for the multi-level structure of pathological images, even though such methods can achieve respectable objective metric scores. In this work, we examine two super-resolution working paradigms and propose a novel network, CWT-Net, which leverages a cross-scale image wavelet transform and a Transformer architecture. The network consists of two branches: one dedicated to learning super-resolution and the other to high-frequency wavelet features. To generate high-resolution histopathology images, the Transformer module shares and fuses features from both branches at multiple stages. Notably, we design a specialized wavelet reconstruction module that enhances the wavelet-domain features and lets the network operate in different modes, allowing additional relevant information to be introduced from cross-scale images. Experimental results demonstrate that our model significantly outperforms state-of-the-art methods in both quantitative and visual evaluations and can substantially boost the accuracy of image diagnostic networks.
{"title":"3DGCQA: A Quality Assessment Database for 3D AI-Generated Contents","authors":"Yingjie Zhou, Zicheng Zhang, Farong Wen, Jun Jia, Yanwei Jiang, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai","doi":"arxiv-2409.07236","DOIUrl":"https://doi.org/arxiv-2409.07236","url":null,"abstract":"Although 3D generated content (3DGC) offers advantages in reducing production\u0000costs and accelerating design timelines, its quality often falls short when\u0000compared to 3D professionally generated content. Common quality issues\u0000frequently affect 3DGC, highlighting the importance of timely and effective\u0000quality assessment. Such evaluations not only ensure a higher standard of 3DGCs\u0000for end-users but also provide critical insights for advancing generative\u0000technologies. To address existing gaps in this domain, this paper introduces a\u0000novel 3DGC quality assessment dataset, 3DGCQA, built using 7 representative\u0000Text-to-3D generation methods. During the dataset's construction, 50 fixed\u0000prompts are utilized to generate contents across all methods, resulting in the\u0000creation of 313 textured meshes that constitute the 3DGCQA dataset. The\u0000visualization intuitively reveals the presence of 6 common distortion\u0000categories in the generated 3DGCs. To further explore the quality of the 3DGCs,\u0000subjective quality assessment is conducted by evaluators, whose ratings reveal\u0000significant variation in quality across different generation methods.\u0000Additionally, several objective quality assessment algorithms are tested on the\u00003DGCQA dataset. The results expose limitations in the performance of existing\u0000algorithms and underscore the need for developing more specialized quality\u0000assessment methods. To provide a valuable resource for future research and\u0000development in 3D content generation and quality assessment, the dataset has\u0000been open-sourced in https://github.com/zyj-2000/3DGCQA.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142200246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying Knee Cartilage Shape and Lesion: From Image to Metrics","authors":"Yongcheng Yao, Weitian Chen","doi":"arxiv-2409.07361","DOIUrl":"https://doi.org/arxiv-2409.07361","url":null,"abstract":"Imaging features of knee articular cartilage have been shown to be potential\u0000imaging biomarkers for knee osteoarthritis. Despite recent methodological\u0000advancements in image analysis techniques like image segmentation,\u0000registration, and domain-specific image computing algorithms, only a few works\u0000focus on building fully automated pipelines for imaging feature extraction. In\u0000this study, we developed a deep-learning-based medical image analysis\u0000application for knee cartilage morphometrics, CartiMorph Toolbox (CMT). We\u0000proposed a 2-stage joint template learning and registration network, CMT-reg.\u0000We trained the model using the OAI-ZIB dataset and assessed its performance in\u0000template-to-image registration. The CMT-reg demonstrated competitive results\u0000compared to other state-of-the-art models. We integrated the proposed model\u0000into an automated pipeline for the quantification of cartilage shape and lesion\u0000(full-thickness cartilage loss, specifically). The toolbox provides a\u0000comprehensive, user-friendly solution for medical image analysis and data\u0000visualization. The software and models are available at\u0000https://github.com/YongchengYAO/CMT-AMAI24paper .","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142200244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging","authors":"Yunzhen Wang, Haijin Zeng, Shaoguang Huang, Hongyu Chen, Hongyan Zhang","doi":"arxiv-2409.07417","DOIUrl":"https://doi.org/arxiv-2409.07417","url":null,"abstract":"Coded Aperture Snapshot Spectral Imaging (CASSI) is a crucial technique for\u0000capturing three-dimensional multispectral images (MSIs) through the complex\u0000inverse task of reconstructing these images from coded two-dimensional\u0000measurements. Current state-of-the-art methods, predominantly end-to-end, face\u0000limitations in reconstructing high-frequency details and often rely on\u0000constrained datasets like KAIST and CAVE, resulting in models with poor\u0000generalizability. In response to these challenges, this paper introduces a\u0000novel one-step Diffusion Probabilistic Model within a self-supervised\u0000adaptation framework for Snapshot Compressive Imaging (SCI). Our approach\u0000leverages a pretrained SCI reconstruction network to generate initial\u0000predictions from two-dimensional measurements. Subsequently, a one-step\u0000diffusion model produces high-frequency residuals to enhance these initial\u0000predictions. Additionally, acknowledging the high costs associated with\u0000collecting MSIs, we develop a self-supervised paradigm based on the Equivariant\u0000Imaging (EI) framework. Experimental results validate the superiority of our\u0000model compared to previous methods, showcasing its simplicity and adaptability\u0000to various end-to-end or unfolding techniques.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142200243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Predicting Temporal Changes in a Patient's Chest X-ray Images based on Electronic Health Records","authors":"Daeun Kyung, Junu Kim, Tackeun Kim, Edward Choi","doi":"arxiv-2409.07012","DOIUrl":"https://doi.org/arxiv-2409.07012","url":null,"abstract":"Chest X-ray imaging (CXR) is an important diagnostic tool used in hospitals\u0000to assess patient conditions and monitor changes over time. Generative models,\u0000specifically diffusion-based models, have shown promise in generating realistic\u0000synthetic X-rays. However, these models mainly focus on conditional generation\u0000using single-time-point data, i.e., typically CXRs taken at a specific time\u0000with their corresponding reports, limiting their clinical utility, particularly\u0000for capturing temporal changes. To address this limitation, we propose a novel\u0000framework, EHRXDiff, which predicts future CXR images by integrating previous\u0000CXRs with subsequent medical events, e.g., prescriptions, lab measures, etc.\u0000Our framework dynamically tracks and predicts disease progression based on a\u0000latent diffusion model, conditioned on the previous CXR image and a history of\u0000medical events. We comprehensively evaluate the performance of our framework\u0000across three key aspects, including clinical consistency, demographic\u0000consistency, and visual realism. We demonstrate that our framework generates\u0000high-quality, realistic future images that capture potential temporal changes,\u0000suggesting its potential for further development as a clinical simulation tool.\u0000This could offer valuable insights for patient monitoring and treatment\u0000planning in the medical field.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142200266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Attention Down-Sampling Transformer, Relative Ranking and Self-Consistency for Blind Image Quality Assessment
Mohammed Alsaafin, Musab Alsheikh, Saeed Anwar, Muhammad Usman
arXiv:2409.07115 (2024-09-11)

Abstract: No-reference image quality assessment (NR-IQA) is a challenging domain that addresses estimating image quality without the original reference. We introduce an improved mechanism to extract local and non-local information from images via transformer encoders and CNNs. The transformer encoders mitigate locality bias and generate a non-local representation by sequentially processing CNN features, which inherently capture local visual structure. A stronger connection between subjective and objective assessments is established by sorting images within batches based on relative distance information. We also present a self-consistency approach to self-supervision that explicitly addresses the degradation of NR-IQA models under equivariant transformations: the model is kept robust by enforcing consistency between an image and its horizontally flipped equivalent. In empirical evaluation on five popular image quality assessment datasets, the proposed model outperforms alternative NR-IQA algorithms, especially on smaller datasets. Code is available at https://github.com/mas94/ADTRS.
{"title":"Event-based Mosaicing Bundle Adjustment","authors":"Shuang Guo, Guillermo Gallego","doi":"arxiv-2409.07365","DOIUrl":"https://doi.org/arxiv-2409.07365","url":null,"abstract":"We tackle the problem of mosaicing bundle adjustment (i.e., simultaneous\u0000refinement of camera orientations and scene map) for a purely rotating event\u0000camera. We formulate the problem as a regularized non-linear least squares\u0000optimization. The objective function is defined using the linearized event\u0000generation model in the camera orientations and the panoramic gradient map of\u0000the scene. We show that this BA optimization has an exploitable block-diagonal\u0000sparsity structure, so that the problem can be solved efficiently. To the best\u0000of our knowledge, this is the first work to leverage such sparsity to speed up\u0000the optimization in the context of event-based cameras, without the need to\u0000convert events into image-like representations. We evaluate our method, called\u0000EMBA, on both synthetic and real-world datasets to show its effectiveness (50%\u0000photometric error decrease), yielding results of unprecedented quality. In\u0000addition, we demonstrate EMBA using high spatial resolution event cameras,\u0000yielding delicate panoramas in the wild, even without an initial map. Project\u0000page: https://github.com/tub-rip/emba","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142200281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Assessment of Feature Detection Methods for 2-D FS Sonar Imagery","authors":"Hitesh Kyatham, Shahriar Negahdaripour, Michael Xu, Xiaomin Lin, Miao Yu, Yiannis Aloimonos","doi":"arxiv-2409.07004","DOIUrl":"https://doi.org/arxiv-2409.07004","url":null,"abstract":"Underwater robot perception is crucial in scientific subsea exploration and\u0000commercial operations. The key challenges include non-uniform lighting and poor\u0000visibility in turbid environments. High-frequency forward-look sonar cameras\u0000address these issues, by providing high-resolution imagery at maximum range of\u0000tens of meters, despite complexities posed by high degree of speckle noise, and\u0000lack of color and texture. In particular, robust feature detection is an\u0000essential initial step for automated object recognition, localization,\u0000navigation, and 3-D mapping. Various local feature detectors developed for RGB\u0000images are not well-suited for sonar data. To assess their performances, we\u0000evaluate a number of feature detectors using real sonar images from five\u0000different sonar devices. Performance metrics such as detection accuracy, false\u0000positives, and robustness to variations in target characteristics and sonar\u0000devices are applied to analyze the experimental results. The study would\u0000provide a deeper insight into the bottlenecks of feature detection for sonar\u0000data, and developing more effective methods","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142200261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}