arXiv - EE - Image and Video Processing: Latest Articles

NT-ViT: Neural Transcoding Vision Transformers for EEG-to-fMRI Synthesis
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.11836
Romeo Lanzino, Federico Fontana, Luigi Cinque, Francesco Scarcello, Atsuto Maki
{"title":"NT-ViT: Neural Transcoding Vision Transformers for EEG-to-fMRI Synthesis","authors":"Romeo Lanzino, Federico Fontana, Luigi Cinque, Francesco Scarcello, Atsuto Maki","doi":"arxiv-2409.11836","DOIUrl":"https://doi.org/arxiv-2409.11836","url":null,"abstract":"This paper introduces the Neural Transcoding Vision Transformer (modelname),\u0000a generative model designed to estimate high-resolution functional Magnetic\u0000Resonance Imaging (fMRI) samples from simultaneous Electroencephalography (EEG)\u0000data. A key feature of modelname is its Domain Matching (DM) sub-module which\u0000effectively aligns the latent EEG representations with those of fMRI volumes,\u0000enhancing the model's accuracy and reliability. Unlike previous methods that\u0000tend to struggle with fidelity and reproducibility of images, modelname\u0000addresses these challenges by ensuring methodological integrity and\u0000higher-quality reconstructions which we showcase through extensive evaluation\u0000on two benchmark datasets; modelname outperforms the current state-of-the-art\u0000by a significant margin in both cases, e.g. achieving a $10times$ reduction in\u0000RMSE and a $3.14times$ increase in SSIM on the Oddball dataset. An ablation\u0000study also provides insights into the contribution of each component to the\u0000model's overall effectiveness. This development is critical in offering a new\u0000approach to lessen the time and financial constraints typically linked with\u0000high-resolution brain imaging, thereby aiding in the swift and precise\u0000diagnosis of neurological disorders. Although it is not a replacement for\u0000actual fMRI but rather a step towards making such imaging more accessible, we\u0000believe that it represents a pivotal advancement in clinical practice and\u0000neuroscience research. Code is available at\u0000url{https://github.com/rom42pla/ntvit}.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
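For intuition on the Domain Matching idea described above, aligning latent EEG representations with fMRI latents, a minimal alignment loss can be sketched as below. This is a hypothetical simplification, not the NT-ViT implementation; the projection dimensions, the shared embedding size, and the use of an MSE term on normalized projections are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DomainMatchingLoss(nn.Module):
    """Toy domain-matching loss: project EEG and fMRI latents into a shared
    space and penalize their distance (assumption: MSE on L2-normalized
    projections; the paper's DM sub-module may differ)."""

    def __init__(self, eeg_dim: int, fmri_dim: int, shared_dim: int = 256):
        super().__init__()
        self.eeg_proj = nn.Linear(eeg_dim, shared_dim)
        self.fmri_proj = nn.Linear(fmri_dim, shared_dim)

    def forward(self, eeg_latent: torch.Tensor, fmri_latent: torch.Tensor) -> torch.Tensor:
        z_eeg = F.normalize(self.eeg_proj(eeg_latent), dim=-1)
        z_fmri = F.normalize(self.fmri_proj(fmri_latent), dim=-1)
        return F.mse_loss(z_eeg, z_fmri)

# Usage: such a term would typically be added to the main reconstruction loss.
dm_loss = DomainMatchingLoss(eeg_dim=512, fmri_dim=1024)   # hypothetical dims
loss = dm_loss(torch.randn(8, 512), torch.randn(8, 1024))
```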
World of Forms: Deformable Geometric Templates for One-Shot Surface Meshing in Coronary CT Angiography
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.11837
Rudolf L. M. van Herten, Ioannis Lagogiannis, Jelmer M. Wolterink, Steffen Bruns, Eva R. Meulendijks, Damini Dey, Joris R. de Groot, José P. Henriques, R. Nils Planken, Simone Saitta, Ivana Išgum
{"title":"World of Forms: Deformable Geometric Templates for One-Shot Surface Meshing in Coronary CT Angiography","authors":"Rudolf L. M. van Herten, Ioannis Lagogiannis, Jelmer M. Wolterink, Steffen Bruns, Eva R. Meulendijks, Damini Dey, Joris R. de Groot, José P. Henriques, R. Nils Planken, Simone Saitta, Ivana Išgum","doi":"arxiv-2409.11837","DOIUrl":"https://doi.org/arxiv-2409.11837","url":null,"abstract":"Deep learning-based medical image segmentation and surface mesh generation\u0000typically involve a sequential pipeline from image to segmentation to meshes,\u0000often requiring large training datasets while making limited use of prior\u0000geometric knowledge. This may lead to topological inconsistencies and\u0000suboptimal performance in low-data regimes. To address these challenges, we\u0000propose a data-efficient deep learning method for direct 3D anatomical object\u0000surface meshing using geometric priors. Our approach employs a multi-resolution\u0000graph neural network that operates on a prior geometric template which is\u0000deformed to fit object boundaries of interest. We show how different templates\u0000may be used for the different surface meshing targets, and introduce a novel\u0000masked autoencoder pretraining strategy for 3D spherical data. The proposed\u0000method outperforms nnUNet in a one-shot setting for segmentation of the\u0000pericardium, left ventricle (LV) cavity and the LV myocardium. Similarly, the\u0000method outperforms other lumen segmentation operating on multi-planar\u0000reformatted images. Results further indicate that mesh quality is on par with\u0000or improves upon marching cubes post-processing of voxel mask predictions,\u0000while remaining flexible in the choice of mesh triangulation prior, thus paving\u0000the way for more accurate and topologically consistent 3D medical object\u0000surface meshing.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142263131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
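The central mechanism of the method summarized above, deforming a prior geometric template by per-vertex displacements predicted from image-derived features, can be sketched as follows. This is a generic illustration under assumed feature dimensions and a plain MLP head, not the paper's multi-resolution graph neural network.

```python
import torch
import torch.nn as nn

class TemplateDeformer(nn.Module):
    """Toy template-deformation head: predict a 3D offset for every vertex of
    a fixed geometric template (e.g. a sphere) from per-vertex features."""

    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 3)
        )

    def forward(self, template_vertices: torch.Tensor, vertex_features: torch.Tensor) -> torch.Tensor:
        # template_vertices: (V, 3), vertex_features: (V, feat_dim)
        offsets = self.head(vertex_features)   # per-vertex displacement (V, 3)
        return template_vertices + offsets     # deformed surface mesh vertices

# Usage with a dummy template and dummy image-sampled features (shapes are assumptions).
V = 2562                                       # e.g. an icosphere vertex count
deformer = TemplateDeformer(feat_dim=64)
mesh_vertices = deformer(torch.randn(V, 3), torch.randn(V, 64))
```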
Autopet III challenge: Incorporating anatomical knowledge into nnUNet for lesion segmentation in PET/CT
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.12155
Hamza Kalisch, Fabian Hörst, Ken Herrmann, Jens Kleesiek, Constantin Seibold
{"title":"Autopet III challenge: Incorporating anatomical knowledge into nnUNet for lesion segmentation in PET/CT","authors":"Hamza Kalisch, Fabian Hörst, Ken Herrmann, Jens Kleesiek, Constantin Seibold","doi":"arxiv-2409.12155","DOIUrl":"https://doi.org/arxiv-2409.12155","url":null,"abstract":"Lesion segmentation in PET/CT imaging is essential for precise tumor\u0000characterization, which supports personalized treatment planning and enhances\u0000diagnostic precision in oncology. However, accurate manual segmentation of\u0000lesions is time-consuming and prone to inter-observer variability. Given the\u0000rising demand and clinical use of PET/CT, automated segmentation methods,\u0000particularly deep-learning-based approaches, have become increasingly more\u0000relevant. The autoPET III Challenge focuses on advancing automated segmentation\u0000of tumor lesions in PET/CT images in a multitracer multicenter setting,\u0000addressing the clinical need for quantitative, robust, and generalizable\u0000solutions. Building on previous challenges, the third iteration of the autoPET\u0000challenge introduces a more diverse dataset featuring two different tracers\u0000(FDG and PSMA) from two clinical centers. To this extent, we developed a\u0000classifier that identifies the tracer of the given PET/CT based on the Maximum\u0000Intensity Projection of the PET scan. We trained two individual\u0000nnUNet-ensembles for each tracer where anatomical labels are included as a\u0000multi-label task to enhance the model's performance. Our final submission\u0000achieves cross-validation Dice scores of 76.90% and 61.33% for the publicly\u0000available FDG and PSMA datasets, respectively. The code is available at\u0000https://github.com/hakal104/autoPETIII/ .","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142263127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
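The tracer classifier described above operates on the Maximum Intensity Projection (MIP) of the PET scan; a MIP is a simple reduction along one spatial axis. The snippet below is a generic illustration, with the array layout and projection axis chosen as assumptions.

```python
import numpy as np

def maximum_intensity_projection(pet_volume: np.ndarray, axis: int = 1) -> np.ndarray:
    """Collapse a 3D PET volume (e.g. shaped (z, y, x)) into a 2D MIP by
    taking the maximum intensity along one spatial axis."""
    return pet_volume.max(axis=axis)

# Usage: a dummy volume stands in for a resampled PET scan.
volume = np.random.rand(256, 200, 200).astype(np.float32)
mip = maximum_intensity_projection(volume)     # shape (256, 200)
```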
Few-Shot Learning Approach on Tuberculosis Classification Based on Chest X-Ray Images
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.11644
A. A. G. Yogi Pramana, Faiz Ihza Permana, Muhammad Fazil Maulana, Dzikri Rahadian Fudholi
{"title":"Few-Shot Learning Approach on Tuberculosis Classification Based on Chest X-Ray Images","authors":"A. A. G. Yogi Pramana, Faiz Ihza Permana, Muhammad Fazil Maulana, Dzikri Rahadian Fudholi","doi":"arxiv-2409.11644","DOIUrl":"https://doi.org/arxiv-2409.11644","url":null,"abstract":"Tuberculosis (TB) is caused by the bacterium Mycobacterium tuberculosis,\u0000primarily affecting the lungs. Early detection is crucial for improving\u0000treatment effectiveness and reducing transmission risk. Artificial intelligence\u0000(AI), particularly through image classification of chest X-rays, can assist in\u0000TB detection. However, class imbalance in TB chest X-ray datasets presents a\u0000challenge for accurate classification. In this paper, we propose a few-shot\u0000learning (FSL) approach using the Prototypical Network algorithm to address\u0000this issue. We compare the performance of ResNet-18, ResNet-50, and VGG16 in\u0000feature extraction from the TBX11K Chest X-ray dataset. Experimental results\u0000demonstrate classification accuracies of 98.93% for ResNet-18, 98.60% for\u0000ResNet-50, and 33.33% for VGG16. These findings indicate that the proposed\u0000method outperforms others in mitigating data imbalance, which is particularly\u0000beneficial for disease classification applications.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
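Prototypical Networks, as used above, classify a query image by its distance to class prototypes, i.e. the mean embedding of each class's support examples. The sketch below shows that step generically; the embedding dimension and the squared Euclidean distance are standard choices, not details confirmed by this paper.

```python
import torch

def prototypical_logits(support: torch.Tensor, support_labels: torch.Tensor,
                        query: torch.Tensor, n_classes: int) -> torch.Tensor:
    """Compute class prototypes as mean support embeddings and return
    negative squared Euclidean distances as classification logits."""
    prototypes = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_classes)]
    )                                                   # (n_classes, d)
    return -torch.cdist(query, prototypes).pow(2)       # (n_query, n_classes)

# Usage: a 2-way episode with dummy embeddings standing in for backbone features.
support = torch.randn(10, 512)
labels = torch.tensor([0] * 5 + [1] * 5)
query = torch.randn(4, 512)
print(prototypical_logits(support, labels, query, n_classes=2).argmax(dim=1))
```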
ABHINAW: A method for Automatic Evaluation of Typography within AI-Generated Images
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.11874
Abhinaw Jagtap, Nachiket Tapas, R. G. Brajesh
{"title":"ABHINAW: A method for Automatic Evaluation of Typography within AI-Generated Images","authors":"Abhinaw Jagtap, Nachiket Tapas, R. G. Brajesh","doi":"arxiv-2409.11874","DOIUrl":"https://doi.org/arxiv-2409.11874","url":null,"abstract":"In the fast-evolving field of Generative AI, platforms like MidJourney,\u0000DALL-E, and Stable Diffusion have transformed Text-to-Image (T2I) Generation.\u0000However, despite their impressive ability to create high-quality images, they\u0000often struggle to generate accurate text within these images. Theoretically, if\u0000we could achieve accurate text generation in AI images in a ``zero-shot''\u0000manner, it would not only make AI-generated images more meaningful but also\u0000democratize the graphic design industry. The first step towards this goal is to\u0000create a robust scoring matrix for evaluating text accuracy in AI-generated\u0000images. Although there are existing bench-marking methods like CLIP SCORE and\u0000T2I-CompBench++, there's still a gap in systematically evaluating text and\u0000typography in AI-generated images, especially with diffusion-based methods. In\u0000this paper, we introduce a novel evaluation matrix designed explicitly for\u0000quantifying the performance of text and typography generation within\u0000AI-generated images. We have used letter by letter matching strategy to compute\u0000the exact matching scores from the reference text to the AI generated text. Our\u0000novel approach to calculate the score takes care of multiple redundancies such\u0000as repetition of words, case sensitivity, mixing of words, irregular\u0000incorporation of letters etc. Moreover, we have developed a Novel method named\u0000as brevity adjustment to handle excess text. In addition we have also done a\u0000quantitative analysis of frequent errors arise due to frequently used words and\u0000less frequently used words. Project page is available at:\u0000https://github.com/Abhinaw3906/ABHINAW-MATRIX.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
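A letter-by-letter matching score of the kind described above can be illustrated in a few lines. The function below is a deliberately simplified, hypothetical version (case-insensitive, position-wise comparison); the paper's scoring matrix and brevity adjustment handle more cases.

```python
def letter_match_score(reference: str, generated: str) -> float:
    """Toy letter-by-letter matching score: fraction of reference characters
    reproduced at the same position in the generated text (case-insensitive).
    Extra generated characters beyond the reference length are ignored here;
    the paper handles excess text with a separate brevity adjustment."""
    ref = reference.lower()
    gen = generated.lower()
    if not ref:
        return 0.0
    matches = sum(1 for i, ch in enumerate(ref) if i < len(gen) and gen[i] == ch)
    return matches / len(ref)

# Usage: a positional mismatch lowers the score.
print(letter_match_score("HELLO WORLD", "HELO WORLD"))
```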
Multi-Sensor Deep Learning for Glacier Mapping
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.12034
Codruţ-Andrei Diaconu, Konrad Heidler, Jonathan L. Bamber, Harry Zekollari
{"title":"Multi-Sensor Deep Learning for Glacier Mapping","authors":"Codruţ-Andrei Diaconu, Konrad Heidler, Jonathan L. Bamber, Harry Zekollari","doi":"arxiv-2409.12034","DOIUrl":"https://doi.org/arxiv-2409.12034","url":null,"abstract":"The more than 200,000 glaciers outside the ice sheets play a crucial role in\u0000our society by influencing sea-level rise, water resource management, natural\u0000hazards, biodiversity, and tourism. However, only a fraction of these glaciers\u0000benefit from consistent and detailed in-situ observations that allow for\u0000assessing their status and changes over time. This limitation can, in part, be\u0000overcome by relying on satellite-based Earth Observation techniques.\u0000Satellite-based glacier mapping applications have historically mainly relied on\u0000manual and semi-automatic detection methods, while recently, a fast and notable\u0000transition to deep learning techniques has started. This chapter reviews how combining multi-sensor remote sensing data and deep\u0000learning allows us to better delineate (i.e. map) glaciers and detect their\u0000temporal changes. We explain how relying on deep learning multi-sensor\u0000frameworks to map glaciers benefits from the extensive availability of regional\u0000and global glacier inventories. We also analyse the rationale behind glacier\u0000mapping, the benefits of deep learning methodologies, and the inherent\u0000challenges in integrating multi-sensor earth observation data with deep\u0000learning algorithms. While our review aims to provide a broad overview of glacier mapping efforts,\u0000we highlight a few setups where deep learning multi-sensor remote sensing\u0000applications have a considerable potential added value. This includes\u0000applications for debris-covered and rock glaciers that are visually difficult\u0000to distinguish from surroundings and for calving glaciers that are in contact\u0000with the ocean. These specific cases are illustrated through a series of visual\u0000imageries, highlighting some significant advantages and challenges when\u0000detecting glacier changes, including dealing with seasonal snow cover, changing\u0000debris coverage, and distinguishing glacier fronts from the surrounding sea\u0000ice.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
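As a minimal illustration of the multi-sensor idea discussed above, co-registered inputs from different sensors can be fused by channel concatenation before a segmentation network (early fusion). The sensor types and band counts below are placeholders, not the chapter's actual setup.

```python
import torch

# Hypothetical example: fuse co-registered optical and SAR patches by channel
# concatenation, producing one multi-sensor input tensor for a segmentation model.
optical = torch.rand(1, 4, 256, 256)   # e.g. 4 optical bands (assumption)
sar = torch.rand(1, 2, 256, 256)       # e.g. 2 SAR polarizations (assumption)
multi_sensor_input = torch.cat([optical, sar], dim=1)   # shape (1, 6, 256, 256)
print(multi_sensor_input.shape)
```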
LFIC-DRASC: Deep Light Field Image Compression Using Disentangled Representation and Asymmetrical Strip Convolution
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.11711
Shiyu Feng, Yun Zhang, Linwei Zhu, Sam Kwong
{"title":"LFIC-DRASC: Deep Light Field Image Compression Using Disentangled Representation and Asymmetrical Strip Convolution","authors":"Shiyu Feng, Yun Zhang, Linwei Zhu, Sam Kwong","doi":"arxiv-2409.11711","DOIUrl":"https://doi.org/arxiv-2409.11711","url":null,"abstract":"Light-Field (LF) image is emerging 4D data of light rays that is capable of\u0000realistically presenting spatial and angular information of 3D scene. However,\u0000the large data volume of LF images becomes the most challenging issue in\u0000real-time processing, transmission, and storage. In this paper, we propose an\u0000end-to-end deep LF Image Compression method Using Disentangled Representation\u0000and Asymmetrical Strip Convolution (LFIC-DRASC) to improve coding efficiency.\u0000Firstly, we formulate the LF image compression problem as learning a\u0000disentangled LF representation network and an image encoding-decoding network.\u0000Secondly, we propose two novel feature extractors that leverage the structural\u0000prior of LF data by integrating features across different dimensions.\u0000Meanwhile, disentangled LF representation network is proposed to enhance the LF\u0000feature disentangling and decoupling. Thirdly, we propose the LFIC-DRASC for LF\u0000image compression, where two Asymmetrical Strip Convolution (ASC) operators,\u0000i.e. horizontal and vertical, are proposed to capture long-range correlation in\u0000LF feature space. These two ASC operators can be combined with the square\u0000convolution to further decouple LF features, which enhances the model ability\u0000in representing intricate spatial relationships. Experimental results\u0000demonstrate that the proposed LFIC-DRASC achieves an average of 20.5% bit rate\u0000reductions comparing with the state-of-the-art methods.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142263132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
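Asymmetrical strip convolutions are commonly realized as paired 1 x k and k x 1 kernels. The module below is a generic sketch of that idea combined with a square convolution branch; the kernel size, channel count, and summation-based combination are assumptions, not the LFIC-DRASC architecture.

```python
import torch
import torch.nn as nn

class AsymmetricalStripConv(nn.Module):
    """Sketch of paired horizontal (1 x k) and vertical (k x 1) strip
    convolutions whose outputs are summed with a square 3 x 3 branch."""

    def __init__(self, channels: int, k: int = 7):
        super().__init__()
        self.horizontal = nn.Conv2d(channels, channels, kernel_size=(1, k), padding=(0, k // 2))
        self.vertical = nn.Conv2d(channels, channels, kernel_size=(k, 1), padding=(k // 2, 0))
        self.square = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each branch preserves spatial size; the sum mixes long-range
        # horizontal/vertical context with local square context.
        return self.horizontal(x) + self.vertical(x) + self.square(x)

# Usage on a dummy feature map.
block = AsymmetricalStripConv(channels=32)
y = block(torch.randn(1, 32, 64, 64))   # shape preserved: (1, 32, 64, 64)
```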
multiPI-TransBTS: A Multi-Path Learning Framework for Brain Tumor Image Segmentation Based on Multi-Physical Information
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.12167
Hongjun Zhu, Jiaohang Huang, Kuo Chen, Xuehui Ying, Ying Qian
{"title":"multiPI-TransBTS: A Multi-Path Learning Framework for Brain Tumor Image Segmentation Based on Multi-Physical Information","authors":"Hongjun Zhu, Jiaohang Huang, Kuo Chen, Xuehui Ying, Ying Qian","doi":"arxiv-2409.12167","DOIUrl":"https://doi.org/arxiv-2409.12167","url":null,"abstract":"Brain Tumor Segmentation (BraTS) plays a critical role in clinical diagnosis,\u0000treatment planning, and monitoring the progression of brain tumors. However,\u0000due to the variability in tumor appearance, size, and intensity across\u0000different MRI modalities, automated segmentation remains a challenging task. In\u0000this study, we propose a novel Transformer-based framework, multiPI-TransBTS,\u0000which integrates multi-physical information to enhance segmentation accuracy.\u0000The model leverages spatial information, semantic information, and multi-modal\u0000imaging data, addressing the inherent heterogeneity in brain tumor\u0000characteristics. The multiPI-TransBTS framework consists of an encoder, an\u0000Adaptive Feature Fusion (AFF) module, and a multi-source, multi-scale feature\u0000decoder. The encoder incorporates a multi-branch architecture to separately\u0000extract modality-specific features from different MRI sequences. The AFF module\u0000fuses information from multiple sources using channel-wise and element-wise\u0000attention, ensuring effective feature recalibration. The decoder combines both\u0000common and task-specific features through a Task-Specific Feature Introduction\u0000(TSFI) strategy, producing accurate segmentation outputs for Whole Tumor (WT),\u0000Tumor Core (TC), and Enhancing Tumor (ET) regions. Comprehensive evaluations on\u0000the BraTS2019 and BraTS2020 datasets demonstrate the superiority of\u0000multiPI-TransBTS over the state-of-the-art methods. The model consistently\u0000achieves better Dice coefficients, Hausdorff distances, and Sensitivity scores,\u0000highlighting its effectiveness in addressing the BraTS challenges. Our results\u0000also indicate the need for further exploration of the balance between precision\u0000and recall in the ET segmentation task. The proposed framework represents a\u0000significant advancement in BraTS, with potential implications for improving\u0000clinical outcomes for brain tumor patients.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142263126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
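Fusing features with channel-wise and element-wise attention, as the AFF module above is described as doing, can be sketched generically. The block below is an illustrative composition (a global-pooling channel gate plus a per-voxel gate), not the multiPI-TransBTS module itself; all layer choices are assumptions.

```python
import torch
import torch.nn as nn

class SimpleAttentionFusion(nn.Module):
    """Toy fusion of two volumetric feature maps: a channel-wise gate from
    globally pooled statistics, combined with an element-wise (per-voxel) gate."""

    def __init__(self, channels: int):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv3d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.mix = nn.Conv3d(2 * channels, channels, kernel_size=1)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        x = torch.cat([a, b], dim=1)     # concatenate the two sources along channels
        fused = self.mix(x)
        return fused * self.channel_gate(x) * self.spatial_gate(x)

# Usage on dummy 3D (volumetric) features.
fusion = SimpleAttentionFusion(channels=16)
out = fusion(torch.randn(1, 16, 8, 32, 32), torch.randn(1, 16, 8, 32, 32))
```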
Denoising diffusion models for high-resolution microscopy image restoration
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.12078
Pamela Osuna-Vargas, Maren H. Wehrheim, Lucas Zinz, Johanna Rahm, Ashwin Balakrishnan, Alexandra Kaminer, Mike Heilemann, Matthias Kaschube
{"title":"Denoising diffusion models for high-resolution microscopy image restoration","authors":"Pamela Osuna-Vargas, Maren H. Wehrheim, Lucas Zinz, Johanna Rahm, Ashwin Balakrishnan, Alexandra Kaminer, Mike Heilemann, Matthias Kaschube","doi":"arxiv-2409.12078","DOIUrl":"https://doi.org/arxiv-2409.12078","url":null,"abstract":"Advances in microscopy imaging enable researchers to visualize structures at\u0000the nanoscale level thereby unraveling intricate details of biological\u0000organization. However, challenges such as image noise, photobleaching of\u0000fluorophores, and low tolerability of biological samples to high light doses\u0000remain, restricting temporal resolutions and experiment durations. Reduced\u0000laser doses enable longer measurements at the cost of lower resolution and\u0000increased noise, which hinders accurate downstream analyses. Here we train a\u0000denoising diffusion probabilistic model (DDPM) to predict high-resolution\u0000images by conditioning the model on low-resolution information. Additionally,\u0000the probabilistic aspect of the DDPM allows for repeated generation of images\u0000that tend to further increase the signal-to-noise ratio. We show that our model\u0000achieves a performance that is better or similar to the previously\u0000best-performing methods, across four highly diverse datasets. Importantly,\u0000while any of the previous methods show competitive performance for some, but\u0000not all datasets, our method consistently achieves high performance across all\u0000four data sets, suggesting high generalizability.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142263128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
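One simple way to exploit the repeated-generation property mentioned above is to draw several conditional samples and average them; whether the paper aggregates samples exactly this way is not stated here, so treat the helper below as an assumption. The `sample_fn` callable is a placeholder for any trained conditional sampler.

```python
import torch
from typing import Callable

def averaged_restoration(
    low_res: torch.Tensor,
    sample_fn: Callable[[torch.Tensor], torch.Tensor],
    n_samples: int = 8,
) -> torch.Tensor:
    """Draw several stochastic restorations conditioned on the same
    low-resolution input and average them to suppress sampling noise."""
    samples = torch.stack([sample_fn(low_res) for _ in range(n_samples)], dim=0)
    return samples.mean(dim=0)

# Usage with a dummy sampler standing in for a trained conditional DDPM.
fake_sampler = lambda x: x + 0.1 * torch.randn_like(x)
restored = averaged_restoration(torch.rand(1, 1, 128, 128), fake_sampler)
```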
Computational Imaging for Long-Term Prediction of Solar Irradiance
arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI: arxiv-2409.12016
Leron Julian, Haejoon Lee, Soummya Kar, Aswin C. Sankaranarayanan
{"title":"Computational Imaging for Long-Term Prediction of Solar Irradiance","authors":"Leron Julian, Haejoon Lee, Soummya Kar, Aswin C. Sankaranarayanan","doi":"arxiv-2409.12016","DOIUrl":"https://doi.org/arxiv-2409.12016","url":null,"abstract":"The occlusion of the sun by clouds is one of the primary sources of\u0000uncertainties in solar power generation, and is a factor that affects the\u0000wide-spread use of solar power as a primary energy source. Real-time\u0000forecasting of cloud movement and, as a result, solar irradiance is necessary\u0000to schedule and allocate energy across grid-connected photovoltaic systems.\u0000Previous works monitored cloud movement using wide-angle field of view imagery\u0000of the sky. However, such images have poor resolution for clouds that appear\u0000near the horizon, which reduces their effectiveness for long term prediction of\u0000solar occlusion. Specifically, to be able to predict occlusion of the sun over\u0000long time periods, clouds that are near the horizon need to be detected, and\u0000their velocities estimated precisely. To enable such a system, we design and\u0000deploy a catadioptric system that delivers wide-angle imagery with uniform\u0000spatial resolution of the sky over its field of view. To enable prediction over\u0000a longer time horizon, we design an algorithm that uses carefully selected\u0000spatio-temporal slices of the imagery using estimated wind direction and\u0000velocity as inputs. Using ray-tracing simulations as well as a real testbed\u0000deployed outdoors, we show that the system is capable of predicting solar\u0000occlusion as well as irradiance for tens of minutes in the future, which is an\u0000order of magnitude improvement over prior work.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142263233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
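Extracting a spatio-temporal slice from a stack of sky images along an estimated wind direction, as described above, can be sketched with nearest-pixel sampling along a line through a chosen center. The sampling scheme and image-stack layout below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def spatio_temporal_slice(frames: np.ndarray, center: tuple,
                          direction_deg: float, length: int) -> np.ndarray:
    """Sample each frame of a (T, H, W) sky-image stack along a line through
    `center` oriented by the estimated wind direction, giving a (T, length) slice."""
    T, H, W = frames.shape
    theta = np.deg2rad(direction_deg)
    offsets = np.arange(length) - length // 2
    rows = np.clip(np.round(center[0] + offsets * np.sin(theta)).astype(int), 0, H - 1)
    cols = np.clip(np.round(center[1] + offsets * np.cos(theta)).astype(int), 0, W - 1)
    return frames[:, rows, cols]

# Usage on a dummy stack of 60 sky frames.
stack = np.random.rand(60, 480, 480).astype(np.float32)
slice_ = spatio_temporal_slice(stack, center=(240, 240), direction_deg=30.0, length=200)
```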