{"title":"Lisnet: A Covid-19 Lung Infection Segmentation Network Based on Edge Supervision and Multi-Scale Context Aggregation","authors":"Jing Wang, Bicao Li, Jie Huang, Miaomiao Wei, Mengxing Song, Zongmin Wang","doi":"10.1109/ICIP46576.2022.9897957","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897957","url":null,"abstract":"Corona Virus Disease 2019 (COVID-19) spread globally in early 2020, leading to a new health crisis. Automatic segmentation of lung infections from computed tomography (CT) images provides an important basis for rapid early diagnosis of COVID-19. In this paper, we propose an effective COVID-19 Lung Infection Segmentation Network (LISNet) based on edge supervision and multi-scale context aggregation. More specifically, an Edge Supervision module is introduced into the feature extraction part to compensate for the low contrast between lesions and normal tissues. In addition, a Multi-scale Feature Fusion module is added to improve the segmentation of lesions at different scales. Finally, a Context Aggregation module is used to aggregate high- and low-level features and generate global information. Experiments demonstrate that our method outperforms other state-of-the-art methods on the public COVID-19 CT segmentation dataset.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133185828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IID-NORD: A Comprehensive Intrinsic Image Decomposition Dataset","authors":"Diclehan Ulucan, Oguzhan Ulucan, M. Ebner","doi":"10.1109/ICIP46576.2022.9897456","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897456","url":null,"abstract":"The goal of intrinsic image decomposition is to recover low-level features of images. Most studies tend to consider only reflectance and shading, even though it is known that increasing the number of intrinsics is beneficial for many applications. Existing intrinsic image datasets are quite limited. In this study, a dataset is introduced to provide a comprehensive benchmark for the field of intrinsic image decomposition. IID-NORD contains a large number of scenes, and for each scene, ground-truth reflectance, shading, surface normal vectors, light vectors, and a depth map are provided to allow detailed decomposition. Moreover, diverse illuminants, viewing angles, and dynamic shadows are used to prevent any bias. To the best of our knowledge, IID-NORD is the most comprehensive dataset in the field of intrinsic image decomposition. IID-NORD will be available on the first author’s official webpage.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"520 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133240634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Feature Compression for the Object Tracking Task","authors":"R. Henzel, K. Misra, Tianying Ji","doi":"10.1109/ICIP46576.2022.9897802","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897802","url":null,"abstract":"In object tracking systems, clients often capture video, encode it, and transmit it to a server that performs the actual machine task. In this paper we propose an alternative architecture in which we instead transmit features to the server. Specifically, we partition the Joint Detection and Embedding (JDE) person tracking network into client- and server-side sub-networks and code the intermediate tensors, i.e., features. The features are compressed for transmission using a Deep Neural Network (DNN) we design and train specifically for carrying out the tracking task. The DNN uses trainable non-uniform quantizers, conditional probability estimators, and hierarchical coding, concepts that have previously been used in neural-network-based image and video compression. Additionally, the DNN includes a novel parameterized dual-path layer that comprises an autoencoder in one path and a convolution layer in the other. The tensors output by the two paths are added before being consumed by subsequent layers. The parameter value for this dual-path layer controls the output channel count and, correspondingly, the bitrate of the transmitted bitstream. We demonstrate that our model improves coding efficiency by 43.67% over the state-of-the-art Versatile Video Coding standard, which codes the source video in the pixel domain.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131460751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cu-Net: Towards Continuous Multi-Class Contour Detection for Retinal Layer Segmentation In Oct Images","authors":"Ashuta Bhattarai, C. Kambhamettu, Jing Jin","doi":"10.1109/ICIP46576.2022.9897516","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897516","url":null,"abstract":"Recent deep learning-based contour detection studies show high accuracy in single-class boundary detection problems. However, this performance does not translate well to multi-class scenarios where continuous contours are required. Our research presents CU-Net, a U-Net-based network with residual-net encoders which can produce accurate and uninterrupted contour lines for multiple classes. The critical factor behind this concept is our continuity module, containing an interpolation layer and a novel activation function that converts discrete signals into smooth contours. Our approach finds application in medical imaging problems such as retinal layer segmentation from optical coherence tomography (OCT) scans. We applied our method to an expert-annotated OCT dataset of children with sickle-cell disease. To compare with benchmarks, we evaluated our network on the DME and HC-MS datasets. We achieved overall mean absolute distances of 6.48 ± 2.04 µm and 1.97 ± 0.89 µm, respectively, 1.03 and 1.4 times lower than the current state of the art.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128906317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Sky to Sky Interpolation for Radio Interferometric Imaging","authors":"Nicolas Monnier, F. Orieux, N. Gac, C. Tasse, E. Raffin, D. Guibert","doi":"10.1109/ICIP46576.2022.9897317","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897317","url":null,"abstract":"Reconstruction of radio interferometric images requires processing data in Fourier space that do not have regular coordinates, preventing direct use of the Fast Fourier Transform. The most common solution is to rely on interpolation algorithms, called gridding, which are computationally expensive. In this paper, we propose an algorithmic reinterpretation, named the sky to sky method, to reduce the computation cost of the gridding operation and its adjoint, the degridding, when used successively. We analyze the impact of the interpolation step size on computation cost and reconstruction error. We also illustrate this optimization on a full reconstruction with gradient descent and the CLEAN algorithm. Finally, we obtain acceleration factors between 1.2 and 16.4 without additional approximation.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131331602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context Relation Fusion Model for Visual Question Answering","authors":"Haotian Zhang, Wei Wu","doi":"10.1109/ICIP46576.2022.9897563","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897563","url":null,"abstract":"Traditional VQA models tend to rely on language priors as a shortcut to answer questions and neglect visual information. To solve this problem, the latest approaches divide language priors into \"good\" language context and \"bad\" language bias using global features, so as to exploit the language context and suppress the language bias. However, language priors cannot be precisely divided by global features. In this paper, we propose a novel Context Relation Fusion Model (CRFM), which produces comprehensive contextual features, forcing the VQA model to distinguish language priors more carefully into \"good\" language context and \"bad\" language bias. Specifically, we utilize the Visual Relation Fusion Model (VRFM) and Question Relation Fusion Model (QRFM) to learn local critical contextual information and then perform information enhancement through the Attended Features Fusion Model (AFFM). Experiments show that our CRFM achieves state-of-the-art performance on the VQA-CP v2 dataset.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"199 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127401213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trajectory-Based Pattern of Life Analysis","authors":"Hua-mei Chen, Erik Blasch, Nichole Sullivan, Genshe Chen","doi":"10.1109/ICIP46576.2022.9897585","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897585","url":null,"abstract":"The large amounts of movement data collected from various sources such as wide area motion imagery (WAMI) and mobile phone apps call for innovative technologies to infer valuable information from these data. In this paper, we present two such tools to extract pattern of life (PoL) information from vehicle trajectories. The first tool, intersection traffic analysis (ITA), detects abnormal traffic patterns at major street intersections, while the second, frequent trajectory patterns analysis (FTPA), discerns the most frequent trajectory patterns in a given time interval in the region of concern. Both tools support comprehensive trajectory-based pattern of life situation awareness. Experiments using both simulated trajectories and measurements extracted from WAMI imagery demonstrate the utility of ITA and FTPA.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133758514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practical Bulk Denoising Of Large Binary Images","authors":"Ignacio Ramírez Paulino","doi":"10.1109/ICIP46576.2022.9897678","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897678","url":null,"abstract":"This paper explores the problem of removing real, non-simulated noise from a large body of large binary images. Three denoising methods are evaluated for their efficacy and speed: the well-known DUDE, a novel variant of it which we call the Quorum Denoiser, and an adaptation of the Non-Local Means (NLM) method for binary images, B-NLM, which, to our knowledge, is faster than other known variants. The methods are compared and tested both on simulated noise (as a benchmark) and on real-life images. All three methods produce good results on real noise. However, despite being optimized, B-NLM is significantly slower than the other two, whose speeds are comparable to that of a plain median filter. Overall, the Quorum Denoiser appears to be the best option in both quality (real and simulated) and speed.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"212 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133787260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two Distillation Perspectives Based on Tanimoto Coefficient","authors":"Hongqiao Shu","doi":"10.1109/ICIP46576.2022.9897375","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897375","url":null,"abstract":"Knowledge distillation is a process which uses a complex teacher model to guide the training of a smaller student model. The output from the teacher model’s last hidden layer is commonly used as knowledge. This paper proposes a novel method for using this knowledge to guide the student model. The Tanimoto coefficient is used to measure the length and angle information of a sample pair. Knowledge distillation is conducted from two perspectives. The first perspective is to calculate a Tanimoto similarity matrix over every training sample pair within a batch for the teacher model, and then use this matrix to guide the student model. The second perspective is to calculate a Tanimoto diversity between the teacher model and the student model for every training sample and minimize this diversity. On the FOOD101 and VOC2007 datasets, the top-1 accuracy and mAP obtained by our method are higher than those of existing distillation methods.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133731424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The BRIO-TA Dataset: Understanding Anomalous Assembly Process in Manufacturing","authors":"Kosuke Moriwaki, Gaku Nakano, Tetsuo Inoshita","doi":"10.1109/ICIP46576.2022.9897369","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897369","url":null,"abstract":"In this paper, we introduce a new video dataset for action segmentation, the BRIO-TA (BRIO Toy Assembly) dataset, which is designed to simulate operations in factory assembly. In contrast to existing datasets, BRIO-TA consists of two types of scenarios: normal work processes and anomalous work processes. Anomalies are further categorized into incorrect processes, omissions, and abnormal durations. The subjects in the videos are asked to perform either normal work or one of the three anomalies, and all video frames are manually annotated into 23 action classes. In addition, we propose a new metric called anomaly section accuracy (ASA) for evaluating the detection accuracy of anomalous segments in a video. With the new dataset and metric, we report that state-of-the-art methods show significantly low ASA even though they perform well on normal work segments. Demo videos are available at https://github.com/Tarmo-moriwaki/BRIO-TA_sample and the full dataset will be released after publication.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115420169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}