Density-Based Vehicle Counting with Unsupervised Scale Selection
P. Dobes, Jakub Špaňhel, Vojtech Bartl, Roman Juránek, A. Herout
2020 Digital Image Computing: Techniques and Applications (DICTA), 29 November 2020. DOI: 10.1109/DICTA51227.2020.9363401

Abstract: A significant hurdle within any counting task is the variance in the scale of the objects to be counted. While some degree of size variation can be induced by perspective distortion, far more severe scale differences can easily occur, e.g. in images taken by a drone from different elevations above the ground. The aim of our work is to overcome this issue while leveraging only lightweight dot annotations and a minimum level of training supervision. We propose a modification to the Stacked Hourglass network which enables the model to process multiple input scales and to automatically select the most suitable candidate using a quality score. We alter the training procedure to enable learning of the quality scores without supervising them directly, and thus without requiring any additional annotation effort. We evaluate our method on three standard datasets: PUCPR+, TRANCOS and CARPK. The obtained results are on par with current state-of-the-art methods while being more robust to significant variations in input scale.
Dual-Stage Domain Adaptive Mitosis Detection for Histopathology Images
Veena Dodballapur, Yang Song, Heng Huang, Mei Chen, Wojciech Chrzanowski, Weidong (Tom) Cai
2020 Digital Image Computing: Techniques and Applications (DICTA), 29 November 2020. DOI: 10.1109/DICTA51227.2020.9363411

Abstract: Histopathology images for mitosis detection vary in appearance due to non-standard tissue preparation methods as well as differences in scanner hardware. This makes automatic machine-learning-based mitosis detection very challenging because of domain shift between the training and testing datasets. In this paper, we propose a method for addressing this domain shift problem using a two-stage domain adaptive neural network. In the first stage, we use a domain adaptive Mask R-CNN to generate masks for mitotic regions. The generated masks are then used by a second domain adaptive convolutional neural network to perform finer mitosis detection. Our method achieved state-of-the-art performance on both the ICPR 2012 and 2014 datasets. We demonstrate that a domain-agnostic approach achieves better generalization and mitotic cell localization for the trained models.
{"title":"Using Environmental Context to Synthesis Missing Pixels","authors":"Thaer F. Ali, A. Woodley","doi":"10.1109/DICTA51227.2020.9363419","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363419","url":null,"abstract":"Satellites have proven to be a technology that can help in a variety of environmental and human development contexts. However, at times some pixels in the satellite images are not captured. These uncaptured pixels are called missing pixels. Having these missing pixels means that important data for research and satellite imagery-based applications is lost. Therefore, people have developed pixel synthesis methods. This paper presents a new pixel synthesis method called the Iterative Self-Organizing Data Analysis Techniques Algorithm - Integration of Geostatistical and Temporal Missing Pixels' Properties (ISODATA-IGTMPP). The method is built upon the Integration of Geostatistical and Temporal Missing Pixels' Properties (IG TMPP) method and adds a seminal clustering technique called the Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA). The clustering technique allows a new way of predicting the missing pixel from their environmental class with benefit of the spatial and temporal properties. Here, the ISODATA-IGTMPP method was tested on the Spatial-Temporal Change in the Environment Context (STCEC) dataset and was compared with results of four missing pixel predicting methods. The method shows the best performing results and preforms very well across different environment types.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116324707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pseudo Supervised Solar Panel Mapping based on Deep Convolutional Networks with Label Correction Strategy in Aerial Images","authors":"Jue Zhang, X. Jia, Jiankun Hu","doi":"10.1109/DICTA51227.2020.9363379","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363379","url":null,"abstract":"Solar panel mapping has gained a rising interest in renewable energy field with the aid of remote sensing imagery. Significant previous work is based on fully supervised learning with classical classifiers or convolutional neural networks (CNNs), which often require manual annotations of pixel-wise ground-truth to provide accurate supervision. Weakly supervised methods can accept image-wise annotations which can help reduce the cost for pixel-level labelling. Inevitable performance gap, however, exists between weakly and fully supervised methods in mapping accuracy. To address this problem, we propose a pseudo supervised deep convolutional network with label correction strategy (PS-CNNLC) for solar panels mapping. It combines the benefits of both weak and strong supervision to provide accurate solar panel extraction. First, a convolutional neural network is trained with positive and negative samples with image-level labels. It is then used to automatically identify more positive samples from randomly selected unlabeled images. The feature maps of the positive samples are further processed by gradient-weighted class activation mapping to generate initial mapping results, which are taken as initial pseudo labels as they are generally coarse and incomplete. A progressive label correction strategy is designed to refine the initial pseudo labels and train an end-to-end target mapping network iteratively, thereby improving the model reliability. Comprehensive evaluations and ablation study conducted validate the superiority of the proposed PS-CNNLC.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126541107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vis-CRF: A Simplified Filtering Model for Vision","authors":"Nasim Nematzadeh, D. Powers, T. Lewis","doi":"10.1109/DICTA51227.2020.9363403","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363403","url":null,"abstract":"Over the last decade, a variety of new neurophysiological experiments have deepened our understanding about retinal cells functionality, leading to new insights as to how, when and where retinal processing takes place, and the nature of the retinal representation and encoding sent to the cortex for further processing. Based on these neurobiological discoveries, we provide computer simulation evidence to suggest that Geometrical illusions are explained in part, by the interaction of multiscale visual processing performed in the retina supporting previous studies [1, 2]. The output of our retinal stage model, named Vis-CRF which is a filtering vision model is presented here for a sample of Café Wall pattern and for an illusory pattern, in which the final percept arises from multiple scale processing of Difference of Gaussians (DoG) and the perceptual interaction of foreground and background elements.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127558557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M2-Net: A Multi-scale Multi-level Feature Enhanced Network for Object Detection in Optical Remote Sensing Images
X. Ye, Fengchao Xiong, Jianfeng Lu, Haifeng Zhao, Jun Zhou
2020 Digital Image Computing: Techniques and Applications (DICTA), 29 November 2020. DOI: 10.1109/DICTA51227.2020.9363420

Abstract: Object detection in remote sensing images is a challenging task due to diversified orientation, complex background, dense distribution and scale variation of objects. In this paper, we tackle this problem by proposing a novel multi-scale multi-level feature enhanced network (M2-Net) that integrates a Feature Map Enhancement (FME) module and a Feature Fusion Block (FFB) into Rotational RetinaNet. The FME module aims to enhance weak features by factorizing the convolutional operation into two similar branches instead of a single branch, which helps broaden the receptive field with fewer parameters. This module is embedded into different layers in the backbone network to capture multi-scale semantic and location information for detection. The FFB module is used to shorten the information propagation path between low-level high-resolution features in shallow layers and high-level semantic features in deep layers, facilitating more effective feature fusion and object detection, especially for small objects. Experimental results on three benchmark datasets show that our method not only outperforms many one-stage detectors but also achieves competitive accuracy with lower time cost than two-stage detectors.
{"title":"Automatic Assessment of Open Street Maps Database Quality using Aerial Imagery","authors":"Boris Repasky, Timothy Payne, A. Dick","doi":"10.1109/DICTA51227.2020.9363412","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363412","url":null,"abstract":"Open data initiatives such as OpenStreetMap (OSM) are a powerful crowd sourced approach to data collection. However due to their crowd-sourced nature the quality of the database heavily depends on the enthusiasm and determination of the public. We propose a novel method based on variational autoencoder generative adversarial networks (VAE-GAN) together with an information theoretic measure of database quality based on the expected discrimination information between the original image and labels generated from OSM data. Experiments on overhead aerial imagery and segmentation masks generated from OSM data show that our proposed discrimination information measure is a promising measure to regional database quality in OSM.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116458686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HCI for Elderly, Measuring Visual Complexity of Webpages Based on Machine Learning","authors":"Zahra Sadeghi, E. Homayounvala, M. Borhani","doi":"10.1109/DICTA51227.2020.9363381","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363381","url":null,"abstract":"The increasing number of elderly persons, aged 65 and over, highlights the problem of improving their experience with computers and the web considering their preferences and needs. Elderlies' skills like cognitive, haptic, visual, and motor skills are reduced by age. The visual complexity of web pages has a major influence on the quality of user experience of elderly users according to their reduced abilities. Therefore, it is quite beneficial if the visual complexity of web pages could be measured and reduced in applications and websites which are designed for them. In this way a personalized less complex version of the website could be provided for older users. In this article, a new approach for measuring the visual complexity is proposed by using both Human-Computer Interaction (HCI) and machine learning methods. Six features are considered for complexity measurements. Experimental results demonstrated that the trained proposed machine learning approach increases the accuracy of classification of applications and websites based on their visual complexity up to 82% which is more than its competitors. Besides, a feature selection algorithm indicates that features such as clutter and equilibrium were selected to have the most influence on the classification of webpages based on their visual complexity.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124005656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UNET-Based Multi-Task Architecture for Brain Lesion Segmentation","authors":"Ava Assadi Abolvardi, Len Hamey, K. Ho-Shon","doi":"10.1109/DICTA51227.2020.9363397","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363397","url":null,"abstract":"Image segmentation is the task of extracting the region of interest in images and is one of the main applications of computer vision in the medical domain. Like other computer vision tasks, deep learning is the main solution to image segmentation problems. Deep learning methods are data-hungry and need a huge amount of data for training. On the other side, data shortage is always a problem, especially in the medical domain. Multi-task learning is a technique which helps the deep model to learn better representation from data distribution by introducing related auxiliary tasks. In this study, we investigate a research question to whether it is better to provide this auxiliary information as an input to the network, or is it better to use this task and design a multi-output network. Our findings suggest that however, the multi-output manner improves the overall performance, but the best result achieves when this extra information serves as auxiliary input information.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125870631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Pneumoconiosis Detection on Chest X-Rays Using Cascaded Learning with Real and Synthetic Radiographs
Dadong Wang, Y. Arzhaeva, Liton Devnath, Maoying Qiao, Saeed K. Amirgholipour, Qiyu Liao, R. McBean, J. Hillhouse, S. Luo, David Meredith, K. Newbigin, Deborah Yates
2020 Digital Image Computing: Techniques and Applications (DICTA), 29 November 2020. DOI: 10.1109/DICTA51227.2020.9363416

Abstract: Pneumoconiosis is an incurable respiratory disease caused by long-term inhalation of respirable dust. Due to the low incidence of pneumoconiosis and restrictions on sharing patient data, the number of available pneumoconiosis X-rays is insufficient, which introduces significant challenges for training deep learning models. In this paper, we use both real and synthetic pneumoconiosis radiographs to train a cascaded machine learning framework for automated detection of pneumoconiosis, comprising a machine-learning-based pixel classifier for lung field segmentation, Cycle-Consistent Adversarial Networks (CycleGAN) for generating abundant lung field images for training, and a Convolutional Neural Network (CNN) based image classifier. Experiments compare the classification results of several state-of-the-art machine learning models with ours. Our proposed model outperforms the others, achieving an overall classification accuracy of 90.24%, a specificity of 88.46% and an excellent sensitivity of 93.33% for detecting pneumoconiosis.