{"title":"An improved method for pylon extraction and vegetation encroachment analysis in high voltage transmission lines using LiDAR data","authors":"Nosheen Munir, M. Awrangjeb, Bela Stantic","doi":"10.1109/DICTA51227.2020.9363391","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363391","url":null,"abstract":"The maintenance of high-voltage power lines rights-of-way due to vegetation intrusions is important for electric power distribution companies for safe and secure delivery of electricity. However, the monitoring becomes more challenging if power line corridor (PLC) exists in complex environment such as mountainous terrains or forests. To overcome these challenges, this paper aims to provide an automated method for extraction of individual pylons and monitoring of vegetation near the PLC in hilly terrain. The proposed method starts off by dividing the large dataset into small manageable datasets. A voxel grid is formed on each dataset to separate power lines from pylons and vegetation. The power line points are converted into a binary image to get the individual spans. These span points are used to find nearby vegetation and pylon points and individual pylons and vegetation are further separated using a statistical analysis. Finally, the height and location of extracted vegetation with reference to power lines are estimated and separated into danger and clearance zones. The experiment on two large Australian datasets shows that the proposed method provides high completeness and correctness of 96.5% and 99% for pylons, respectively. Moreover, the growing vegetation beneath and around the PLC that can harm the power lines is identified.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124847170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of U-Net CNN Approaches for Human Neck MRI Segmentation","authors":"A. Suman, Yash Khemchandani, Md. Asikuzzaman, A. Webb, D. Perriman, M. Tahtali, M. Pickering","doi":"10.1109/DICTA51227.2020.9363385","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363385","url":null,"abstract":"The segmentation of neck muscles is useful for the diagnoses and planning of medical interventions for neck pain-related conditions such as whiplash and cervical dystonia. Neck muscles are tightly grouped, have similar appearance to each other and display large anatomical variability between subjects. They also exhibit low contrast with background organs in magnetic resonance (MR) images. These characteristics make the segmentation of neck muscles a challenging task. Due to the significant success of the U-Net architecture for deep learning-based segmentation, numerous versions of this approach have emerged for the task of medical image segmentation. This paper presents an evaluation of 10 U-Net CNN approaches, 6 direct (U-Net, CRF-Unet, A-Unet, MFP-Unet, R2Unet and U-Net++) and 4 modified (R2A-Unet, R2A-Unet++, PMS-Unet and MS-Unet). The modifications are inspired by recent multi-scale and multi-stream techniques for deep learning algorithms. T1 weighted axial MR images of the neck, at the distal end of the C3 vertebrae, from 45 subjects with real-time data augmentation were used in our evaluation of neck muscle segmentation approaches. The analysis of our numerical results indicates that the R2Unet architecture achieves the best accuracy.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128756122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SL3D - Single Look 3D Object Detection based on RGB-D Images","authors":"G. Erabati, Helder Araújo","doi":"10.1109/DICTA51227.2020.9363404","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363404","url":null,"abstract":"We present SL3D, Single Look 3D object detection approach to detect the 3D objects from the RGB-D image pair. The approach is a proposal free, single-stage 3D object detection method from RGB-D images by leveraging multi-scale feature fusion of RGB and depth feature maps, and multi-layer predictions. The method takes pair of RGB and depth images as an input and outputs predicted 3D bounding boxes. The neural network SL3D, comprises of two modules: multi-scale feature fusion and multi-layer prediction. The multi-scale feature fusion module fuses the multi-scale features from RGB and depth feature maps, which are later used by the multi-layer prediction module for 3D object detection. Each location of prediction layer is attached with a set of predefined 3D prior boxes to account for varying shapes of 3D objects. The output of the network regresses the predicted 3D bounding boxes as an offset to the set of 3D prior boxes and duplicate 3D bounding boxes are removed by applying 3D non-maximum suppression. The network is trained end-to-end on publicly available SUN RGB-D dataset. The SL3D approach with ResNeXt50 achieves 31.77 mAP on SUN RGB-D test dataset with an inference speed of approximately 4 fps, and with MobileNetV2, it achieves approximately 15 fps with a reduction of around 2 mAP. The quantitative results show that the proposed method achieves competitive performance to state-of-the-art methods on SUN RGB-D dataset with near real-time inference speed.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125629192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CNN to Capsule Network Transformation","authors":"Takumi Sato, K. Hotta","doi":"10.1109/DICTA51227.2020.9363395","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363395","url":null,"abstract":"Capsule Network has been recently proposed which outperforms CNN in specific tasks. Due to the network architecture differences between Capsule Network and CNN, Capsule Network could not use transfer learning which is very frequently used in CNN. In this paper, we propose a transfer learning method which can easily transfer CNN to Capsule Network. We achieved by stacking pre-trained CNN and used the proposed capsule random transformer to interact individual CNN each other which will form a Capsule Network. We applied this method to U-net and achieved to create a capsule based method that has similar accuracy compared to U-net. We show the results on cell segmentation dataset. Our capsule network successfully archives higher accuracy compared to other Capsule Network based semantic segmentation methods.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134459595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual image and mask synthesis with GANs for semantic segmentation in optical coherence tomography","authors":"J. Kugelman, D. Alonso-Caneiro, Scott A. Read, Stephen J. Vincent, F. Chen, M. Collins","doi":"10.1109/DICTA51227.2020.9363402","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363402","url":null,"abstract":"In recent years, deep learning-based OCT segmentation methods have addressed many of the limitations of traditional segmentation approaches and are capable of performing rapid, consistent and accurate segmentation of the chorio-retinal layers. However, robust deep learning methods require a sufficiently large and diverse dataset for training, which is not always feasible in many biomedical applications. Generative adversarial networks (GANs) have demonstrated the capability of producing realistic and diverse high-resolution images for a range of modalities and datasets, including for data augmentation, a powerful application of GAN methods. In this study we propose the use of a StyleGAN inspired approach to generate chorio-retinal optical coherence tomography (OCT) images with a high degree of realism and diversity. We utilize the method to synthesize image and segmentation mask pairs that can be used to train a deep learning semantic segmentation method for subsequent boundary delineation of three chorioretinal layer boundaries. By pursuing a dual output solution rather than a mask-to-image translation solution, we remove an unnecessary constraint on the generated images and enable the synthesis of new unseen area mask labels. The results are encouraging with near comparable performance observed when training using purely synthetic data, compared to the real data. Moreover, training using a combination of real and synthetic data results in zero measurable performance loss, further demonstrating the reliability of this technique and feasibility for data augmentation in future work.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127432971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Base-Package Recommendation Framework Based on Consumer Behaviours in IPTV Platform","authors":"Kuruparan Shanmugalingam, Ruwinda Ranganayake, Chanaka Gunawardhana, Rajitha Navarathna","doi":"10.1109/DICTA51227.2020.9363400","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363400","url":null,"abstract":"Internet Protocol TeleVision (IPTV) provides many services such as live television streaming, time-shifted media, and Video On Demand (VOD). However, many customers do not engage properly with their subscribed packages due to a lack of knowledge and poor guidance. Many customers fail to identify the proper IPTV service package based on their needs and to utilise their current package to the maximum. In this paper, we propose a base-package recommendation model with a novel customer scoring-meter based on customers behaviour. Initially, our paper describes an algorithm to measure customers engagement score, which illustrates a novel approach to track customer engagement with the IPTV service provider. Next, the content-based recommendation system, which uses vector representation of subscribers and base packages details is described. We show the significance of our approach using local IPTV service provider data set qualitatively. The proposed approach can significantly improve user retention, long term revenue and customer satisfaction.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125813890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recurrent Motion Neural Network for Low Resolution Drone Detection","authors":"Hamish Pratt, B. Evans, T. Rowntree, I. Reid, S. Wiederman","doi":"10.1109/DICTA51227.2020.9363377","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363377","url":null,"abstract":"Drones are becoming increasingly prevalent in everyday usage with many commercial applications in fields such as construction work and agricultural surveying. Despite their common commercial use, drones have been recently used with malicious intent, such as airline disruptions at Gatwick Airport. With the emerging issue of safety concerns for the public and other airspace users, detecting and monitoring active drones in an area is crucial. This paper introduces a recurrent convolutional neural network (CNN) specifically designed for drone detection. This CNN can detect drones from down-sampled images by exploiting the temporal information of drones in flight and outperforms a state-of-the-art conventional object detector. Due to the lightweight and low resolution nature of this network, it can be mounted on a small processor and run at near real-time speeds.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128902836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualizing and Understanding Inherent Image Features in CNN-based Glaucoma Detection","authors":"Dhaval Vaghjiani, Sajib Saha, Yann Connan, Shaun Frost, Y. Kanagasingam","doi":"10.1109/DICTA51227.2020.9363369","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363369","url":null,"abstract":"Convolutional neural network (CNN)-based methods have achieved state-of-the-art performance in glaucoma detection. Despite this, these methods are often criticized for offering no opportunity to understand how classification decisions are made. In this paper, we develop an innovative visualization strategy that allows the inherent image features contributing to glaucoma detection at different CNN layers to be understood. We also develop a set of interpretable notions to better comprehend the contributing image features involved in the disease detection process. Extensive experiments are conducted on publicly available glaucoma datasets. Results show that the optic cup is the most influential ocular component for glaucoma detection (overall Intersection over Union (IoU) score of 0.18), followed by the neuro-retinal rim (NR) with IoU score 0.17. With an overall IoU score of 0.16 vessels in the photograph also play a considerable role in the disease detection.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122545088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attentive Inception Module based Convolutional Neural Network for Image Enhancement","authors":"Purbaditya Bhattacharya, U. Zölzer","doi":"10.1109/DICTA51227.2020.9363375","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363375","url":null,"abstract":"In this paper, the problem of image enhancement in the form of single image superresolution and compression artifact reduction is addressed by proposing a convolutional neural network with an inception module containing an attention mechanism. The inception module in the network contains parallel branches of convolution layers employing filters with multiple receptive fields via filter dilation. The aggregated multi-scale features are subsequently filtered via an attention mechanism which allows learned feature map weighting in order to reduce redundancy. Additionally, a long skip attentive connection is also introduced in order to process the penultimate feature layer of the proposed network. Addition of the aforementioned attention modules introduce a dynamic nature to the model which would otherwise consist of static trained filters. Experiments are performed with multiple network depths and architectures in order to assess their contributions. The final network is evaluated on the benchmark datasets for the aforementioned tasks, and the results indicate a very good performance.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133986203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network-based structure flow estimation","authors":"Shu Liu, Nick Barnes, R. Mahony, Haolei Ye","doi":"10.1109/DICTA51227.2020.9363398","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363398","url":null,"abstract":"Structure flow is a novel three-dimensional motion representation that differs from scene flow in that it is directly associated with image change. Due to its close connection with both optical flow and divergence in images, it is well suited to estimation from monocular vision. To acquire an accurate measurement of structure flow, we design a method that employs the spatial pyramid structure and the network-based method. We investigate the current motion field datasets and validate the performance of our method by comparing its two-dimensional component of motion field with the previous works. In general, we experimentally show two conclusions: 1. Our motion estimator employs only RGB images and outperforms the previous work that utilizes RGB-D images. 2. The estimated structure flow map is a more effective representation for demonstrating the motion field compared with the widely-accepted scene flow via monocular vision.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"200 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124488334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}