{"title":"A New Approach using Characteristic Video Signals to Improve the Stability of Manufacturing Processes","authors":"Frederic Ringsleben, Maik Benndorf, T. Haenselmann, R. Boiger, Manfred Mücke, M. Fehr, Dirk Motthes","doi":"10.1109/DICTA.2018.8615860","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615860","url":null,"abstract":"Observing production processes is a typical task for sensors in industrial environments. This paper deals with the use of camera systems as a sensor array to compare similar production processes with one another. The aim is to detect anomalies in production processes, such as the motion of robots or the flow of liquids. Since the comparison of high-resolution and long videos is very resource-intensive, we propose clustering the video into areas and shots. Therefore, we suggest interpreting each pixel of a video as a signal varying in time. In order to do that without any background knowledge and to be useful for any production environment with motion involved, we use an unsupervised clustering procedure. We show three different preprocessing approaches to avoid faulty clustering of static image areas and those relevant for the production and finally compare the results.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131646013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Blur Kernel Estimation Model with Combined Constraints for Blind Image Deblurring","authors":"Ying Liao, Weihong Li, Jinkai Cui, W. Gong","doi":"10.1109/DICTA.2018.8615815","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615815","url":null,"abstract":"This paper proposes a blur kernel estimation model based on combined constraints involving both image and blur kernel constraints for blind image deblurring. We adopt L0 regularization term for constraining image gradient and dark channel of image gradient to protect image strong edges and suppress noise in image, and use L2 regularization term as hybrid constraints for blur kernel and its gradient to preserve blur kernel's sparsity and continuity respectively. In combined constraints, the constrained dark channel of image gradient, which is a dark channel prior, can also effectively help blind image deblurring in various scenarios, such as natural, face and text images. Moreover, we introduce a half-quadratic splitting optimization algorithm for solving the proposed model. We conduct extensive experiments and results demonstrate that the proposed method can better estimate blur kernel and achieve better visual quality of image deblurring on both synthetic and real-life blurred images.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130608336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Caption Generator with Novel Object Injection","authors":"Mirza Muhammad Ali Baig, Mian Ihtisham Shah, Muhammad Abdullah Wajahat, Nauman Zafar, Omar Arif","doi":"10.1109/DICTA.2018.8615810","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615810","url":null,"abstract":"Image captioning is a field within artificial intelligence that is progressing rapidly and it has a lot of potentials. A major problem when working in this field is the limited amount of data that is available to us as is. The only dataset considered suitable enough for the task is the Microsoft: Common Objects in Context (MSCOCO) dataset, which contains about 120,000 training images. This covers about 80 object classes, which is an insufficient amount if we want to create robust solutions that aren't limited to the constraints of the data at hand. In order to overcome this problem, we propose a solution that incorporates Zero-Shot Learning concepts in order to identify unknown objects and classes by using semantic word embeddings and existing state-of-the-art object identification algorithms. Our proposed model, Image Captioning using Novel Word Injection, uses a pre-trained caption generator and works on the output of the generator to inject objects that are not present in the dataset into the caption. We evaluate the model on standardized metrics, namely, BLEU, CIDEr and ROUGE-L. The results, qualitatively and quantitatively, outperform the underlying model.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120848108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lane Detection Under Adverse Conditions Based on Dual Color Space","authors":"Nima Zarbakht, J. Zou","doi":"10.1109/DICTA.2018.8615785","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615785","url":null,"abstract":"A high level of situational awareness is essential to an advanced driver assistance system. One of the most important duties of such a system is the detection of lane markings on the road and to distinguish them from the road and other objects such as shadows, traffic, etc. A robust lane detection algorithm is critical to a lane departure warning system. It must determine the relative lane position reliably and rapidly using captured images. The available literature provides some methods to solve problems associated with adverse conditions such as precipitation, glare and blurred lane markings. However, the reliability of these methods can be adversely affected by the lighting conditions. In this paper, a new method is proposed that combines two distinct color spaces to reduce interference in a pre-processing step. The method is adaptive to different lighting situations. The directional gradient is used to detect the lane marking edges. The method can detect lane markings with different complexities imposed by shadows, rain, reflection, strong sources of light such as headlights and tail lights.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129661082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MS-GAN: GAN-Based Semantic Segmentation of Multiple Sclerosis Lesions in Brain Magnetic Resonance Imaging","authors":"C. Zhang, Yang Song, Sidong Liu, S. Lill, Chenyu Wang, Zihao Tang, Yuyi You, Yang Gao, A. Klistorner, M. Barnett, Weidong (Tom) Cai","doi":"10.1109/DICTA.2018.8615771","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615771","url":null,"abstract":"Automated segmentation of multiple sclerosis (MS) lesions in brain imaging is challenging due to the high variability in lesion characteristics. Based on the generative adversarial network (GAN), we propose a semantic segmentation framework MS-GAN to localize MS lesions in multimodal brain magnetic resonance imaging (MRI), which consists of one multimodal encoder-decoder generator G and multiple discriminators D corresponding to the multiple input modalities. For the design of the generator, we adopt an encoder-decoder deep learning architecture with bypass of spatial information from encoder to the corresponding decoder, which helps to reduce the network parameters while improving the localization performance. Our generator is also designed to integrate multimodal imaging data in end-to-end learning with multi-path encoding and cross-modality fusion. An additional classification-related constraint is proposed for the adversarial training process of the GAN model, with the aim of alleviating the hard-to-converge issue in classification-based image-to-image translation problems. For evaluation, we collected a database of 126 cases from patients with relapsing MS. We also experimented with other semantic segmentation models as well as patch-based deep learning methods for performance comparison. The results show that our method provides more accurate segmentation than the state-of-the-art techniques.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130664531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Restoration Based on Deep Convolutional Network in Wavefront Coding Imaging System","authors":"Haoyuan Du, Liquan Dong, Ming Liu, Yuejin Zhao, W. Jia, Xiaohua Liu, Mei Hui, Lingqin Kong, Q. Hao","doi":"10.1109/DICTA.2018.8615824","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615824","url":null,"abstract":"Wavefront coding (WFC) is a prosperous technology for extending depth of field (DOF) in the incoherent imaging system. Digital recovery of the WFC technique is a classical ill-conditioned problem by removing the blurring effect and suppressing the noise. Traditional approaches relying on image heuristics suffer from high frequency noise amplification and processing artifacts. This paper investigates a general framework of neural networks for restoring images in WFC. To our knowledge, this is the first attempt for applying convolutional networks in WFC. The blur and additive noise are considered simultaneously. Two solutions respectively exploiting fully convolutional networks (FCN) and conditional Generative Adversarial Networks (CGAN) are presented. The FCN based on minimizing the mean squared reconstruction error (MSE) in pixel space gets high PSNR. On the other side, the CGAN based on perceptual loss optimization criterion retrieves more textures. We conduct comparison experiments to demonstrate the performance at different noise levels from the training configuration. We also reveal the image quality on non-natural test target image and defocused situation. The results indicate that the proposed networks outperform traditional approaches for restoring high frequency details and suppressing noise effectively.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133891562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Absolute and Relative Pose Estimation of a Multi-View Camera System using 2D-3D Line Pairs and Vertical Direction","authors":"Hichem Abdellali, Z. Kato","doi":"10.1109/DICTA.2018.8615792","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615792","url":null,"abstract":"We propose a new algorithm for estimating the absolute and relative pose of a multi-view camera system. The algorithm relies on two solvers: a direct solver using a minimal set of 6 line pairs and a least squares solver which uses all inlier 2D-3D line pairs. The algorithm have been validated on a large synthetic dataset, experimental results confirm the stable and real-time performance under realistic noise on the line parameters as well as on the vertical direction. Furthermore, the algorithm performs well on real data with less then half degree rotation error and less than 25 cm translation error.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133297758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Analytics for Train Crowd Estimation","authors":"Choon Giap Goh, Wee Han Lim, Justus Chua, I. Atmosukarto","doi":"10.1109/DICTA.2018.8615794","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615794","url":null,"abstract":"Overcrowding is a common problem faced by train commuters in many countries. While waiting for the train at the stations, commuters tend to cluster and queue at doors that are closest to escalators and elevators that lead towards the station entrances and exits. This scenario results in trains not being fully utilized in terms of their capacity. As cabins with certain door positions tend to be more crowded than the rest of the cabins. The objective of this paper is to provide a methodology to estimate the crowd density within cabins of incoming trains, while leveraging on the existing train CCTV infrastructures. Providing the train cabin density information to commuters who are waiting for the incoming train allows the commuters to better select which cabin to board based on the provided density information. This will facilitate a better commuting experience without incurring a high cost for the train operator. To achieve this objective, we have adopted the usage of deep convolutional neural networks to analyze the footage from the existing security camera inside the trains and classify the images frames based the crowd level of train cabins. Three different experiments were conducted to train and test different convolutional neural network models. All models are able to make classification with an accuracy rate of over 90%.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130177876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Military Vehicle Detection from Low-Altitude Aerial Images","authors":"F. Kamran, M. Shahzad, F. Shafait","doi":"10.1109/DICTA.2018.8615865","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615865","url":null,"abstract":"Detection and identification of military vehicles from aerial images is of great practical interest particularly for defense sector as it aids in predicting enemys move and hence, build early precautionary measures. Although due to advancement in the domain of self-driving cars, a vast literature of published algorithms exists that use the terrestrial data to solve the problem of vehicle detection in natural scenes. Directly translating these algorithms towards detection of both military and non-military vehicles in aerial images is not straight forward owing to high variability in scale, illumination and orientation together with articulations both in shape and structure. Moreover, unlike availability of terrestrial benchmark datasets such as Baidu Research Open-Access Dataset etc., there does not exist well-annotated datasets encompassing both military and non-military vehicles in aerial images which as a consequence limit the applicability of the state-of-the-art deep learning based object detection algorithms that have shown great success in the recent years. To this end, we have prepared a dataset of low-altitude aerial images that comprises of both real data (taken from military shows videos) and toy data (downloaded from YouTube videos). The dataset has been categorized into three main types, i.e., military vehicle, non-military vehicle and other non-vehicular objects. In total, there are 15,086 (11,733 toy and 3,353 real) vehicle images exhibiting a variety of different shapes, scales and orientations. To analyze the adequacy of the prepared dataset, we employed the state-of-the-art object detection algorithms to distinguish military and non-military vehicles. The experimental results show that the training of deep architectures using the customized/prepared dataset allows to recognize seven types of military and four types of non-military vehicles.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128540476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}