{"title":"Strategies for Merging Hyperspectral Data of Different Spectral and Spatial Resoultion","authors":"R. Illmann, M. Rosenberger, G. Notni","doi":"10.1109/DICTA.2018.8615875","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615875","url":null,"abstract":"Increasing applications for hyperspectral measurement make increasing demands on the handling of big measurement data. Push broom imaging is a promising measurement technique for many applications. The combined registration of hyperspectral and spatial data reveal a lot of information about the measurement object. An exemplary well-known further processing technique is to extract feature vectors from such a dataset. For increasing quality and quantity of possible information, it is advantageously to have a spectral wide range dataset. Nevertheless, different spectral data mainly needs different imaging systems. A major problem in using hyperspectral data from different hyperspectral imaging systems is the combination of those to a wide range data set, called spectral cube. The aim of this work is to show which methods are principal conceivable and usable under different circumstances for merging such datasets with a profound analytical view. In addition, some work that was done in the theory and the design of a calibration model prototype is included.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123928487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Single Hierarchical Network for Face, Action Unit and Emotion Detection","authors":"Shreyank Jyoti, Garima Sharma, Abhinav Dhall","doi":"10.1109/DICTA.2018.8615852","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615852","url":null,"abstract":"The deep neural network shows a consequential performance for a set of specific tasks. A system designed for some correlated task altogether can be feasible for ‘in the wild’ applications. This paper proposes a method for the face localization, Action Unit (AU) and emotion detection. The three different tasks are performed by a simultaneous hierarchical network which exploits the way of learning of neural networks. Such network can represent more relevant features than the individual network. Due to more complex structures and very deep networks, the deployment of neural networks for real life applications is a challenging task. The paper focuses to find an efficient trade-off between the performance and the complexity of the given tasks. This is done by exploring the advantages of optimization of the network for the given tasks by using separable convolutions, binarization and quantization. Four different databases (AffectNet, EmotioNet, RAF-DB and WiderFace) are used to evaluate the performance of our proposed approach by having a separate task specific database.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116671814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Splicing and Copy-Move Attacks in Color Images","authors":"Mohammad Manzurul Islam, G. Karmakar, J. Kamruzzaman, Manzur Murshed, G. Kahandawa, N. Parvin","doi":"10.1109/DICTA.2018.8615874","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615874","url":null,"abstract":"Image sensors are generating limitless digital images every day. Image forgery like splicing and copy-move are very common type of attacks that are easy to execute using sophisticated photo editing tools. As a result, digital forensics has attracted much attention to identify such tampering on digital images. In this paper, a passive (blind) image tampering identification method based on Discrete Cosine Transformation (DCT) and Local Binary Pattern (LBP) has been proposed. First, the chroma components of an image is divided into fixed sized non-overlapping blocks and 2D block DCT is applied to identify the changes due to forgery in local frequency distribution of the image. Then a texture descriptor, LBP is applied on the magnitude component of the 2D-DCT array to enhance the artifacts introduced by the tampering operation. The resulting LBP image is again divided into non-overlapping blocks. Finally, summations of corresponding inter-cell values of all the LBP blocks are computed and arranged as a feature vector. These features are fed into a Support Vector Machine (SVM) with Radial Basis Function (RBF) as kernel to distinguish forged images from authentic ones. The proposed method has been experimented extensively on three publicly available well-known image splicing and copy-move detection benchmark datasets of color images. 
Results demonstrate the superiority of the proposed method over recently proposed state-of-the-art approaches in terms of well accepted performance metrics such as accuracy, area under ROC curve and others.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116824616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Long-Term Recurrent Predictive Model for Intent Prediction of Pedestrians via Inverse Reinforcement Learning","authors":"Khaled Saleh, M. Hossny, S. Nahavandi","doi":"10.1109/DICTA.2018.8615854","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615854","url":null,"abstract":"Recently, the problem of intent and trajectory prediction of pedestrians in urban traffic environments has got some attention from the intelligent transportation research community. One of the main challenges that make this problem even harder is the uncertainty exists in the actions of pedestrians in urban traffic environments, as well as the difficulty in inferring their end goals. In this work, we are proposing a data-driven framework based on Inverse Reinforcement Learning (IRL) and the bidirectional recurrent neural network architecture (B-LSTM) for long-term prediction of pedestrians' trajectories. We evaluated our framework on real-life datasets for agent behavior modeling in traffic environments and it has achieved an overall average displacement error of only 2.93 and 4.12 pixels over 2.0 secs and 3.0 secs ahead prediction horizons respectively. Additionally, we compared our framework against other baseline models based on sequence prediction models only. We have outperformed these models with the lowest margin of average displacement error of more than 5 pixels.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"145 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115540297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining Deep and Handcrafted Image Features for Vehicle Classification in Drone Imagery","authors":"Xuesong Le, Yufei Wang, Jun Jo","doi":"10.1109/DICTA.2018.8615853","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615853","url":null,"abstract":"Using unmanned aerial vehicles (UAVs) as devices for traffic data collection exhibits many advantages in collecting traffic information. This paper presents an efficient method based on the deep learning and handcrafted features to classify vehicles taken from drone imagery. Experimental results show that compared to classification algorithms based on pre-trained CNN or hand-crafted features, the proposed algorithm exhibits higher accuracy in vehicle recognition at different UAV altitudes with different view scopes, which can be used in future traffic monitoring and control in metropolitan areas.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"21 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122597464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifier-Free Extraction of Power Line Wires from Point Cloud Data","authors":"M. Awrangjeb, Yongsheng Gao, Guojun Lu","doi":"10.1109/DICTA.2018.8615869","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615869","url":null,"abstract":"This paper proposes a classifier-free method for extraction of power line wires from aerial point cloud data. It combines the advantages of both grid- and point-based processing of the input data. In addition to the non-ground point cloud data, the input to the proposed method includes the pylon locations, which are automatically extracted by a previous method. The proposed method first counts the number of wires in a span between the two successive pylons using two masks: vertical and horizontal. Then, the initial wire segments are obtained and refined iteratively. Finally, the initial segments are extended on both ends and each individual wire points are modelled as a 3D polynomial curve. Experimental results show both the object-based completeness and correctness are 97%, while the point-based completeness and correctness are 99% and 88%, respectively.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128168852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spectral Super-resolution for RGB Images using Class-based BP Neural Networks","authors":"Xiaolin Han, Jing Yu, Jing-Hao Xue, Weidong Sun","doi":"10.1109/DICTA.2018.8615862","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615862","url":null,"abstract":"Hyperspectral images are of high spectral resolution and have been widely used in many applications, but the imaging process to achieve high spectral resolution is at the expense of spatial resolution. This paper aims to construct a high-spatial-resolution hyperspectral (HHS) image from a high-spatial-resolution RGB image, by proposing a novel class-based spectral super-resolution method. With the help of a set of RGB and HHS image-pairs, our proposed method learns nonlinear spectral mappings between RGB and HHS image-pairs using class-based back propagation neural networks (BPNNs). In the training stage, unsupervised clustering is used to divide an RGB image into several classes according to spectral correlation, and the spectrum-pairs from the classified RGB images and the corresponding HHS images are used to train the BPNNs, to establish the nonlinear spectral mapping for each class. In the spectral super-resolution stage, a supervised classification is used to classify the given RGB image into the classes determined during the training stage, and the final HHS image is reconstructed from the classified given RGB image using the trained BPNNs. 
Comparisons on three standard datasets, ICVL, CAVE and NUS, demonstrate that, our proposed method achieves a better spectral super-resolution quality than related state-of-the-art methods.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128435845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Systematic Analysis of Direct Sparse Odometry","authors":"F. Particke, A. Kalisz, Christian Hofmann, M. Hiller, Henrik Bey, J. Thielecke","doi":"10.1109/DICTA.2018.8615807","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615807","url":null,"abstract":"In the field of robotics and autonomous driving, the camera as a sensor gets more and more important, as the camera is cheap and robust against environmental influences. One challenging task is the localization of the robot on an unknown map. This leads to the so-called Simultaneous Localization and Mapping (SLAM) problem. For the Visual SLAM problem, a plethora of algorithms was proposed in the last years, but the algorithms were rarely evaluated regarding the robustness of the approaches. This contribution motivates the systematic analysis of Visual SLAMs in simulation by using heterogeneous environments in Blender. For this purpose, three different environments are used for evaluation ranging from very low detailed to high detailed worlds. In this contribution, the Direct Sparse Odometry (DSO) is evaluated as an exemplary Visual SLAM. It is shown that the DSO is very sensitive to rotations of the camera. In addition, it is presented that if the scene does not provide sufficient clues about the depth, an estimation of the trajectory is not possible. 
The results are complemented by real-world experiments.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134390411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combination of Supervised Learning and Unsupervised Learning Based on Object Association for Land Cover Classification","authors":"Na Li, Arnaud Martin, R. Estival","doi":"10.1109/DICTA.2018.8615871","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615871","url":null,"abstract":"Conventional supervised classification approaches have significant limitations in the land cover classification from remote sensing data because a large amount of high quality labeled samples are difficult to guarantee. To overcome this limitation, combination with unsupervised approach is considered as one promising candidate. In this paper, we propose a novel framework to achieve the combination through object association based on Dempster-Shafer theory. Inspired by object association, the framework can label the unsupervised clusters according to the supervised classes even though they have different numbers. The proposed framework has been tested on the different combinations of commonly used supervised and unsupervised methods. Compared with the supervise methods, our proposed framework can furthest enhance the overall accuracy approximately by 8.2%. The experiment results proved that our proposed framework has achieved twofold performance gain: better performance on the insufficient training data case and the possibility to apply on a large area.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131431280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bi-Modal Content Based Image Retrieval using Multi-class Cycle-GAN","authors":"Girraj Pahariya","doi":"10.1109/DICTA.2018.8615838","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615838","url":null,"abstract":"Content Based Image Retrieval (CBIR) systems retrieve relevant images from a database based on the content of the query. Most CBIR systems take a query image as input and retrieve similar images from a gallery, based on the global features (such as texture, shape, and color) extracted from an image. There are several ways of querying from an image database for retrieval purpose. Some of which are text, image, and sketch. However, the traditional methodologies support only one of the domains at a time. There is a need of bridging the gap between different domains (sketch and image) for enabling a Multi-Modal CBIR system. In this work, we propose a novel bimodal query based retrieval framework, which can take inputs from both sketch and image domains. The proposed framework aims at reducing the domain gap by learning a mapping function using Generative Adversarial Networks (GANs) and supervised deep domain adaptation techniques. Extensive experimentation and comparison with several baselines on two popular sketch datasets (Sketchy and TU-Berlin) show the effectiveness of our proposed framework.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"547 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120939092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}