{"title":"Body Part Labelling with Minkowski Networks","authors":"Joseph Cahill-Lane, S. Mills, Stuart Duncan","doi":"10.1109/IVCNZ48456.2019.8961026","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8961026","url":null,"abstract":"Labelling body parts in depth images is useful for a wide variety of tasks. Many approaches use skeleton-based labelling, which is not robust when there is a partial view of the figure. In this work we show that Minkowski networks, which have recently been developed for 3D point cloud labelling of scenes, can be used to label point clouds with body part categories, achieving 85.6% accuracy with a full view of the figure, and 82.1% with partial views. These results are limited by a small sample size of our training data, but there is evidence that some of these ‘misclassifications’ may be correcting mistakes in the reference labelling. Overall, we demonstrate that Minkowski networks are effective for body part labelling in point clouds, and are robust to occlusion.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127208991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of Three-Dimensional Scanning Devices","authors":"Niklas Deckers, R. Reulke","doi":"10.1109/IVCNZ48456.2019.8961017","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8961017","url":null,"abstract":"There are now a variety of methods and devices for deriving 3D information, e.g. binocular camera (stereo) systems and methods of the class \"shape from X\". In addition, there are modern recording technologies that simultaneously measure gray scale values and distance (range cameras, TOF cameras). Many of these methods are available through low-cost sensor systems that are simple and fast to use. We propose a simple experimental setup and a processing chain in order to evaluate the quality of the data recorded by these systems. In an experiment with students we demonstrate the usability of this approach and introduce results for selected low-cost sensors.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114426479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Plant Leaf Recognition using Geometric Features and Pearson Correlations","authors":"Md. Ajij, D. S. Roy, Sanjoy Pratihar","doi":"10.1109/IVCNZ48456.2019.8961036","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8961036","url":null,"abstract":"Plant identification is an important task that is necessary for professionals like biologists, chemists, botanists, farmers, and nature hobbyists. The identification of plants from their leaves is a well-known strategy. In this paper, we present a novel set of features based on Pearson correlation coefficients, and we show the applicability of the proposed features for the classification of plant leaves. The foremost contribution in this paper is the use of the Pearson correlation coefficient computed from the leaf boundary pixels for analyzing shape similarity. The method has been tested on two well-known plant leaf datasets, Flavia and Swedish. The method shows the accuracy level of 95.16% on the Flavia dataset and of 97.0% on the Swedish dataset. The results corroborate the strength of our proposed feature set in comparison with other available methods.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128538671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A deep learning approach to handwritten text recognition in the presence of struck-out text","authors":"Hiqmat Nisa, J. Thom, V. Ciesielski, Ruwan Tennakoon","doi":"10.1109/IVCNZ48456.2019.8961024","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8961024","url":null,"abstract":"The accuracy of handwritten text recognition may be affected by the presence of struck-out text in the handwritten manuscript. This paper investigates and improves the performance of a widely used handwritten text recognition approach Convolutional Recurrent Neural Network (CRNN) on handwritten lines containing struck out words. For this purpose, some common types of struck-out strokes were superimposed on words in a text line. A model, trained on the IAM line database was tested on lines containing struck-out words. The Character Error Rate (CER) increased from 0.09 to 0.11. This model was re-trained on dataset containing struck-out text. The model performed well in terms of struck-out text detection. We found that after providing an adequate number of training examples, the model can deal with learning struck-out patterns in a way that does not affect the overall recognition accuracy.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127886551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Are These Birds Similar: Learning Branched Networks for Fine-grained Representations","authors":"Shah Nawaz, Alessandro Calefati, Moreno Caraffini, Nicola Landro, I. Gallo","doi":"10.1109/IVCNZ48456.2019.8960960","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8960960","url":null,"abstract":"Fine-grained image classification is a challenging task due to the presence of hierarchical coarse-to-fine-grained distribution in the dataset. Generally, parts are used to discriminate various objects in fine-grained datasets, however, not all parts are beneficial and indispensable. In recent years, natural language descriptions are used to obtain information on discriminative parts of the object. This paper leverages on natural language description and proposes a strategy for learning the joint representation of natural language description and images using a two-branch network with multiple layers to improve the fine-grained classification task. Extensive experiments show that our approach gains significant improvements in accuracy for the fine-grained image classification task. Furthermore, our method achieves new state-of-the-art results on the CUB-200-2011 dataset.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126363482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating Spatial Configuration Constrained CNNs for Localizing Facial and Body Pose Landmarks","authors":"Christian Payer, D. Štern, M. Urschler","doi":"10.1109/IVCNZ48456.2019.8961000","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8961000","url":null,"abstract":"Landmark localization is a widely used task required in medical image analysis and computer vision applications. Formulated in a heatmap regression framework, we have recently proposed a CNN architecture that learns on its own to split the localization task into two simpler sub-problems, dedicating one component to locally accurate but ambiguous predictions, while the other component improves robustness by incorporating the spatial configuration of landmarks to remove ambiguities. We learn this simplification in our SpatialConfiguration-Net (SCN) by multiplying the heatmap predictions of its two components and by training the network in and end-to-end manner, thus achieving regularization similar to e.g. a hand-crafted Markov Random Field model. While we have previously shown localization results solely on data from 2D and 3D medical imaging modalities, in this work our aim is to study the generalization capabilities of our SpatialConfiguration-Net to computer vision problems. Therefore, we evaluate our performance both in terms of accuracy and robustness on a facial alignment task, where we improve upon the state-of-the-art methods, as well as on a human body pose estimation task, where we demonstrate results in line with the recent state-of-the-art.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128203119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying Simple Shapes to Classify the Big Picture","authors":"Megan Liang, Gabrielle Palado, Will N. Browne","doi":"10.1109/IVCNZ48456.2019.8960989","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8960989","url":null,"abstract":"In recent years, Deep Artificial Neural Networks (DNNs) have demonstrated their ability in solving visual classification problems. However, an impediment is transparency where it is difficult to interpret why an object is classified in a particular way. Furthermore, it is also difficult to validate whether a learned model truly represents a problem space. Learning Classifier Systems (LCSs) are an Evolutionary Computation technique capable of producing human-readable rules that explain why an instance has been classified, i.e. the system is fully transparent. However, because they can encode complex relationships between features, they are not best suited to domains with a large number of input features, e.g. classification in pixel images. Thus, the aim of this work is to develop a novel DNN-LCS system where the former extracts features from pixels and the latter classifies objects from these features with clear decision boundaries. Results show that the system can explain its classification decisions on curated image data, e.g. plates have elliptical or rectangular shapes. This work represents a promising step towards explainable artificial intelligence in computer vision.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115187003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Image Mapping Approach for Quick Dissimilarity Detection of Binary Images","authors":"Adnan A. Y. Mustafa","doi":"10.1109/IVCNZ48456.2019.8961029","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8961029","url":null,"abstract":"In this paper we present an approach for the quick detection of dissimilar and similar binary images. The approach conforms to the Probabilistic Matching Model for Binary Images (PMMBI) that ascertained that detecting dissimilarity between binary images can be done quickly by randomly examining a few points between two images. The approach is based on exploiting the 15 binary image mapping variations that are possible when mapping pixels between binary images and inspecting their tuple size. We call this approach the Dissimilar Detection via Mapping (DDM). We present tests with real images.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116804790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fusion of thermal and visible colour images for robust detection of people in forests","authors":"J. Fourie, K. Pahalawatta, J. Hsiao, C. Bateman, Peter Carey","doi":"10.1109/IVCNZ48456.2019.8960964","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8960964","url":null,"abstract":"Safe operation of automated robotic platforms in environments where humans also work require on-board sensors that can accurately and robustly detect humans in the environment so that appropriate action can be taken. This is a challenging problem in unstructured outdoor environments as most sensors are negatively affected by changing environmental conditions like ambient light and moisture. Our aim is to use a combination of thermal and visible colour images to detect humans in forest environments. The system should be able to work through dense foliage and should not be confused by other objects that generate heat like machines or other animals. We developed and tested a system on a data-set of sensor data collected in a similar outdoor environment but with synthetic targets added to highlight the ability of the system to be robust to severe optical occlusion in dense vegetation and to the presence of hot machines that could fool the thermal sensor. Our initial results show promise and also highlight where improvements can be made with further testing in more realistic forest environments.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129426583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Segmentation of Breast Arterial Calcifications from Digital Mammography","authors":"Kaier Wang, N. Khan, R. Highnam","doi":"10.1109/IVCNZ48456.2019.8960956","DOIUrl":"https://doi.org/10.1109/IVCNZ48456.2019.8960956","url":null,"abstract":"Breast arterial calcifications (BACs) are formed when calcium is deposited in the walls of arteries in the breast. The accurate segmentation of BACs is a critical step for risk assessment of cardiovascular disease from a mammogram. This paper evaluates the performance of three deep learning architectures, YOLO, Unet and DeepLabv3+, on detecting BACs in digital mammography. In comparison, a simple Hessian-based multiscale filter is developed to enhance BACs pattern, then a self-adaptive thresholding algorithm is applied to obtain the binary mask of BACs. As BACs are relatively small in size, we developed a new metric to better evaluate the small object segmentation. In this study, 135 digital mammographic images containing labelled BACs were obtained, in which 80% for training deep learning networks and 20% for validation. The results show that our Hessian-based filtering method achieves a highest accuracy on validation data, and DeepLabv3+ falls behind with little effectiveness. We conclude simple filtering technique is effective in BACs extraction, and DeepLabv3+ is an expensive alternative in terms of its computational cost and configuration complexity.","PeriodicalId":217359,"journal":{"name":"2019 International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"145 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129581914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}