Effective Utilisation of Multiple Open-Source Datasets to Improve Generalisation Performance of Point Cloud Segmentation Models
Authors: Matthew Howe, Boris Repasky, Timothy Payne
Published in: 2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2022-11-29
DOI: 10.1109/DICTA56598.2022.10034566 (https://doi.org/10.1109/DICTA56598.2022.10034566)
Citation count: 0
Abstract
A single point cloud segmentation model is desirable when the source, quality, and content of the point clouds are unknown. In these situations the model must handle such variations with predictable and consistent results. Although deep learning can segment point clouds accurately, it often generalises poorly, adapting badly to data that differs from the data it was trained on. To address this issue, we propose using multiple available open-source, fully annotated datasets to train and test models that generalise better. The open-source datasets we use are DublinCity, DALES, ISPRS, Swiss3DCities, SensatUrban, SUM, and H3D [5], [11], [10], [1], [3], [2], [6]. In this paper we discuss how these datasets are combined into a simple training set and a challenging test set that evaluates multiple aspects of the generalisation task. We show that naively combining the datasets and training on them improves results, as expected. We also show that an improved sampling strategy, which decreases sampling variation, increases generalisation performance substantially on top of this. Experiments to isolate which variables are responsible for this performance boost found that none of them improves performance individually; rather, it is the consistency of the samples the model is evaluated on that yields the improvement.
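The abstract does not describe the sampling strategy in detail, so the following is only a minimal sketch of one way sampling variation could be reduced across datasets of differing density. It is not the authors' method. The function names `voxel_subsample` and `fixed_size_sample`, the voxel size, the radius, and the point count are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical illustration only: the source does not specify its sampling strategy.

def voxel_subsample(points, voxel_size):
    """Keep one point per occupied voxel so point density is roughly uniform."""
    keys = np.floor(points[:, :3] / voxel_size).astype(np.int64)
    # Index of the first point that falls in each distinct voxel.
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first_idx)]

def fixed_size_sample(points, centre, radius, n_points, rng):
    """Draw exactly n_points from a horizontal disc of the given radius."""
    dist = np.linalg.norm(points[:, :2] - centre[:2], axis=1)
    candidates = np.flatnonzero(dist <= radius)
    if candidates.size == 0:
        raise ValueError("no points within radius of centre")
    # Sample with replacement only when the disc holds too few points.
    replace = candidates.size < n_points
    chosen = rng.choice(candidates, size=n_points, replace=replace)
    return points[chosen]

rng = np.random.default_rng(0)
# Two synthetic clouds standing in for a dense and a sparse dataset.
dense = rng.uniform(0.0, 20.0, size=(50_000, 3))
sparse = rng.uniform(0.0, 20.0, size=(5_000, 3))
centre = np.array([10.0, 10.0, 0.0])

for cloud in (dense, sparse):
    uniform = voxel_subsample(cloud, voxel_size=0.5)
    sample = fixed_size_sample(uniform, centre, radius=5.0, n_points=1024, rng=rng)
    print(sample.shape)
```

Under these assumptions, both clouds yield samples with the same spatial extent and point count, so a model would see inputs of consistent scale and size regardless of the source dataset's density.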