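Superpixel segmentations for thin sections: Evaluation of methods to enable the generation of machine learning training data sets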

Comput. Geosci. Pub Date: 2021-10-16 DOI: 10.31223/x55s65
Jiaxin Yu, F. Wellmann, S. Virgo, Marven von Domarus, M. Jiang, J. Schmatz, B. Leibe
Citations: 5

Abstract

Training data is the backbone of developing Machine Learning (ML) models, including deep learning algorithms. The paucity of well-labeled training image data has significantly impeded the application of ML-based approaches, especially the development of novel Deep Learning (DL) methods such as Convolutional Neural Networks (CNNs) for identifying minerals in thin section images. Image annotation, however, and pixel-wise annotation in particular, is a costly process. Manually creating dense semantic labels for rock thin section images has long been considered an exceptional challenge because of the variety and complexity of minerals in thin sections. To speed up annotation, we propose a human-computer collaborative pipeline in which superpixel segmentation is used as a boundary extractor, so that instance boundaries do not have to be delineated by hand. The pipeline consists of two steps: superpixel segmentation using MultiSLIC, and superpixel labeling with a purpose-built tool. We use Virtual Petroscopy (ViP), a cutting-edge methodology, for automatic image acquisition. A Bentheimer sandstone sample is used to test the performance of the pipeline. Three standard error metrics are used to evaluate the performance of MultiSLIC. The results indicate that MultiSLIC is able to extract compact superpixels with satisfactory boundary adherence given multiple input images. According to our test results, large and complex thin section images can be annotated with pixel-accurate labels more efficiently with the labeling tool than with conventional, purely manual annotation, and the resulting data is of high quality.
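The two-step idea in the abstract (segment into superpixels first, then label whole superpixels) can be illustrated with a minimal sketch. It does not implement MultiSLIC or the authors' labeling tool: the superpixel map is assumed to be given, and the function names `label_superpixels` and `achievable_segmentation_accuracy` are hypothetical. The abstract does not name its three error metrics; achievable segmentation accuracy is used here only as one commonly used superpixel metric.

```python
from collections import Counter, defaultdict


def label_superpixels(superpixels, clicks, unlabeled=-1):
    """Expand sparse annotator clicks into a dense label map (hypothetical helper).

    superpixels: 2D list of superpixel ids, one per pixel.
    clicks: dict mapping (row, col) -> class id chosen by the annotator.
    """
    sp_class = {}
    for (r, c), cls in clicks.items():
        # Every pixel of the clicked superpixel inherits the class;
        # a later click on the same superpixel overrides an earlier one.
        sp_class[superpixels[r][c]] = cls
    return [[sp_class.get(sp, unlabeled) for sp in row] for row in superpixels]


def achievable_segmentation_accuracy(superpixels, ground_truth):
    """Best pixel accuracy reachable when each superpixel gets a single class."""
    overlap = defaultdict(Counter)  # superpixel id -> counts per ground-truth class
    n_pixels = 0
    for sp_row, gt_row in zip(superpixels, ground_truth):
        for sp, gt in zip(sp_row, gt_row):
            overlap[sp][gt] += 1
            n_pixels += 1
    best = sum(max(counts.values()) for counts in overlap.values())
    return best / n_pixels


# Toy 4x4 image split into four 2x2 superpixels (made-up data).
superpixels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
# Made-up reference labels; superpixel 2 straddles two classes.
ground_truth = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [2, 2, 1, 1],
]
# One click per superpixel is enough to label the whole image.
clicks = {(0, 0): 0, (0, 3): 1, (3, 0): 2, (2, 2): 1}

dense = label_superpixels(superpixels, clicks)
asa = achievable_segmentation_accuracy(superpixels, ground_truth)
print(dense)
print(asa)
```

In this toy case four clicks label all sixteen pixels, and the one pixel that cannot be labeled correctly lies in the superpixel that crosses a class boundary, which is why boundary adherence of the superpixels limits the quality of the final labels.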