A large-scale open image dataset for deep learning-enabled intelligent sorting and analyzing of raw coal.

IF 6.9 · Zone 2 (multidisciplinary journals) · Q1 MULTIDISCIPLINARY SCIENCES

Scientific Data Pub Date : 2025-03-08 DOI:10.1038/s41597-025-04719-0

Ziqi Lv, Yuhan Fan, Te Sha, Yao Cui, Yuxin Wu, Haimei Lv, Meijie Sun, Yanan Tu, Zhiqiang Xu, Weidong Wang

Scientific Data, volume 12, issue 1, page 403. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11890867/pdf/

Citations: 0

Abstract

Under the strategic objectives of carbon peaking and carbon neutrality, energy transition driven by new quality productive forces has emerged as a central theme in China's energy development. Among these, the intelligent sorting and analysis of raw coal using deep learning constitute a pivotal technical process. However, the progress of intelligent coal preparation in China has been constrained by the absence of accurate and large-scale data. To address this gap, this study introduces DsCGF, a large-scale, open-source raw coal image dataset. Over the past five years, extensive raw coal image samples were systematically collected and meticulously annotated from three representative mining regions in China, resulting in a dataset comprising over 270,000 visible-light images. These images are annotated at multiple levels, targeting three primary categories: coal, gangue, and foreign objects, and are designed for three core computer vision tasks: image classification, object detection, and instance segmentation. Comprehensive evaluation results indicate that the DsCGF can effectively support further research into the intelligent sorting of raw coal.


Source journal

Scientific Data (Social Sciences-Education)

CiteScore: 11.20

Self-citation rate: 4.10%

Articles published: 689

Review time: 16 weeks

About the journal: Scientific Data is an open-access journal focused on data. It publishes descriptions of research datasets and articles on data sharing across the natural sciences, medicine, engineering, and the social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data. The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than to test hypotheses or present new interpretations, methods, or in-depth analyses.