FabricSpotDefect: An annotated dataset for identifying spot defects in different fabric types.

Data in Brief (IF 1.0, JCR Q3, Multidisciplinary Sciences)
Pub Date: 2024-11-24; eCollection Date: 2024-12-01; DOI: 10.1016/j.dib.2024.111165
Farzana Islam, Sumaya, Md Fahad Monir, Ashraful Islam
Data in Brief, volume 57, article 111165. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11648198/pdf/
Citations: 0

Abstract

To the best of our knowledge, FabricSpotDefect is the first dataset designed specifically to challenge computer vision models on fabric spot detection. It contains 1014 raw images with 3288 manually annotated spots of different categories. Applying six augmentation techniques (flipping, rotation, shearing, saturation adjustment, brightness adjustment, and noise addition) expands the dataset to 2300 augmented images. Annotation was done manually on the original images rather than on the augmented ones, so that the labels reflect real-world conditions. The augmented images are provided in two versions: a YOLOv8 version with 7641 annotations and a COCO version with 7635 annotations. Augmentation increases data diversity, which helps reduce overfitting and improve model robustness. The dataset covers a variety of fabrics, including cotton, linen, silk, denim, patterned textiles, and jacquard fabrics, and spot types such as stains, discolorations, oil marks, rust, and blood marks. Such spots are difficult to detect manually or with other traditional methods. The images were taken under home lighting, of ordinary everyday clothes, in normal conditions, which grounds the dataset in real-world use. The dataset is organized for easy use in training, testing, and validating machine learning (ML) models, and it can be reused because the data is real and authentic. Researchers and developers are free to use it with artificial intelligence (AI) tools that enhance quality control in the textile industry, for example checking the quality of fabrics used in clothing or in medical textiles such as surgical gloves, masks, gauze, and aprons. The data is annotated with bounding boxes and polygons that precisely mark spot defects. The dataset is available on Roboflow in formats such as COCO and YOLOv8, which work with different ML frameworks. We claim that the dataset is unique because it covers a wide range of fabrics and challenging spot defects, including those on patterned and colorful prints, where detection is especially difficult because of the complexity of the printed fabric.
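
The abstract mentions bounding-box annotations in both COCO and YOLOv8 formats, plus flipping as one of the augmentations. The sketch below illustrates how the two box conventions relate and how a horizontal flip moves a box. It relies only on the standard conventions (COCO boxes are `[x_min, y_min, width, height]` in pixels; YOLO boxes are center-based and normalized by image size). The function names `coco_to_yolo` and `hflip_yolo` and the sample numbers are hypothetical illustrations, not part of the dataset or of the authors' pipeline.

```python
from typing import Sequence, Tuple

YoloBox = Tuple[float, float, float, float]  # (x_center, y_center, width, height), all in [0, 1]


def coco_to_yolo(bbox: Sequence[float], img_w: int, img_h: int) -> YoloBox:
    """Convert a COCO box [x_min, y_min, width, height] in pixels
    to a YOLO box normalized by the image width and height."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return (x_center, y_center, w / img_w, h / img_h)


def hflip_yolo(box: YoloBox) -> YoloBox:
    """Mirror a normalized YOLO box left-to-right.
    Only the x center changes; the box size is unaffected."""
    x_center, y_center, w, h = box
    return (1.0 - x_center, y_center, w, h)


if __name__ == "__main__":
    # Hypothetical spot annotation on a 640x480 image.
    yolo_box = coco_to_yolo([100, 50, 200, 100], img_w=640, img_h=480)
    print("YOLO box:   ", yolo_box)
    print("after hflip:", hflip_yolo(yolo_box))
```

Any geometric augmentation (flip, rotation, shear) has to transform the annotations along with the pixels, whereas photometric ones (saturation, brightness, noise) leave the boxes unchanged.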

Source journal
Data in Brief (Multidisciplinary Sciences)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 996
Review time: 70 days
About the journal: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that: -Thoroughly describe your data, facilitating reproducibility. -Make your data, which is often buried in supplementary material, easier to find. -Increase traffic towards associated research articles and data, leading to more citations. -Open up doors for new collaborations. Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.