Generation of micrograph-annotation pairs for steel microstructure recognition using the hybrid deep generative model in the case of an extremely small and imbalanced dataset
Impact factor: 4.8 · Zone 2 (Materials Science) · JCR Q1 (Materials Science, Characterization & Testing)
Journal: Materials Characterization
Publication date: 2024-09-23
Publication type: Journal Article
DOI: 10.1016/j.matchar.2024.114407
URL: https://www.sciencedirect.com/science/article/pii/S1044580324007885
Citations: 0
Abstract
A shortage of annotated samples, coupled with class imbalance, largely restricts the wide application of deep learning (DL)-based approaches to microstructure recognition and quantification. In this work, we present a micrograph augmentation approach that uses a hybrid deep generative model to generate SEM image-annotation pairs and so build a large-scale, well-balanced augmentation dataset. In this method, a generator first produces the desired annotations, and a translator is then trained to translate these synthetic annotations into high-quality SEM images. The method is applied to an extremely small and imbalanced additively manufactured (AM) steel dataset containing only one SEM image-annotation pair with a very low martensite/austenite (MA) fraction; it significantly augments the initial dataset and achieves a more balanced distribution of phase fraction. Its effectiveness is demonstrated by the improved extensibility of the microstructure recognition model to unseen micrographs when synthetic data are used. The impact of the synthetic data proportion on model performance, and the underlying reasons why synthetic data improve the extensibility of trained models, are also discussed in detail.
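The abstract mentions balancing the distribution of phase fraction across generated annotations but does not say how this is done. The sketch below is only one plausible illustration of that idea, not the authors' procedure: it assumes binary annotation masks in which 1 marks the MA phase, uses a hypothetical `random_mask` function as a stand-in for the paper's annotation generator, and caps how many masks are kept per phase-fraction bin. All function names and parameters here are assumptions.

```python
import random
from collections import defaultdict


def phase_fraction(mask):
    """Fraction of pixels labeled 1 (assumed here to mark the MA phase)."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


def balance_by_fraction(masks, n_bins=5, per_bin=2):
    """Keep at most `per_bin` masks in each equal-width phase-fraction bin."""
    bins = defaultdict(list)
    for mask in masks:
        # Clamp so that a fraction of exactly 1.0 falls in the last bin.
        idx = min(int(phase_fraction(mask) * n_bins), n_bins - 1)
        if len(bins[idx]) < per_bin:
            bins[idx].append(mask)
    # Return the kept masks ordered from the lowest to the highest bin.
    return [m for idx in sorted(bins) for m in bins[idx]]


def random_mask(size, p, rng):
    """Hypothetical stand-in for an annotation generator: i.i.d. pixels."""
    return [[1 if rng.random() < p else 0 for _ in range(size)]
            for _ in range(size)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Skewed pool: most masks have a low MA fraction, a few have a high one.
    pool = [random_mask(16, rng.choice([0.05, 0.05, 0.05, 0.3, 0.6, 0.9]), rng)
            for _ in range(200)]
    balanced = balance_by_fraction(pool)
    print(len(pool), "->", len(balanced), "masks after balancing")
```

In a pipeline of the kind the abstract describes, the retained annotations would then be passed to the translator to obtain the paired SEM images; that step is not sketched here because the abstract gives no detail on the model.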
About the journal:
Materials Characterization features original articles and state-of-the-art reviews on theoretical and practical aspects of the structure and behaviour of materials.
The Journal focuses on all characterization techniques, including all forms of microscopy (light, electron, acoustic, etc.) and analysis (especially microanalysis and surface analytical techniques). Developments in both this wide range of techniques and their application to the quantification of the microstructure of materials are essential facets of the Journal.
The Journal provides the Materials Scientist/Engineer with up-to-date information on many types of materials with an underlying theme of explaining the behavior of materials using novel approaches. Materials covered by the journal include:
Metals & Alloys
Ceramics
Nanomaterials
Biomedical materials
Optical materials
Composites
Natural Materials.