Predicting performance of object detection models in electron microscopy using random forests

IF 6.2 Q1 CHEMISTRY, MULTIDISCIPLINARY

Digital discovery, Pub Date: 2025-03-04, DOI: 10.1039/D4DD00351A

Ni Li, Ryan Jacobs, Matthew Lynch, Vidit Agrawal, Kevin Field and Dane Morgan

Digital discovery, 2025, volume 4, pages 987-997. Article page: https://pubs.rsc.org/en/content/articlelanding/2025/dd/d4dd00351a

Citations: 0

Abstract


Quantifying prediction uncertainty when applying object detection models to new, unlabeled datasets is critical in applied machine learning. This study introduces an approach to estimate the performance of deep learning-based object detection models for quantifying defects in transmission electron microscopy (TEM) images, focusing on the detection of irradiation-induced cavities in TEM images of metal alloys. We developed a random forest regression model that predicts the object detection F₁ score, a statistical metric used to evaluate the ability to accurately locate and classify objects of interest. The random forest model uses features extracted from the predictions of the object detection model whose uncertainty is being quantified, enabling fast prediction on new, unlabeled images. On test data, the trained model predicts F₁ with a mean absolute error (MAE) of 0.09 and an R² score of 0.77, indicating a significant correlation between the F₁ scores predicted by the random forest regression model and the true defect detection F₁ scores. The approach is shown to be robust across three distinct TEM image datasets with varying imaging and material domains. Our approach enables users to estimate the reliability of the predictions of a defect detection and segmentation model and to assess the applicability of the model to their specific datasets. It provides valuable information about possible domain shifts and about whether the model needs to be fine-tuned or trained on additional data to be maximally effective for the desired use case.
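As a rough illustration of the quantities named in the abstract, the sketch below computes a per-image F₁ score from detection counts and then scores a set of predicted F₁ values against the true ones using MAE and R². It does not reproduce the paper's random forest or its features. The per-image counts and the `predicted_f1` values are made-up placeholders, and the function names are hypothetical helpers rather than anything taken from the authors' code.

```python
from statistics import mean


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN).

    Returns 0.0 when there are no detections and no labels
    (a convention assumed here, not taken from the paper).
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_absolute_error(y_true, y_pred) -> float:
    """Average absolute difference between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-image counts: (true positives, false positives, false negatives).
counts = [(8, 2, 2), (5, 5, 5), (9, 1, 0), (2, 3, 6)]
true_f1 = [f1_score(*c) for c in counts]

# Hypothetical output of a regressor that estimates F1 without labels.
predicted_f1 = [0.75, 0.55, 0.90, 0.40]

mae = mean_absolute_error(true_f1, predicted_f1)
r2 = r2_score(true_f1, predicted_f1)
print(f"MAE = {mae:.3f}, R2 = {r2:.3f}")
```

In the setting the abstract describes, the true F₁ values would come from labeled test images and the predicted values from the trained random forest. The same two metrics then summarize how closely the estimate tracks actual detection performance.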


Source journal

Digital discovery

CiteScore

2.80

Self-citation rate

0.00%

Articles published