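Out-of-distribution detection with non-parametric density estimation for models predicting processing history of uranium ore concentrates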

Impact factor: 3.3 | Zone 3 (Materials Science) | JCR Q2 (Materials Science, Multidisciplinary)
Cuong Ly, Cody Nizinski, Luther W. McDonald IV, Aaron Chalifoux, Alex Hagen
Journal: Computational Materials Science, volume 259, Article 114148
Published: 2025-08-06
DOI: 10.1016/j.commatsci.2025.114148
Source: https://www.sciencedirect.com/science/article/pii/S0927025625004914
Citations: 0

Abstract

Rapid advances in machine learning (ML) and computer vision (CV) have coincided with growing interest in deploying ML/CV models in fields ranging from medicine to social science. Materials science has followed the same trend: in recent years a large number of studies have employed ML/CV models, neural networks in particular. These models have been shown to perform accurately on a variety of tasks. However, they struggle to reach similar performance on test samples drawn from a distribution different from that of the training set. More importantly, they fail without giving users any warning. In this work, we therefore propose a framework for detecting out-of-distribution (OOD) samples so that users are alerted when human intervention might be necessary. Specifically, we explore the use of a non-parametric density estimation method to detect OOD samples. We assess the OOD detection capability of the proposed framework on ML models developed for categorizing the precipitation routes of U₃O₈, using OOD datasets whose samples (1) underwent a different image acquisition process, (2) underwent a different material synthesis process, or (3) are different materials from those in the in-distribution (ID) set. Across these experiments, the framework achieves an average area under the receiver operating characteristic curve (AUROC) of at least 91% in detecting OOD samples. With minimal overhead and strong detection performance, the proposed framework supports reliable and safe deployment in real-world scenarios.
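
The general idea of scoring samples by estimated density can be illustrated with a minimal sketch. The abstract does not say which non-parametric estimator the authors use or which feature space it operates on, so everything below is an assumption: a Gaussian kernel density estimate stands in for the estimator, synthetic 2-D points stand in for model features, and the names `kde_log_density`, `auroc`, `sample_cluster`, and the bandwidth value are hypothetical. It is not the authors' implementation.

```python
import math
import random


def kde_log_density(x, train, bandwidth):
    """Log of a Gaussian kernel density estimate at point x.

    Assumed stand-in for a non-parametric density estimator.
    """
    d = len(x)
    log_norm = -0.5 * d * math.log(2 * math.pi * bandwidth ** 2)
    # Log of each (unnormalized) kernel term, one per training point.
    terms = [
        -sum((a - b) ** 2 for a, b in zip(x, t)) / (2 * bandwidth ** 2)
        for t in train
    ]
    m = max(terms)  # log-sum-exp trick for numerical stability
    log_sum = m + math.log(sum(math.exp(v - m) for v in terms))
    return log_norm + log_sum - math.log(len(train))


def auroc(id_scores, ood_scores):
    """Probability that a random ID sample scores higher than a random OOD one.

    Ties count as one half; this equals the area under the ROC curve.
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def sample_cluster(rng, center, n, spread=1.0):
    """Draw n synthetic feature vectors around a center (illustrative data only)."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


rng = random.Random(0)
train_feats = sample_cluster(rng, [0.0, 0.0], 200)  # stand-in for ID training features
id_test = sample_cluster(rng, [0.0, 0.0], 100)      # held-out ID samples
ood_test = sample_cluster(rng, [6.0, 6.0], 100)     # shifted cluster standing in for OOD

bandwidth = 0.5  # hypothetical value; in practice it would be tuned on ID data
id_scores = [kde_log_density(x, train_feats, bandwidth) for x in id_test]
ood_scores = [kde_log_density(x, train_feats, bandwidth) for x in ood_test]

result = auroc(id_scores, ood_scores)
print(f"AUROC on synthetic data: {result:.3f}")
```

A sample whose log-density falls below a threshold chosen on ID data would be flagged for human review. The AUROC summarizes how well the score separates ID from OOD samples across all possible thresholds, which is why it is a natural metric for this kind of detector.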
Source journal
Computational Materials Science (Engineering & Technology: Materials Science, Multidisciplinary)
CiteScore: 6.50
Self-citation rate: 6.10%
Articles published: 665
Review time: 26 days
About the journal: The goal of Computational Materials Science is to report on results that provide new or unique insights into, or significantly expand our understanding of, the properties of materials or phenomena associated with their design, synthesis, processing, characterization, and utilization. To be relevant to the journal, the results should be applied or applicable to specific material systems that are discussed within the submission.