Cuong Ly, Cody Nizinski, Luther W. McDonald IV, Aaron Chalifoux, Alex Hagen
Computational Materials Science, vol. 259, Article 114148. Published 2025-08-06. DOI: 10.1016/j.commatsci.2025.114148. https://www.sciencedirect.com/science/article/pii/S0927025625004914
Out-of-distribution detection with non-parametric density estimation for models predicting processing history of uranium ore concentrates
Rapid advances in machine learning (ML) and computer vision (CV) have coincided with growing interest in deploying ML/CV models in fields ranging from medicine to social science. Materials science is no exception: in recent years, many studies have employed ML/CV models, neural networks in particular. These models have proven accurate on a variety of tasks. However, they struggle to maintain that performance on test samples drawn from a distribution different from that of the training set. More importantly, they fail without warning the user. In this work, we therefore propose a framework for detecting out-of-distribution (OOD) samples so that users are alerted when human intervention may be necessary. Specifically, we explore the use of a non-parametric density estimation method to detect OOD samples. We assess the OOD detection capability of the proposed framework on ML models developed to categorize precipitation routes of U₃O₈, using OOD datasets whose samples (1) underwent a different image acquisition process, (2) underwent a different material synthesis process, or (3) are different materials from those in the in-distribution (ID) set. Across these experiments, we achieve an average area under the receiver operating characteristic curve (AUROC) of at least 91% in detecting OOD samples. With minimal overhead and strong detection performance, the proposed framework supports reliable and safe deployment in real-world scenarios.
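The abstract does not say which non-parametric density estimator is used or which features it operates on, so the following is only an illustrative sketch and not the authors' method. It assumes a k-nearest-neighbor distance as a stand-in density proxy (a larger distance to the training features implies a lower estimated density), synthetic Gaussian vectors in place of real model features, and hypothetical helper names `knn_ood_score` and `auroc`.

```python
import numpy as np

def knn_ood_score(train_feats, test_feats, k=5):
    """Distance from each test sample to its k-th nearest training feature.

    A larger distance means the sample lies in a sparser region of the
    training data, i.e. it is more likely to be out-of-distribution.
    """
    # Pairwise Euclidean distances, shape (n_test, n_train)
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # k-th smallest distance in each row
    return np.partition(dists, k - 1, axis=1)[:, k - 1]

def auroc(scores_id, scores_ood):
    """Probability that a random OOD sample scores higher than a random
    ID sample, with ties counted as one half."""
    greater = (scores_ood[:, None] > scores_id[None, :]).mean()
    ties = (scores_ood[:, None] == scores_id[None, :]).mean()
    return greater + 0.5 * ties

# Synthetic stand-ins for features (assumption: real features would come
# from the trained classifier, which the abstract does not detail).
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 8))     # ID training features
test_id = rng.normal(0.0, 1.0, size=(200, 8))   # held-out ID samples
test_ood = rng.normal(4.0, 1.0, size=(200, 8))  # shifted, OOD samples

s_id = knn_ood_score(train, test_id)
s_ood = knn_ood_score(train, test_ood)
print(f"AUROC: {auroc(s_id, s_ood):.3f}")
```

In practice, a threshold on the score (for example, a high percentile of the scores on held-out ID data) would decide when to flag a sample for human review; the choice of `k` and of the threshold are tuning decisions that this sketch leaves open.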
About the journal:
The goal of Computational Materials Science is to report on results that provide new or unique insights into, or significantly expand our understanding of, the properties of materials or phenomena associated with their design, synthesis, processing, characterization, and utilization. To be relevant to the journal, the results should be applied or applicable to specific material systems that are discussed within the submission.