David Fernández-Narro, Pablo Ferri, Juan Miguel García-Gómez, Carlos Sáez
{"title":"量化数据集移位下更安全健康AI性能预测中的认知不确定性。","authors":"David Fernández-Narro, Pablo Ferri, Juan Miguel García-Gómez, Carlos Sáez","doi":"10.3233/SHTI251493","DOIUrl":null,"url":null,"abstract":"<p><p>Out-of-distribution data , data coming from a different distribution with respect to the training data, entails a critical challenge for the robustness and safety of AI-based clinical decision support systems (CDSSs). This work aims to investigate whether real-time, sample-level quantification of epistemic uncertainty, the model's uncertainty due to limited knowledge of the true data-generating process, can act as a lightweight safety layer for health AI and CDSSs, targeting model updates and spotlighting human review. To this end, we trained and evaluated a continual learning-based neural network classifier on quarterly batches in a real-world Mexican COVID-19 dataset. For each training window, we estimated the distribution of the prediction epistemic uncertainties using Monte Carlo Dropout. We set a data-driven uncertainty threshold to determine potential out-of-distribution samples at 95% of that distribution. Results across all training-test time pairs show that samples below this threshold exhibit consistently higher macro-F1 and render performance virtually invariant to temporal drift, while the flagged samples captured most prediction errors. Since our method requires no model retraining, sample-level epistemic uncertainty screening offers a practical and efficient first line of defense for deploying health-AI systems in dynamic environments.</p>","PeriodicalId":94357,"journal":{"name":"Studies in health technology and informatics","volume":"332 ","pages":"47-51"},"PeriodicalIF":0.0000,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Quantifying Epistemic Uncertainty in Predictions for Safer Health AI Performance Under Dataset Shifts.\",\"authors\":\"David Fernández-Narro, Pablo Ferri, Juan Miguel García-Gómez, Carlos Sáez\",\"doi\":\"10.3233/SHTI251493\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Out-of-distribution data , data coming from a different distribution with respect to the training data, entails a critical challenge for the robustness and safety of AI-based clinical decision support systems (CDSSs). This work aims to investigate whether real-time, sample-level quantification of epistemic uncertainty, the model's uncertainty due to limited knowledge of the true data-generating process, can act as a lightweight safety layer for health AI and CDSSs, targeting model updates and spotlighting human review. To this end, we trained and evaluated a continual learning-based neural network classifier on quarterly batches in a real-world Mexican COVID-19 dataset. For each training window, we estimated the distribution of the prediction epistemic uncertainties using Monte Carlo Dropout. We set a data-driven uncertainty threshold to determine potential out-of-distribution samples at 95% of that distribution. Results across all training-test time pairs show that samples below this threshold exhibit consistently higher macro-F1 and render performance virtually invariant to temporal drift, while the flagged samples captured most prediction errors. 
Since our method requires no model retraining, sample-level epistemic uncertainty screening offers a practical and efficient first line of defense for deploying health-AI systems in dynamic environments.</p>\",\"PeriodicalId\":94357,\"journal\":{\"name\":\"Studies in health technology and informatics\",\"volume\":\"332 \",\"pages\":\"47-51\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-10-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Studies in health technology and informatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3233/SHTI251493\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Studies in health technology and informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3233/SHTI251493","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Quantifying Epistemic Uncertainty in Predictions for Safer Health AI Performance Under Dataset Shifts.
Out-of-distribution data, i.e., data drawn from a different distribution than the training data, poses a critical challenge to the robustness and safety of AI-based clinical decision support systems (CDSSs). This work investigates whether real-time, sample-level quantification of epistemic uncertainty, the model's uncertainty due to limited knowledge of the true data-generating process, can act as a lightweight safety layer for health AI and CDSSs, guiding model updates and flagging samples for human review. To this end, we trained and evaluated a continual learning-based neural network classifier on quarterly batches of a real-world Mexican COVID-19 dataset. For each training window, we estimated the distribution of prediction epistemic uncertainties using Monte Carlo Dropout and set a data-driven uncertainty threshold at the 95th percentile of that distribution to flag potential out-of-distribution samples. Results across all training-test time pairs show that samples below this threshold exhibit consistently higher macro-F1 and render performance virtually invariant to temporal drift, while the flagged samples capture most prediction errors. Since our method requires no model retraining, sample-level epistemic uncertainty screening offers a practical and efficient first line of defense for deploying health AI systems in dynamic environments.
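To make the screening procedure concrete, the sketch below (PyTorch) shows one way such a pipeline could look. It is an illustrative assumption, not the authors' implementation: the choice of mutual information (BALD) as the epistemic-uncertainty statistic, the number of stochastic passes, and helper names such as `mc_dropout_uncertainty` and `fit_threshold` are hypothetical, since the abstract only states that Monte Carlo Dropout and a 95th-percentile threshold were used.

```python
import torch
import torch.nn as nn


def mc_dropout_uncertainty(model: nn.Module, x: torch.Tensor, n_passes: int = 50):
    """Mean class probabilities and epistemic uncertainty (mutual information,
    a.k.a. BALD) from n_passes stochastic forward passes with dropout active."""
    model.train()  # keep dropout stochastic at inference (for models with batch
                   # norm, enable only the dropout modules instead)
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_passes)]
        )  # shape: (n_passes, batch, n_classes)
    mean_probs = probs.mean(dim=0)
    # Total predictive entropy of the averaged distribution
    total_entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    # Expected entropy of the individual passes (aleatoric component)
    expected_entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean(dim=0)
    epistemic = total_entropy - expected_entropy  # mutual information
    return mean_probs, epistemic


def fit_threshold(train_uncertainties: torch.Tensor, q: float = 0.95) -> float:
    """Data-driven threshold: the q-th quantile (here the 95th percentile) of
    the uncertainties observed on the current training window."""
    return torch.quantile(train_uncertainties, q).item()


def screen(epistemic: torch.Tensor, threshold: float) -> torch.Tensor:
    """True = potential out-of-distribution sample, to be flagged for review."""
    return epistemic > threshold
```

Under these assumptions, the threshold is re-estimated on each training window from the model's own uncertainty distribution, so no retraining is needed to apply the screen to incoming samples; predictions above the threshold are simply routed to human review rather than trusted automatically.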