Towards an Analytical System for Supervising Fairness, Robustness, and Dataset Shifts in Health AI
Ángel Sánchez-García, David Fernández-Narro, Pablo Ferri, Juan M García-Gómez, Carlos Sáez
Studies in Health Technology and Informatics, vol. 332, pp. 247-251, 2 October 2025. DOI: 10.3233/SHTI251537
Abstract
Ensuring trustworthy use of Artificial Intelligence (AI)-based Clinical Decision Support Systems (CDSSs) requires continuous evaluation of their performance and fairness, given their potential impact on patient safety and individual rights as high-risk AI systems. However, the practical implementation of health AI performance and fairness monitoring dashboards presents several challenges. Confusion-matrix-derived performance and fairness metrics are non-additive and cannot be reliably aggregated or disaggregated across time or population subgroups. Furthermore, acquiring ground-truth labels or sensitive variable information, and controlling dataset shifts (changes in the statistical distributions of the data), may require additional interoperability with electronic health records. We present the design of ShinAI-Agent, a modular system that enables continuous, interpretable, and privacy-aware monitoring of health AI and CDSS performance and fairness. An exploratory dashboard combines time-series navigation across multiple performance and fairness metrics, model calibration and decision-cutoff exploration, and dataset shift monitoring. The system adopts a two-layer database: first, a proxy database that maps AI outcomes to essential case-level data such as ground-truth labels and sensitive variables; and second, an OLAP architecture with aggregable primitives, including case-based confusion matrices and binned probability distributions, enabling flexible computation of performance and fairness metrics across time or sensitive subgroups. The ShinAI-Agent approach supports compliance with the ethical and robustness requirements of the EU AI Act, provides advisory support for model retraining, and promotes the operationalisation of Trustworthy AI.
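
To illustrate the aggregable-primitive idea described in the abstract, the following minimal Python sketch (our own illustration, not the authors' implementation; all names, strata, and data are hypothetical) stores per-stratum confusion-matrix counts and binned predicted-probability histograms, merges them across time bins or sensitive subgroups, and only then derives metrics such as sensitivity, an equal-opportunity gap, and a simple population stability index (PSI) for dataset-shift monitoring. It shows why the non-additive metrics themselves cannot be averaged across strata, while the underlying counts can be.

"""Sketch of aggregable monitoring primitives (hypothetical example)."""
from dataclasses import dataclass
import numpy as np


@dataclass
class Primitives:
    """Additive primitives stored per (time bin, subgroup) stratum."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0
    score_hist: np.ndarray = None  # binned predicted-probability counts

    def __post_init__(self):
        if self.score_hist is None:
            self.score_hist = np.zeros(10, dtype=int)  # 10 probability bins

    def add_case(self, y_true: int, score: float, cutoff: float = 0.5) -> None:
        # Update counts from one case-level record (AI score plus ground truth).
        y_pred = int(score >= cutoff)
        self.tp += int(y_pred == 1 and y_true == 1)
        self.fp += int(y_pred == 1 and y_true == 0)
        self.tn += int(y_pred == 0 and y_true == 0)
        self.fn += int(y_pred == 0 and y_true == 1)
        self.score_hist[min(int(score * 10), 9)] += 1

    def __add__(self, other: "Primitives") -> "Primitives":
        # Counts are additive, so strata can be merged across time or subgroups.
        return Primitives(self.tp + other.tp, self.fp + other.fp,
                          self.tn + other.tn, self.fn + other.fn,
                          self.score_hist + other.score_hist)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn) if (self.tp + self.fn) else float("nan")


def psi(hist_a: np.ndarray, hist_b: np.ndarray, eps: float = 1e-6) -> float:
    # Population stability index between two binned score distributions.
    a = hist_a / hist_a.sum() + eps
    b = hist_b / hist_b.sum() + eps
    return float(np.sum((a - b) * np.log(a / b)))


# Hypothetical strata: (month, sex) -> primitives built from simulated case records.
rng = np.random.default_rng(0)
strata: dict[tuple[str, str], Primitives] = {}
for month in ("2025-01", "2025-02"):
    for sex in ("F", "M"):
        p = Primitives()
        for _ in range(200):
            y = int(rng.random() < 0.3)
            score = float(np.clip(rng.normal(0.7 if y else 0.3, 0.2), 0.0, 0.999))
            p.add_case(y, score)
        strata[(month, sex)] = p

# Metrics are computed on merged primitives, never by averaging per-stratum metrics.
overall = sum(strata.values(), Primitives())
by_sex = {s: sum((p for (m, g), p in strata.items() if g == s), Primitives())
          for s in ("F", "M")}
jan = sum((p for (m, g), p in strata.items() if m == "2025-01"), Primitives())
feb = sum((p for (m, g), p in strata.items() if m == "2025-02"), Primitives())

print("Overall sensitivity:", round(overall.sensitivity(), 3))
print("Equal-opportunity gap (F - M):",
      round(by_sex["F"].sensitivity() - by_sex["M"].sensitivity(), 3))
print("PSI, Jan vs Feb predicted scores:", round(psi(jan.score_hist, feb.score_hist), 4))

In an OLAP layout of the kind the abstract describes, such primitives would be precomputed per time bin and subgroup and rolled up on demand, so any decision cutoff, period, or sensitive-group slice can be evaluated without returning to the case-level proxy database.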