Towards an Analytical System for Supervising Fairness, Robustness, and Dataset Shifts in Health AI
Ángel Sánchez-García, David Fernández-Narro, Pablo Ferri, Juan M García-Gómez, Carlos Sáez
Studies in Health Technology and Informatics, vol. 332, pp. 247-251, 2 October 2025. DOI: 10.3233/SHTI251537
Abstract
Ensuring trustworthy use of Artificial Intelligence (AI)-based Clinical Decision Support Systems (CDSSs) requires continuous evaluation of their performance and fairness, given their potential impact on patient safety and individual rights as high-risk AI systems. However, the practical implementation of health AI performance and fairness monitoring dashboards presents several challenges. Confusion-matrix-derived performance and fairness metrics are non-additive and cannot be reliably aggregated or disaggregated across time or population subgroups. Furthermore, acquiring ground-truth labels or sensitive variable information, and controlling dataset shifts (changes in the statistical distributions of the data), may require additional interoperability with electronic health records. We present the design of ShinAI-Agent, a modular system that enables continuous, interpretable, and privacy-aware monitoring of health AI and CDSS performance and fairness. An exploratory dashboard combines time-series navigation across multiple performance and fairness metrics, model calibration and decision-cutoff exploration, and dataset shift monitoring. The system adopts a two-layer database: first, a proxy database that maps AI outcomes to essential case-level data such as ground-truth labels and sensitive variables; and second, an OLAP architecture with aggregable primitives, including case-based confusion matrices and binned probability distributions, enabling flexible computation of performance and fairness metrics across time or sensitive subgroups. The ShinAI-Agent approach supports compliance with the ethical and robustness requirements of the EU AI Act, provides advisory support for model retraining, and promotes the operationalisation of Trustworthy AI.
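
To illustrate the aggregable-primitive idea described in the abstract, the following minimal Python sketch (our own illustration, not the authors' implementation; all names, strata, and data are hypothetical) stores per-stratum confusion-matrix counts and binned predicted-probability histograms, merges them across time bins or sensitive subgroups, and only then derives metrics such as sensitivity, an equal-opportunity gap, and a simple population stability index (PSI) for dataset-shift monitoring. It shows why the non-additive metrics themselves cannot be averaged across strata, while the underlying counts can be.

"""Sketch of aggregable monitoring primitives (hypothetical example)."""
from dataclasses import dataclass
import numpy as np


@dataclass
class Primitives:
    """Additive primitives stored per (time bin, subgroup) stratum."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0
    score_hist: np.ndarray = None  # binned predicted-probability counts

    def __post_init__(self):
        if self.score_hist is None:
            self.score_hist = np.zeros(10, dtype=int)  # 10 probability bins

    def add_case(self, y_true: int, score: float, cutoff: float = 0.5) -> None:
        # Update counts from one case-level record (AI score plus ground truth).
        y_pred = int(score >= cutoff)
        self.tp += int(y_pred == 1 and y_true == 1)
        self.fp += int(y_pred == 1 and y_true == 0)
        self.tn += int(y_pred == 0 and y_true == 0)
        self.fn += int(y_pred == 0 and y_true == 1)
        self.score_hist[min(int(score * 10), 9)] += 1

    def __add__(self, other: "Primitives") -> "Primitives":
        # Counts are additive, so strata can be merged across time or subgroups.
        return Primitives(self.tp + other.tp, self.fp + other.fp,
                          self.tn + other.tn, self.fn + other.fn,
                          self.score_hist + other.score_hist)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn) if (self.tp + self.fn) else float("nan")


def psi(hist_a: np.ndarray, hist_b: np.ndarray, eps: float = 1e-6) -> float:
    # Population stability index between two binned score distributions.
    a = hist_a / hist_a.sum() + eps
    b = hist_b / hist_b.sum() + eps
    return float(np.sum((a - b) * np.log(a / b)))


# Hypothetical strata: (month, sex) -> primitives built from simulated case records.
rng = np.random.default_rng(0)
strata: dict[tuple[str, str], Primitives] = {}
for month in ("2025-01", "2025-02"):
    for sex in ("F", "M"):
        p = Primitives()
        for _ in range(200):
            y = int(rng.random() < 0.3)
            score = float(np.clip(rng.normal(0.7 if y else 0.3, 0.2), 0.0, 0.999))
            p.add_case(y, score)
        strata[(month, sex)] = p

# Metrics are computed on merged primitives, never by averaging per-stratum metrics.
overall = sum(strata.values(), Primitives())
by_sex = {s: sum((p for (m, g), p in strata.items() if g == s), Primitives())
          for s in ("F", "M")}
jan = sum((p for (m, g), p in strata.items() if m == "2025-01"), Primitives())
feb = sum((p for (m, g), p in strata.items() if m == "2025-02"), Primitives())

print("Overall sensitivity:", round(overall.sensitivity(), 3))
print("Equal-opportunity gap (F - M):",
      round(by_sex["F"].sensitivity() - by_sex["M"].sensitivity(), 3))
print("PSI, Jan vs Feb predicted scores:", round(psi(jan.score_hist, feb.score_hist), 4))

In an OLAP layout of the kind the abstract describes, such primitives would be precomputed per time bin and subgroup and rolled up on demand, so any decision cutoff, period, or sensitive-group slice can be evaluated without returning to the case-level proxy database.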