Reliability and Interpretability in Science and Deep Learning
Luigi Scorzato
Minds and Machines, published 2024-06-25 (Journal Article)
DOI: https://doi.org/10.1007/s11023-024-09682-0
Impact factor: 4.2 (JCR Q2, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
In recent years, the question of the reliability of Machine Learning (ML) methods has acquired significant importance, and the analysis of the associated uncertainties has motivated a growing amount of research. However, most of these studies have applied standard error analysis to ML models, and in particular to Deep Neural Network (DNN) models, which represent a rather significant departure from standard scientific modelling. It is therefore necessary to integrate the standard error analysis with a deeper epistemological analysis of the possible differences between DNN models and standard scientific modelling, and of the possible implications of these differences for the assessment of reliability. This article offers several contributions. First, it emphasises the ubiquitous role of model assumptions (both in ML and in traditional science) against the illusion of theory-free science. Secondly, model assumptions are analysed from the point of view of their (epistemic) complexity, which is shown to be language-independent. It is argued that the high epistemic complexity of DNN models hinders the estimation of their reliability and also their prospects of long-term progress. Some potential ways forward are suggested. Thirdly, this article identifies the close relation between a model’s epistemic complexity and its interpretability, as introduced in the context of responsible AI. This clarifies in which sense, and to what extent, the lack of understanding of a model (the black-box problem) impacts its interpretability in a way that is independent of individual skills. It also clarifies how interpretability is a precondition for a plausible assessment of the reliability of any model, which cannot be based on statistical analysis alone. This article focuses on the comparison between traditional scientific models and DNN models. However, Random Forest (RF) and Logistic Regression (LR) models are also briefly considered.
About the journal:
Minds and Machines, affiliated with the Society for Machines and Mentality, serves as a platform for fostering critical dialogue between the AI and philosophical communities. With a focus on problems of shared interest, the journal actively encourages discussions on the philosophical aspects of computer science.
Offering a global forum, Minds and Machines provides a space to debate and explore important and contentious issues within its editorial focus. The journal presents special editions dedicated to specific topics, invites critical responses to previously published works, and features review essays addressing current problem scenarios.
By facilitating a diverse range of perspectives, Minds and Machines encourages a reevaluation of the status quo and the development of new insights. Through this collaborative approach, the journal aims to bridge the gap between AI and philosophy, fostering a tradition of critique and ensuring these fields remain connected and relevant.