Metric Ensembles Aid in Explainability: A Case Study with Wikipedia Data

Grant Forbes, R. J. Crouser
{"title":"度量集成有助于可解释性:维基百科数据的案例研究","authors":"Grant Forbes, R. J. Crouser","doi":"10.3390/analytics2020017","DOIUrl":null,"url":null,"abstract":"In recent years, as machine learning models have become larger and more complex, it has become both more difficult and more important to be able to explain and interpret the results of those models, both to prevent model errors and to inspire confidence for end users of the model. As such, there has been a significant and growing interest in explainability in recent years as a highly desirable trait for a model to have. Similarly, there has been much recent attention on ensemble methods, which aim to aggregate results from multiple (often simple) models or metrics in order to outperform models that optimize for only a single metric. We argue that this latter issue can actually assist with the former: a model that optimizes for several metrics has some base level of explainability baked into the model, and this explainability can be leveraged not only for user confidence but to fine-tune the weights between the metrics themselves in an intuitive way. We demonstrate a case study of such a benefit, in which we obtain clear, explainable results based on an aggregate of five simple metrics of relevance, using Wikipedia data as a proxy for some large text-based recommendation problem. We demonstrate that not only can these metrics’ simplicity and multiplicity be leveraged for explainability, but in fact, that very explainability can lead to an intuitive fine-tuning process that improves the model itself.","PeriodicalId":93078,"journal":{"name":"Big data analytics","volume":"119 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Metric Ensembles Aid in Explainability: A Case Study with Wikipedia Data\",\"authors\":\"Grant Forbes, R. J. Crouser\",\"doi\":\"10.3390/analytics2020017\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, as machine learning models have become larger and more complex, it has become both more difficult and more important to be able to explain and interpret the results of those models, both to prevent model errors and to inspire confidence for end users of the model. As such, there has been a significant and growing interest in explainability in recent years as a highly desirable trait for a model to have. Similarly, there has been much recent attention on ensemble methods, which aim to aggregate results from multiple (often simple) models or metrics in order to outperform models that optimize for only a single metric. We argue that this latter issue can actually assist with the former: a model that optimizes for several metrics has some base level of explainability baked into the model, and this explainability can be leveraged not only for user confidence but to fine-tune the weights between the metrics themselves in an intuitive way. We demonstrate a case study of such a benefit, in which we obtain clear, explainable results based on an aggregate of five simple metrics of relevance, using Wikipedia data as a proxy for some large text-based recommendation problem. 
We demonstrate that not only can these metrics’ simplicity and multiplicity be leveraged for explainability, but in fact, that very explainability can lead to an intuitive fine-tuning process that improves the model itself.\",\"PeriodicalId\":93078,\"journal\":{\"name\":\"Big data analytics\",\"volume\":\"119 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-04-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Big data analytics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/analytics2020017\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big data analytics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/analytics2020017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

In recent years, as machine learning models have become larger and more complex, it has become both more difficult and more important to be able to explain and interpret the results of those models, both to prevent model errors and to inspire confidence for end users of the model. As such, there has been a significant and growing interest in explainability in recent years as a highly desirable trait for a model to have. Similarly, there has been much recent attention on ensemble methods, which aim to aggregate results from multiple (often simple) models or metrics in order to outperform models that optimize for only a single metric. We argue that this latter issue can actually assist with the former: a model that optimizes for several metrics has some base level of explainability baked into the model, and this explainability can be leveraged not only for user confidence but to fine-tune the weights between the metrics themselves in an intuitive way. We demonstrate a case study of such a benefit, in which we obtain clear, explainable results based on an aggregate of five simple metrics of relevance, using Wikipedia data as a proxy for some large text-based recommendation problem. We demonstrate that not only can these metrics’ simplicity and multiplicity be leveraged for explainability, but in fact, that very explainability can lead to an intuitive fine-tuning process that improves the model itself.
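The core mechanism described in the abstract, aggregating several simple relevance metrics with tunable weights while keeping each metric's contribution visible, can be illustrated with a minimal sketch. The metric functions, names, and weights below are illustrative assumptions for a text-relevance setting, not the five metrics or weighting scheme used in the paper.

```python
# Minimal sketch (assumed, not from the paper): a weighted ensemble of simple
# relevance metrics whose per-metric contributions stay inspectable, so the
# weights can be fine-tuned intuitively by looking at the breakdown.
from typing import Callable, Dict, Tuple


def term_overlap(query: str, doc: str) -> float:
    """Fraction of query terms that appear anywhere in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def title_match(query: str, doc: str) -> float:
    """Crude proxy metric: does the first line (title) contain a query term?"""
    title = doc.splitlines()[0].lower() if doc else ""
    return float(any(term in title for term in query.lower().split()))


def score_with_explanation(
    query: str,
    doc: str,
    metrics: Dict[str, Callable[[str, str], float]],
    weights: Dict[str, float],
) -> Tuple[float, Dict[str, float]]:
    """Return the aggregate relevance score plus each metric's weighted contribution."""
    contributions = {
        name: weights[name] * metric(query, doc) for name, metric in metrics.items()
    }
    return sum(contributions.values()), contributions


# Hypothetical metrics and weights; the per-metric breakdown is the "explanation".
metrics = {"term_overlap": term_overlap, "title_match": title_match}
weights = {"term_overlap": 0.7, "title_match": 0.3}

score, breakdown = score_with_explanation(
    "ensemble explainability",
    "Ensemble methods\nEnsemble methods combine several simple models or metrics.",
    metrics,
    weights,
)
print(score, breakdown)  # the breakdown shows which metric drove the ranking
```

Because each document's score decomposes into named, weighted terms, a user who disagrees with a ranking can see which metric was responsible and adjust that metric's weight directly, which is the intuitive fine-tuning loop the abstract describes.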