Investigating lay evaluations of models

IF 2.5 | CAS Zone 3 (Psychology) | JCR Q2, PSYCHOLOGY, EXPERIMENTAL
P. Kane, S. Broomell
{"title":"调查对模型的评价","authors":"P. Kane, S. Broomell","doi":"10.1080/13546783.2021.1999327","DOIUrl":null,"url":null,"abstract":"Abstract Many important decisions depend on unknown states of the world. Society is increasingly relying on statistical predictive models to make decisions in these cases. While predictive models are useful, previous research has documented that (a) individual decision makers distrust models and (b) people’s predictions are often worse than those of models. These findings indicate a lack of awareness of how to evaluate predictions generally. This includes concepts like the loss function used to aggregate errors or whether error is training error or generalisation error. To address this gap, we present three studies testing how lay people visually evaluate the predictive accuracy of models. We found that (a) participant judgements of prediction errors were more similar to absolute error than squared error (Study 1), (b) we did not detect a difference in participant reactions to training error versus generalisation error (Study 2), and (c) participants rated complex models as more accurate when comparing two models, but rated simple models as more accurate when shown single models in isolation (Study 3). When communicating about models, researchers should be aware that the public’s visual evaluation of models may disagree with their method of measuring errors and that many may fail to recognise overfitting.","PeriodicalId":47270,"journal":{"name":"Thinking & Reasoning","volume":"67 1","pages":"569 - 604"},"PeriodicalIF":2.5000,"publicationDate":"2021-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Investigating lay evaluations of models\",\"authors\":\"P. Kane, S. Broomell\",\"doi\":\"10.1080/13546783.2021.1999327\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Many important decisions depend on unknown states of the world. Society is increasingly relying on statistical predictive models to make decisions in these cases. While predictive models are useful, previous research has documented that (a) individual decision makers distrust models and (b) people’s predictions are often worse than those of models. These findings indicate a lack of awareness of how to evaluate predictions generally. This includes concepts like the loss function used to aggregate errors or whether error is training error or generalisation error. To address this gap, we present three studies testing how lay people visually evaluate the predictive accuracy of models. We found that (a) participant judgements of prediction errors were more similar to absolute error than squared error (Study 1), (b) we did not detect a difference in participant reactions to training error versus generalisation error (Study 2), and (c) participants rated complex models as more accurate when comparing two models, but rated simple models as more accurate when shown single models in isolation (Study 3). 
When communicating about models, researchers should be aware that the public’s visual evaluation of models may disagree with their method of measuring errors and that many may fail to recognise overfitting.\",\"PeriodicalId\":47270,\"journal\":{\"name\":\"Thinking & Reasoning\",\"volume\":\"67 1\",\"pages\":\"569 - 604\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2021-11-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Thinking & Reasoning\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://doi.org/10.1080/13546783.2021.1999327\",\"RegionNum\":3,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"PSYCHOLOGY, EXPERIMENTAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Thinking & Reasoning","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1080/13546783.2021.1999327","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0

Abstract

Many important decisions depend on unknown states of the world. Society is increasingly relying on statistical predictive models to make decisions in these cases. While predictive models are useful, previous research has documented that (a) individual decision makers distrust models and (b) people's predictions are often worse than those of models. These findings indicate a general lack of awareness of how to evaluate predictions. This includes concepts like the loss function used to aggregate errors, or whether an error is training error or generalisation error. To address this gap, we present three studies testing how lay people visually evaluate the predictive accuracy of models. We found that (a) participant judgements of prediction errors were more similar to absolute error than squared error (Study 1), (b) we did not detect a difference in participant reactions to training error versus generalisation error (Study 2), and (c) participants rated complex models as more accurate when comparing two models, but rated simple models as more accurate when shown single models in isolation (Study 3). When communicating about models, researchers should be aware that the public's visual evaluation of models may disagree with their method of measuring errors, and that many people may fail to recognise overfitting.
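As a point of reference for the two loss functions the abstract contrasts (standard definitions, not formulas taken from the paper): for predictions ŷ_i of observed values y_i, mean absolute error aggregates the raw sizes of the misses, while mean squared error squares them first, so large misses dominate the total:

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|, \qquad \mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2$$

A miss of 10 contributes 10 to MAE but 100 to MSE, so the two criteria can rank the same pair of models differently, which is why it matters that participants' visual judgements tracked absolute rather than squared error.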
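To make the training-versus-generalisation distinction and the overfitting pattern concrete, here is a minimal self-contained sketch (our illustration, not the authors' materials; the quadratic trend and sample sizes are assumptions): polynomials of increasing degree are fit to a small noisy sample, and error is measured both on the fitting data and on fresh data from the same process.

```python
# Minimal illustration (not the authors' code) of training vs.
# generalisation error: higher-degree polynomials fit the training
# points ever more closely but can predict held-out points worse.
import numpy as np

rng = np.random.default_rng(0)

def noisy_sample(n):
    """Noisy observations of a hypothetical quadratic trend (assumed for illustration)."""
    x = rng.uniform(-3, 3, n)
    y = 1.0 + 0.5 * x - 0.8 * x**2 + rng.normal(0.0, 1.0, n)
    return x, y

x_train, y_train = noisy_sample(20)   # data the models are fit to
x_test, y_test = noisy_sample(200)    # held-out data from the same process

for degree in (1, 2, 9):
    coefs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_err = np.mean(np.abs(np.polyval(coefs, x_train) - y_train))  # training MAE
    test_err = np.mean(np.abs(np.polyval(coefs, x_test) - y_test))     # generalisation MAE
    print(f"degree {degree}: training MAE {train_err:.2f}, generalisation MAE {test_err:.2f}")
```

The highest-degree fit typically shows the lowest training error but the highest generalisation error; recognising that pattern when shown a single complex model is exactly what the abstract suggests many lay viewers fail to do.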
Source journal
Thinking & Reasoning (PSYCHOLOGY, EXPERIMENTAL)
CiteScore: 6.50
Self-citation rate: 11.50%
Articles per year: 25