User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction

Pedram Daee, Tomi Peltola, Aki Vehtari, Samuel Kaski
Published in: 23rd International Conference on Intelligent User Interfaces
DOI: 10.1145/3172944.3172989 (https://doi.org/10.1145/3172944.3172989)
Publication date: 2017-10-13
Citations: 20

Abstract

In human-in-the-loop machine learning, the user provides information beyond that in the training data. Many algorithms and user interfaces have been designed to optimize and facilitate this human-machine interaction; however, fewer studies have addressed the potential defects these designs can cause. Effective interaction often requires exposing the user to the training data or its statistics. The design of the system is then critical, because such exposure can lead to double use of data and overfitting if the user reinforces noisy patterns in the data. We propose a user modelling methodology that corrects the problem by assuming simple rational behaviour. We show, in a user study with 48 participants, that the method improves predictive performance in a sparse linear regression sentiment analysis task in which graded user knowledge on feature relevance is elicited. We believe that the key idea of inferring user knowledge with probabilistic user models has general applicability in guarding against overfitting and improving interactive machine learning.
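The double use of data described above can be illustrated with a toy calculation. The sketch below is not the model from the paper: it assumes a single scalar coefficient, Gaussian beliefs expressed as a mean and a precision, and a user who rationally merges private knowledge with the data-based estimate shown on screen. All function names and numbers are hypothetical.

```python
def combine(mean_a, prec_a, mean_b, prec_b):
    """Multiply two Gaussian beliefs about the same scalar; return (mean, precision)."""
    prec = prec_a + prec_b
    return (prec_a * mean_a + prec_b * mean_b) / prec, prec


def divide_out(mean_total, prec_total, mean_part, prec_part):
    """Remove one Gaussian factor from a product (the inverse of combine)."""
    prec = prec_total - prec_part
    if prec <= 0:
        raise ValueError("feedback carries no information beyond the removed part")
    return (prec_total * mean_total - prec_part * mean_part) / prec, prec


# Hypothetical numbers: what the training data says about one coefficient,
# and what the user privately knows about it.
data_mean, data_prec = 2.0, 1.0
user_mean, user_prec = 0.0, 1.0

# Assumed user behaviour: the user sees the data-based estimate and reports
# a belief that already blends it with their own knowledge.
feedback_mean, feedback_prec = combine(user_mean, user_prec, data_mean, data_prec)

# Naive system: treats the feedback as independent evidence, so the data
# enters the posterior twice.
naive_mean, naive_prec = combine(data_mean, data_prec, feedback_mean, feedback_prec)

# User-model correction: infer the user's private knowledge by dividing the
# data out of the feedback, then combine it with the data exactly once.
inferred_mean, inferred_prec = divide_out(
    feedback_mean, feedback_prec, data_mean, data_prec
)
corrected_mean, corrected_prec = combine(
    data_mean, data_prec, inferred_mean, inferred_prec
)

print("naive:    ", naive_mean, naive_prec)
print("corrected:", corrected_mean, corrected_prec)
```

In this toy setting the naive posterior is pulled toward the data estimate and reports more precision than the available evidence supports, while the corrected posterior counts the data once. The graded feature-relevance feedback and sparse regression model used in the study are more involved than this scalar case.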