User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction

23rd International Conference on Intelligent User Interfaces. Pub Date: 2017-10-13. DOI: 10.1145/3172944.3172989

Pedram Daee, Tomi Peltola, Aki Vehtari, Samuel Kaski

Citations: 20

Abstract

In human-in-the-loop machine learning, the user provides information beyond that in the training data. Many algorithms and user interfaces have been designed to optimize and facilitate this human-machine interaction; however, fewer studies have addressed the potential defects the designs can cause. Effective interaction often requires exposing the user to the training data or its statistics. The design of the system is then critical, as this can lead to double use of data and overfitting, if the user reinforces noisy patterns in the data. We propose a user modelling methodology, by assuming simple rational behaviour, to correct the problem. We show, in a user study with 48 participants, that the method improves predictive performance in a sparse linear regression sentiment analysis task, where graded user knowledge on feature relevance is elicited. We believe that the key idea of inferring user knowledge with probabilistic user models has general applicability in guarding against overfitting and improving interactive machine learning.
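The double use of data mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' user model or their experimental setup. It assumes a hypothetical user who is shown training-set correlations and flags the feature with the strongest apparent pattern. All features here are pure noise by construction, and the names `correlation` and `simulate` and all parameter values are illustrative assumptions.

```python
import random


def correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_train=20, n_test=2000, n_features=200, seed=0):
    """Return (training r, held-out r) for the feature a naive user flags."""
    rng = random.Random(seed)
    n = n_train + n_test
    # Assumption for illustration: every feature and the target are
    # independent Gaussian noise, so no feature is truly relevant.
    features = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                for _ in range(n_features)]
    target = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # The hypothetical user sees only training-set statistics.
    train_corrs = [correlation(f[:n_train], target[:n_train])
                   for f in features]
    # The user flags the feature with the strongest apparent pattern.
    flagged = max(range(n_features), key=lambda j: abs(train_corrs[j]))

    # The same feature is then checked on data the user never saw.
    test_corr = correlation(features[flagged][n_train:], target[n_train:])
    return train_corrs[flagged], test_corr


train_r, test_r = simulate()
print(f"flagged feature: train r = {train_r:.2f}, held-out r = {test_r:.2f}")
```

With few training samples and many candidate features, the flagged feature looks strongly related to the target on the training set but not on held-out data. If such feedback were treated as independent expert knowledge, the model would count the same noisy pattern twice, which is the failure the proposed user modelling aims to correct.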


