Interactive Prior Elicitation of Feature Similarities for Small Sample Size Prediction

Homayun Afrabandpey, Tomi Peltola, Samuel Kaski
Published in: Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization
DOI: 10.1145/3079628.3079698 (https://doi.org/10.1145/3079628.3079698)
Publication date (as listed): 2016-12-08
Citations: 11

Abstract

Regression under the "small n, large p" condition, in which the learning data set has a small sample size n and a large number of features p, is a recurring setting in which learning from data is difficult. With prior knowledge about relationships among the features, p can effectively be reduced, but making such prior knowledge explicit is difficult for experts. In this paper we introduce a new method for eliciting expert prior knowledge about the similarity of the roles of features in the prediction task. The key idea is to use an interactive multidimensional-scaling (MDS) type scatterplot display of the features to elicit the similarity relationships, and then to use the elicited relationships in the prior distribution of the prediction parameters. Specifically, for learning to predict a target variable with Bayesian linear regression, the feature relationships are used to construct a Gaussian prior with a full covariance matrix for the regression coefficients. Evaluation of our method in experiments with simulated and real users on text data confirms that prior elicitation of feature similarities improves prediction accuracy. Furthermore, elicitation with an interactive scatterplot display outperforms straightforward elicitation in which users choose feature pairs from a feature list.
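The construction described in the abstract, a Gaussian prior with a full covariance matrix built from feature similarities and used in Bayesian linear regression, can be illustrated with a minimal sketch. This is not the authors' implementation. The RBF mapping from scatterplot distances to covariances, the function names `similarity_prior_covariance` and `posterior_mean`, and the parameters `length_scale`, `tau2`, `sigma2`, and `jitter` are all assumptions made for illustration; the abstract does not specify how similarities are turned into a covariance matrix.

```python
import numpy as np


def similarity_prior_covariance(positions, length_scale=1.0, tau2=1.0, jitter=1e-8):
    """Map 2-D scatterplot positions of p features to a p x p prior covariance.

    Assumption (not from the paper): features placed close together on the
    display get strongly correlated coefficients via an RBF kernel.
    """
    positions = np.asarray(positions, dtype=float)
    # Pairwise squared Euclidean distances between feature positions.
    diff = positions[:, None, :] - positions[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * length_scale ** 2))
    # A small diagonal jitter keeps the matrix numerically positive definite.
    return tau2 * (kernel + jitter * np.eye(len(positions)))


def posterior_mean(X, y, prior_cov, sigma2=1.0):
    """Posterior mean of w for y = X w + noise, with w ~ N(0, prior_cov).

    Noise is assumed Gaussian with variance sigma2. The n x n form below
    avoids inverting the p x p prior covariance, which suits n << p.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    gram = X @ prior_cov @ X.T + sigma2 * np.eye(n)
    return prior_cov @ X.T @ np.linalg.solve(gram, y)
```

With `positions` holding one row of display coordinates per feature, `posterior_mean(X, y, similarity_prior_covariance(positions))` returns one coefficient per feature. Because the prior couples nearby features, an observation that is informative about one feature also shifts the coefficients of the features the user placed next to it, which is how the elicited similarities can compensate for a small n.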