Enhancing Preference-based Linear Bandits via Human Response Time

Shen Li, Yuyang Zhang, Zhaolin Ren, Claire Liang, Na Li, Julie A. Shah
Source: arXiv - ECON - Econometrics
Publication date: 2024-09-09
DOI: arxiv-2409.05798 (https://doi.org/arxiv-2409.05798)
Citations: 0

Abstract

Binary human choice feedback is widely used in interactive preference learning for its simplicity, but it provides limited information about preference strength. To overcome this limitation, we leverage human response times, which inversely correlate with preference strength, as complementary information. Our work integrates the EZ-diffusion model, which jointly models human choices and response times, into preference-based linear bandits. We introduce a computationally efficient utility estimator that reformulates the utility estimation problem using both choices and response times as a linear regression problem. Theoretical and empirical comparisons with traditional choice-only estimators reveal that for queries with strong preferences ("easy" queries), choices alone provide limited information, while response times offer valuable complementary information about preference strength. As a result, incorporating response times makes easy queries more useful. We demonstrate this advantage in the fixed-budget best-arm identification problem, with simulations based on three real-world datasets, consistently showing accelerated learning when response times are incorporated.
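
The abstract does not spell out the estimator, so the following is only a minimal sketch of how choices and response times could be combined into a linear regression. It assumes a symmetric drift-diffusion model with unit noise, barriers at +/-a, drift equal to the utility difference x^T theta, and no non-decision time. Under those assumptions the mean choice is tanh(a * drift) and the mean decision time is (a / drift) * tanh(a * drift), so their ratio is drift / a, which is linear in x. All function names and numeric values below are hypothetical, and exact expectations stand in for the sample averages one would use with real data.

```python
import math

def ddm_expectations(drift, barrier):
    """Closed-form mean choice (in {-1, +1}) and mean decision time for the
    assumed symmetric drift-diffusion model (unit noise, barriers +/-barrier)."""
    if drift == 0.0:
        return 0.0, barrier ** 2  # limit of the formulas below as drift -> 0
    mean_choice = math.tanh(barrier * drift)
    mean_time = (barrier / drift) * mean_choice
    return mean_choice, mean_time

def solve_linear_system(A, b):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def estimate_scaled_utility(queries, mean_choices, mean_times):
    """Least-squares fit of x^T w = mean_choice / mean_time over all queries.
    Under the assumed model, w estimates theta / barrier."""
    d = len(queries[0])
    targets = [c / t for c, t in zip(mean_choices, mean_times)]
    gram = [[sum(x[i] * x[j] for x in queries) for j in range(d)] for i in range(d)]
    rhs = [sum(x[i] * y for x, y in zip(queries, targets)) for i in range(d)]
    return solve_linear_system(gram, rhs)

# Hypothetical ground truth and queries (each query is a feature difference).
theta = [1.5, -0.5]
barrier = 1.2
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]

stats = [ddm_expectations(sum(xi * ti for xi, ti in zip(x, theta)), barrier)
         for x in queries]
mean_choices = [c for c, _ in stats]
mean_times = [t for _, t in stats]

theta_hat = estimate_scaled_utility(queries, mean_choices, mean_times)
print(theta_hat)  # approximately theta / barrier = [1.25, -0.4167]
```

In this sketch the last query has a large utility difference (an "easy" query), so its mean choice is almost exactly +1 and says little about how strong the preference is, while the choice-to-time ratio still scales linearly with the utility difference. The regression recovers theta only up to the barrier scale, which would not change the ranking of arms.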