Inference of Utilities and Time Preference in Sequential Decision-Making

arXiv - QuantFin - Computational Finance | Pub Date: 2024-05-24 | DOI: arxiv-2405.15975

Haoyang Cao, Zhengqi Wu, Renyuan Xu

Abstract

This paper introduces a novel stochastic control framework to enhance the capabilities of automated investment managers, or robo-advisors, by accurately inferring clients' investment preferences from past activities. Our approach leverages a continuous-time model that incorporates utility functions and a generic discounting scheme of a time-varying rate, tailored to each client's risk tolerance, valuation of daily consumption, and significant life goals. We address the resulting time inconsistency issue through state augmentation and the establishment of the dynamic programming principle and the verification theorem. Additionally, we provide sufficient conditions for the identifiability of client investment preferences. To complement our theoretical developments, we propose a learning algorithm based on maximum likelihood estimation within a discrete-time Markov Decision Process framework, augmented with entropy regularization. We prove that the log-likelihood function is locally concave, facilitating the fast convergence of our proposed algorithm. Practical effectiveness and efficiency are showcased through two numerical examples, including Merton's problem and an investment problem with unhedgeable risks. Our proposed framework not only advances financial technology by improving personalized investment advice but also contributes broadly to other fields such as healthcare, economics, and artificial intelligence, where understanding individual preferences is crucial.
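The learning step described in the abstract, maximum likelihood estimation in an entropy-regularized discrete-time MDP, can be illustrated with a small sketch. This is not the authors' algorithm. It assumes a hypothetical tabular problem with a CRRA utility, a constant discount factor (the paper allows a time-varying rate), a random transition tensor, and a plain grid search in place of a gradient-based optimizer. For brevity, observed actions are drawn independently at each (time, state) pair rather than along simulated trajectories. All names (`crra_reward`, `soft_policy`, `fit_risk_aversion`) are illustrative.

```python
import numpy as np


def crra_reward(consumption, risk_aversion):
    """CRRA utility u(c) = c^(1-g) / (1-g), elementwise; requires g != 1."""
    return consumption ** (1.0 - risk_aversion) / (1.0 - risk_aversion)


def soft_policy(reward, P, horizon, discount, temp):
    """Finite-horizon backward induction with entropy regularization.

    reward: (S, A) array; P: (S, A, S) transition tensor.
    Returns log pi_t(a | s) as an array of shape (horizon, S, A).
    """
    S, A = reward.shape
    V = np.zeros(S)  # terminal value is zero (an assumption of this sketch)
    log_pi = np.empty((horizon, S, A))
    for t in reversed(range(horizon)):
        Q = reward + discount * (P @ V)  # soft Bellman backup, shape (S, A)
        m = Q.max(axis=1, keepdims=True)
        # V(s) = temp * log sum_a exp(Q(s, a) / temp), computed stably
        V = (m + temp * np.log(np.exp((Q - m) / temp).sum(axis=1, keepdims=True)))[:, 0]
        log_pi[t] = (Q - V[:, None]) / temp  # softmax (Boltzmann) policy
    return log_pi


def log_likelihood(log_pi, counts):
    """Log-likelihood of action counts, where counts[t, s, a] >= 0."""
    return float((counts * log_pi).sum())


def fit_risk_aversion(counts, consumption, P, discount, temp, grid):
    """Grid-search MLE over candidate risk-aversion values."""
    horizon = counts.shape[0]
    scores = [
        log_likelihood(
            soft_policy(crra_reward(consumption, g), P, horizon, discount, temp),
            counts,
        )
        for g in grid
    ]
    return grid[int(np.argmax(scores))], scores


# Hypothetical toy problem: 4 states, 3 actions, 5 decision times.
rng = np.random.default_rng(0)
S, A, T = 4, 3, 5
discount, temp, true_g = 0.95, 0.5, 2.0
consumption = rng.uniform(0.5, 2.0, size=(S, A))  # consumption per (state, action)
P = rng.dirichlet(np.ones(S), size=(S, A))        # shape (S, A, S)
grid = np.array([0.5, 1.5, 2.0, 3.0, 4.0])

# Simulate a client's choices: 200 observed actions at each (time, state) pair.
true_log_pi = soft_policy(crra_reward(consumption, true_g), P, T, discount, temp)
counts = np.zeros((T, S, A))
for t in range(T):
    for s in range(S):
        p = np.exp(true_log_pi[t, s])
        counts[t, s] = rng.multinomial(200, p / p.sum())

estimate, scores = fit_risk_aversion(counts, consumption, P, discount, temp, grid)
print("estimated risk aversion:", estimate)
```

The entropy term is what makes the policy a softmax over Q-values, so every observed action has positive probability and the likelihood is well defined. A grid search is used here only to keep the sketch short; the local concavity result mentioned in the abstract is what would justify a gradient-based method instead.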
