3 Years, 2 Papers, 1 Course Off: Optimal Non-Monetary Reward Policies
Wei Chen, Shivam Gupta, Milind Dawande, G. Janakiraman
Microeconomics: Asymmetric & Private Information eJournal, 2020-11-24
DOI: 10.2139/ssrn.3647569
Citations: 0
Abstract
We consider a principal who periodically offers a fixed, binary, and costly non-monetary reward to agents endowed with private information, to incentivize the agents to invest effort over the long run. An agent's output, as a function of his effort, is a priori uncertain and is worth a fixed per-unit value to the principal. The principal's goal is to design an attractive reward policy that specifies how the rewards are to be given to an agent over time, based on that agent's past performance. This problem, which we denote by P, is motivated by practical examples from both academia (a reduced teaching load for achieving a certain research-productivity threshold) and industry ("Supplier of the Year" awards in recognition of excellent past performance). The following "limited-term" reward policy structure has been quite popular in practice: the principal evaluates each agent periodically; if an agent's performance over a certain (limited) number of periods in the immediate past exceeds a pre-defined threshold, then the principal rewards him for a certain (limited) number of periods in the immediate future. For the deterministic special case of problem P, where there is no uncertainty in any agent's output given his effort, we show that there always exists an optimal policy that is a limited-term policy, and we obtain such a policy. When agents' outputs are stochastic, we show that the class of limited-term policies may not contain any optimal policy of problem P but is guaranteed to contain policies that are arbitrarily near-optimal: given any ε > 0, we show how to obtain a limited-term policy whose performance is within ε of that of an optimal policy. This guarantee depends crucially on the use of sufficiently long histories of the agents' outputs for the determination of the rewards.
In situations where access to this historical information is limited, we derive structural insights into how (i) the length of the available history and (ii) the variability in the random variable governing an agent's output affect the performance of this class of policies. Finally, we introduce and analyze the class of "score-based" reward policies: we show that this class is guaranteed to contain an optimal policy, and we obtain such a policy.
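The limited-term structure described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's construction: the parameter names (`k`, `m`, `threshold`) and the use of cumulative output over the evaluation window are assumptions made here for concreteness.

```python
from collections import deque

def limited_term_policy(outputs, threshold, k, m):
    """Illustrative limited-term reward schedule (hypothetical parameters).

    outputs:   one agent's per-period outputs, in order
    threshold: cumulative-output threshold over the evaluation window
    k:         number of immediate-past periods evaluated (limited lookback)
    m:         number of immediate-future periods rewarded once earned

    Returns a list of booleans: whether the agent holds the reward in
    each period.
    """
    rewards = []
    reward_until = -1           # last period index covered by an earned reward
    window = deque(maxlen=k)    # sliding window of the last k outputs
    for t, y in enumerate(outputs):
        # Reward status in period t reflects past evaluations only.
        rewards.append(t <= reward_until)
        window.append(y)
        # Evaluate once a full k-period history is available.
        if len(window) == k and sum(window) >= threshold:
            reward_until = max(reward_until, t + m)
    return rewards

# Example: threshold 2 over the last k=2 periods earns m=2 rewarded periods.
# The agent meets the threshold at t=1, so periods 2 and 3 are rewarded.
print(limited_term_policy([1, 1, 0, 0, 0], threshold=2, k=2, m=2))
```

The stochastic-output case in the abstract corresponds to the `outputs` sequence being random; the near-optimality guarantee then hinges on taking the lookback `k` sufficiently large.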