Nicolò Cesa-Bianchi, Roberto Colomboni, Maximilian Kasy
Adaptive Maximization of Social Welfare
Econometrica, Vol. 93, No. 3, pp. 1073-1104. Published 2025-06-10. DOI: 10.3982/ECTA22351
We consider the problem of repeatedly choosing policies to maximize social welfare. Welfare is a weighted sum of private utility and public revenue. Earlier outcomes inform later policies. Utility is not observed, but indirectly inferred. Response functions are learned through experimentation.
We derive a lower bound on regret, and a matching adversarial upper bound for a variant of the Exp3 algorithm. Cumulative regret grows at a rate of $T^{2/3}$. This implies that (i) welfare maximization is harder than the multiarmed bandit problem (with a rate of $T^{1/2}$ for finite policy sets), and (ii) our algorithm achieves the optimal rate. For the stochastic setting, if social welfare is concave, we can achieve a rate of $T^{1/2}$ (for continuous policy sets), using a dyadic search algorithm.
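For orientation, here is a minimal sketch of the textbook Exp3 algorithm for a finite set of arms, which is the multiarmed bandit baseline mentioned above. It is not the variant analyzed in the paper: the function names, the Bernoulli reward model, and the choice of the exploration parameter `gamma` are all assumptions made for illustration.

```python
import math
import random


def exp3(num_arms, horizon, reward_fn, gamma, rng):
    """Textbook Exp3 with uniform exploration; rewards must lie in [0, 1].

    reward_fn(t, arm) is a hypothetical callback returning the reward of the
    chosen arm only (bandit feedback). Returns total reward and final weights.
    """
    weights = [1.0] * num_arms
    total_reward = 0.0
    for t in range(horizon):
        weight_sum = sum(weights)
        # Mix the exponential-weights distribution with uniform exploration.
        probs = [(1 - gamma) * w / weight_sum + gamma / num_arms for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        total_reward += reward
        # Importance-weighted estimate: unbiased for the chosen arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the largest weight is 1; the probabilities are unchanged.
        max_weight = max(weights)
        weights = [w / max_weight for w in weights]
    return total_reward, weights


# Illustrative stochastic environment (an assumption, not from the paper):
# three arms paying Bernoulli rewards with the means below.
means = [0.2, 0.5, 0.8]
horizon = 5000
rng = random.Random(0)


def bernoulli_reward(t, arm):
    return 1.0 if rng.random() < means[arm] else 0.0


# A standard tuning of gamma for a known horizon.
num_arms = len(means)
gamma = min(1.0, math.sqrt(num_arms * math.log(num_arms) / ((math.e - 1) * horizon)))
total, weights = exp3(num_arms, horizon, bernoulli_reward, gamma, rng)
print(f"average reward: {total / horizon:.3f}")
```

The learner only ever sees the reward of the arm it played, so each observation is divided by the probability of playing that arm to keep the estimate unbiased. In the welfare setting described above, utility is not observed directly at all, which is one way to read why the achievable rate there is slower than for this baseline.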
We analyze an extension to nonlinear income taxation, and sketch an extension to commodity taxation. We compare our setting to monopoly pricing (which is easier), and price setting for bilateral trade (which is harder).