{"title":"均场博弈和均场控制问题的统一连续时间 q-learning","authors":"Xiaoli Wei, Xiang Yu, Fengyi Yuan","doi":"arxiv-2407.04521","DOIUrl":null,"url":null,"abstract":"This paper studies the continuous-time q-learning in the mean-field\njump-diffusion models from the representative agent's perspective. To overcome\nthe challenge when the population distribution may not be directly observable,\nwe introduce the integrated q-function in decoupled form (decoupled\nIq-function) and establish its martingale characterization together with the\nvalue function, which provides a unified policy evaluation rule for both\nmean-field game (MFG) and mean-field control (MFC) problems. Moreover,\ndepending on the task to solve the MFG or MFC problem, we can employ the\ndecoupled Iq-function by different means to learn the mean-field equilibrium\npolicy or the mean-field optimal policy respectively. As a result, we devise a\nunified q-learning algorithm for both MFG and MFC problems by utilizing all\ntest policies stemming from the mean-field interactions. For several examples\nin the jump-diffusion setting, within and beyond the LQ framework, we can\nobtain the exact parameterization of the decoupled Iq-functions and the value\nfunctions, and illustrate our algorithm from the representative agent's\nperspective with satisfactory performance.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"37 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Unified continuous-time q-learning for mean-field game and mean-field control problems\",\"authors\":\"Xiaoli Wei, Xiang Yu, Fengyi Yuan\",\"doi\":\"arxiv-2407.04521\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper studies the continuous-time q-learning in the mean-field\\njump-diffusion models from the representative agent's perspective. To overcome\\nthe challenge when the population distribution may not be directly observable,\\nwe introduce the integrated q-function in decoupled form (decoupled\\nIq-function) and establish its martingale characterization together with the\\nvalue function, which provides a unified policy evaluation rule for both\\nmean-field game (MFG) and mean-field control (MFC) problems. Moreover,\\ndepending on the task to solve the MFG or MFC problem, we can employ the\\ndecoupled Iq-function by different means to learn the mean-field equilibrium\\npolicy or the mean-field optimal policy respectively. As a result, we devise a\\nunified q-learning algorithm for both MFG and MFC problems by utilizing all\\ntest policies stemming from the mean-field interactions. 
For several examples\\nin the jump-diffusion setting, within and beyond the LQ framework, we can\\nobtain the exact parameterization of the decoupled Iq-functions and the value\\nfunctions, and illustrate our algorithm from the representative agent's\\nperspective with satisfactory performance.\",\"PeriodicalId\":501294,\"journal\":{\"name\":\"arXiv - QuantFin - Computational Finance\",\"volume\":\"37 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - QuantFin - Computational Finance\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2407.04521\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuantFin - Computational Finance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2407.04521","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Unified continuous-time q-learning for mean-field game and mean-field control problems
Xiaoli Wei, Xiang Yu, Fengyi Yuan
arXiv:2407.04521 (arXiv - QuantFin - Computational Finance), published 2024-07-05
This paper studies continuous-time q-learning in mean-field jump-diffusion models from the representative agent's perspective. To overcome the challenge that the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (the decoupled Iq-function) and establish its martingale characterization together with the value function, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, depending on whether the task is to solve the MFG or the MFC problem, the decoupled Iq-function can be employed in different ways to learn the mean-field equilibrium policy or the mean-field optimal policy, respectively. As a result, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing all test policies stemming from the mean-field interactions. For several examples in the jump-diffusion setting, within and beyond the LQ framework, we obtain exact parameterizations of the decoupled Iq-functions and the value functions, and illustrate our algorithm from the representative agent's perspective with satisfactory performance.
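As a point of reference for the martingale characterization mentioned in the abstract, the following is a minimal sketch of the single-agent, diffusion-only policy evaluation rule in the spirit of Jia and Zhou's continuous-time q-learning, which the decoupled Iq-function extends to the mean-field jump-diffusion setting. The drift $b$, volatility $\sigma$, running reward $r$, terminal reward $h$, discount rate $\beta$, and entropy temperature $\gamma$ are illustrative notation not taken from the abstract; the paper's actual object additionally involves the population distribution and jump terms.

For a stochastic policy $\pi$ with exploratory value function $J^{\pi}$, define
\[
q^{\pi}(t,x,a) = \partial_t J^{\pi}(t,x) + H\big(t,x,a,\partial_x J^{\pi}(t,x),\partial_{xx} J^{\pi}(t,x)\big) - \beta J^{\pi}(t,x),
\]
where $H(t,x,a,p,P) = b(t,x,a)\cdot p + \tfrac{1}{2}\mathrm{tr}\big(\sigma\sigma^{\top}(t,x,a)\,P\big) + r(t,x,a)$. The pair $(J^{\pi}, q^{\pi})$ is then characterized by the terminal condition $J^{\pi}(T,\cdot) = h$, the consistency constraint
\[
\int_{\mathcal{A}} \big[q^{\pi}(t,x,a) - \gamma \log \pi(a\mid t,x)\big]\, \pi(a\mid t,x)\, da = 0,
\]
and the requirement that, along the state-action process $(X_s, a_s)$ generated by an arbitrary test policy,
\[
s \longmapsto e^{-\beta s} J^{\pi}(s,X_s) + \int_t^s e^{-\beta u}\big[r(u,X_u,a_u) - q^{\pi}(u,X_u,a_u)\big]\, du
\]
is a martingale. Testing this martingale property against all admissible test policies is what yields a unified evaluation rule; in the paper, the decoupled Iq-function is expected to play the role of $q^{\pi}$ with the population distribution appearing as an additional argument, so that the same idea can serve both the MFG and MFC formulations.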