Mixed strategy Nash equilibrium analysis in real-time pricing and demand response for future smart retail market

Ze Hu, Ziqing Zhu, Xiang Wei, Ka Wing Chan, Siqi Bu

Applied Energy, Volume 391, Article 125815. Published 2025-04-14. DOI: 10.1016/j.apenergy.2025.125815
Real-time pricing and demand response (RTP-DR) is a key problem for profit maximization and policy-making in the deregulated retail electricity market (REM). However, previous studies have overlooked the non-convexity and multiple equilibria caused by network constraints and by the temporally coupled, non-linear power consumption characteristics of end-users (EUs) in a privacy-protected environment. This paper employs the mixed strategy Nash equilibrium (MSNE) to analyze the multiple equilibria of the non-convex RTP-DR game, providing a comprehensive view of the potential transaction outcomes. A novel multi-agent Q-learning algorithm is developed to estimate the subgame perfect equilibrium (SPE) of the proposed game. As a multi-agent reinforcement learning (MARL) algorithm, it treats the players as rational "agents" that learn by trial and error to make optimal decisions across time periods. Moreover, the algorithm has a bi-level structure and represents Q-values as probability distributions that encode each agent's belief about the environment's response. Validated on a Northern Illinois utility dataset, the proposed approach shows notable advantages over benchmark algorithms: it yields more profitable pricing decisions for the monopoly retailer in the REM, with corresponding strategic outcomes for EUs. The numerical results also show that multiple optimal pricing decisions over a day coexist, yielding almost identical profits for the retailer while inducing different energy consumption patterns and significant differences in total energy usage on the demand side.
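The abstract does not reproduce the algorithm itself, but the mechanism it describes, players as learning agents whose mixed strategies emerge from trial-and-error value estimates in a bi-level retailer/end-user interaction, can be made concrete with a small sketch. The following is a minimal, hypothetical illustration and not the authors' implementation: two independent Q-learners (a retailer choosing a price, an end-user choosing a load level) each sample actions from a softmax distribution over their Q-values, so play is a mixed strategy. The payoff model, discretization, and all parameters are assumptions made for illustration only.

# Minimal illustrative sketch (hypothetical, not the paper's algorithm):
# independent multi-agent Q-learning in which each player samples actions
# from a softmax over its Q-values, i.e. plays a mixed strategy.
import numpy as np

rng = np.random.default_rng(0)

N_PRICES, N_LOADS = 5, 5        # discrete price / consumption levels (assumed)
ALPHA, TAU = 0.1, 0.5           # learning rate, softmax temperature (assumed)

q_retailer = np.zeros(N_PRICES)          # retailer's value of each price level
q_user = np.zeros((N_PRICES, N_LOADS))   # user's values, conditioned on price

def softmax(q, tau):
    # A probability distribution over actions: the agent's mixed strategy.
    z = np.exp((q - q.max()) / tau)
    return z / z.sum()

for episode in range(20_000):
    # Upper level: the retailer samples a price from its mixed strategy.
    p_idx = rng.choice(N_PRICES, p=softmax(q_retailer, TAU))
    price = 0.10 + 0.05 * p_idx              # $/kWh, hypothetical tariff grid

    # Lower level: the end-user observes the price and samples a load level.
    l_idx = rng.choice(N_LOADS, p=softmax(q_user[p_idx], TAU))
    load = 1.0 + l_idx                       # kWh, hypothetical

    # Hypothetical payoffs: retail margin vs. user utility net of cost.
    profit = (price - 0.08) * load           # 0.08 $/kWh assumed wholesale cost
    utility = 2.0 * np.log(1.0 + load) - price * load

    # Independent stateless Q-updates (bandit form, for brevity).
    q_retailer[p_idx] += ALPHA * (profit - q_retailer[p_idx])
    q_user[p_idx, l_idx] += ALPHA * (utility - q_user[p_idx, l_idx])

# Near-equal Q-values after learning would signal multiple equilibria:
# several price schedules yielding almost identical retailer profit.
print("Retailer mixed strategy over prices:", softmax(q_retailer, TAU).round(3))

The paper's algorithm differs in at least two respects: it estimates subgame perfect equilibria over a multi-period (daily) horizon rather than a stateless bandit, and it represents the Q-values themselves as probability distributions encoding beliefs about the environment's response. The sketch above omits both and serves only to make the mixed-strategy learning loop concrete.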
Journal introduction:
Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.