Dynamic pricing of differentiated products with incomplete information based on reinforcement learning

Impact factor 2.5, JCR Q2 (Engineering, Industrial)

IET Collaborative Intelligent Manufacturing. Pub Date: 2022-05-24. DOI: 10.1049/cim2.12050

Cheng Wang, Senbing Cui, Runhua Wu, Ziteng Wang

Volume 4, Issue 2, pp. 123-138. Article page: https://onlinelibrary.wiley.com/doi/10.1049/cim2.12050

Citations: 0

Abstract

With the rapid development of the social economy, consumer demand is becoming increasingly diverse. To meet market demand, enterprises tend to improve their competitiveness by offering differentiated products, and how to price such products has become a hot topic. Traditionally, customers' preferences are assumed to be independent and identically distributed. With a known distribution, companies can readily make pricing decisions for differentiated products. However, this assumption may not hold in practice, especially for products that are updated rapidly. This paper develops a dynamic pricing policy for differentiated products with incomplete information. An adaptive multi-armed bandit algorithm based on reinforcement learning is proposed to balance exploration and exploitation. Numerical examples show that the frequency of price adjustment significantly affects total profit: the more opportunities there are to adjust the price, the higher the total profit. Furthermore, experiments show that the proposed dynamic pricing policy outperforms other algorithms, such as Softmax and UCB1.
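To make the bandit framing concrete, here is a minimal sketch of UCB1, one of the baseline algorithms the abstract names, applied to a discrete price grid. It is not the paper's adaptive algorithm, which the abstract does not specify. The function name `ucb1_pricing`, the price grid, and the linear purchase-probability model are all hypothetical and chosen only for illustration.

```python
import math
import random


def ucb1_pricing(prices, demand_prob, rounds, seed=0):
    """Run UCB1 over a discrete price grid; return (total_revenue, pull_counts).

    prices: candidate prices (the bandit arms).
    demand_prob: hypothetical function price -> purchase probability. It is
        hidden from the learner and used only to simulate customers.
    """
    rng = random.Random(seed)
    max_price = max(prices)
    counts = [0] * len(prices)    # how often each price has been offered
    means = [0.0] * len(prices)   # running mean of normalised revenue per price
    total_revenue = 0.0

    for t in range(1, rounds + 1):
        if t <= len(prices):
            arm = t - 1           # initialisation: offer every price once
        else:
            # Exploitation term (mean) plus exploration bonus (confidence radius).
            arm = max(
                range(len(prices)),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        sold = rng.random() < demand_prob(prices[arm])
        revenue = prices[arm] if sold else 0.0
        total_revenue += revenue
        counts[arm] += 1
        # UCB1 assumes rewards in [0, 1], so revenue is scaled by the top price.
        means[arm] += (revenue / max_price - means[arm]) / counts[arm]

    return total_revenue, counts


if __name__ == "__main__":
    grid = [1.0, 3.0, 5.0, 7.0, 9.0]

    def linear_demand(p):
        # Assumed demand curve for the simulation, not taken from the paper.
        return max(0.0, 1.0 - p / 10.0)

    revenue, pulls = ucb1_pricing(grid, linear_demand, rounds=5000)
    print("total revenue:", revenue)
    print("offers per price:", pulls)
```

Each round corresponds to one opportunity to adjust the price. The confidence bonus shrinks as a price is offered more often, which is how UCB1 trades exploration against exploitation.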

Source journal

IET Collaborative Intelligent Manufacturing (Engineering: Industrial and Manufacturing Engineering)

CiteScore: 9.10

Self-citation rate: 2.40%

Articles published:

Review time: 20 weeks

About the journal: IET Collaborative Intelligent Manufacturing is a Gold Open Access journal that focuses on the development of efficient and adaptive production and distribution systems. It aims to meet the ever-changing market demands by publishing original research on methodologies and techniques for the application of intelligence, data science, and emerging information and communication technologies in various aspects of manufacturing, such as design, modeling, simulation, planning, and optimization of products, processes, production, and assembly. The journal is indexed in COMPENDEX (Elsevier), Directory of Open Access Journals (DOAJ), Emerging Sources Citation Index (Clarivate Analytics), INSPEC (IET), SCOPUS (Elsevier) and Web of Science (Clarivate Analytics).