Learning the Best Price and Ordering Policy under Fixed Costs and Ambiguous Demand
Jian Yang
ERN: Pricing (Topic), 2020-03-13. DOI: https://doi.org/10.2139/ssrn.3554042
We study joint inventory-price control in which a firm chooses among a finite number of prices to influence realized demand, and in which ordering incurs fixed setup costs. The firm seeks an optimal price and an optimal ordering policy under the long-run average criterion, but it is ambiguous about the stationary distribution of the random demand it faces under each price. We propose an adaptive policy in which periods are grouped into intervals, each associated with a single price and a single ordering policy. Pricing follows a learning-while-doing trade-off: when the smallest number of interval visits among all prices falls below a threshold tied to the total number of interval visits under all prices, a least-visited price is chosen; otherwise, the chosen price is one with the most promising profit prospect as estimated from past experience. Within each interval, ordering follows the (s,S) policy best suited to the empirical demand distribution learned from past experience under the chosen price. The power at which the policy’s regret grows in the horizon length T is below (3 + √29)/10 ≃ 0.839 even when demand patterns are fairly ambiguous. When demand realizations are further confined to a finite support, the bound can be reduced to √2/2 ≃ 0.707.
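The price-selection rule and the (s,S) ordering step described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm as specified: the abstract does not give the threshold's functional form, so the form `max(1, total_visits ** explore_exponent)` used here is an assumption, and the names `choose_price`, `order_quantity`, and `explore_exponent` are hypothetical. The computation of the profit estimates and of the (s,S) parameters from the empirical distribution is also left out.

```python
from typing import Dict


def choose_price(visits: Dict[float, int],
                 est_profit: Dict[float, float],
                 explore_exponent: float = 0.5) -> float:
    """Pick the price for the next interval.

    visits[p]: number of past intervals run at price p.
    est_profit[p]: estimated long-run average profit at price p.
    explore_exponent: hypothetical tuning knob; the threshold form
        below is an assumption, not taken from the paper.
    """
    total_visits = sum(visits.values())
    least_visited = min(visits, key=visits.get)
    # Assumed threshold: grows sublinearly in the total visit count,
    # and is at least 1 so that unvisited prices are tried first.
    threshold = max(1.0, total_visits ** explore_exponent)
    if visits[least_visited] < threshold:
        return least_visited  # learn: revisit an under-sampled price
    # do: exploit the price with the best estimated profit
    return max(est_profit, key=est_profit.get)


def order_quantity(inventory: float, s: float, S: float) -> float:
    """(s, S) rule: if inventory is below s, order up to S; else order nothing.

    Conventions differ on whether the boundary case inventory == s
    triggers an order; this sketch uses a strict inequality.
    """
    return S - inventory if inventory < s else 0.0
```

With visit counts {10.0: 3, 12.0: 1, 15.0: 4}, the assumed threshold is about 2.83, so the least-visited price 12.0 is selected for exploration. Once every price has been visited often enough relative to the total, the rule falls back to the price with the highest estimated profit.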