{"title":"Profit-Driven Experimental Design","authors":"Yuhao Wang, Weiming Zhu","doi":"10.2139/ssrn.3896229","DOIUrl":"https://doi.org/10.2139/ssrn.3896229","url":null,"abstract":"From intense competition to the recent pandemic, companies currently face considerable volatility in the business environment. For companies that design experiments to identify parameters of interest and make subsequent policy decisions based on these parameters, the cost of such experimentation has become increasingly comparable to the economic gains obtained, as the insights offered by an experiment can be short-lived due to changing market conditions. In this paper, we develop a general framework to quantify the total expected profit from both the experimental and postexperimental stages given an experimental strategy. The proposed framework is constructed using the asymptotic properties of the underlying parameter estimates as a channel to connect the profits from the two stages. Exploiting this framework, we calculate the difference in the total expected profits between any two experimental strategies, as well as the lower and upper bounds. Furthermore, we derive the actual and the bounds of the optimal sample size that maximizes the total expected profit. The profit and sample size bounds are independent of the ground-truth parameter value and can be calculated before conducting experiments to support experimental planning. In particular, our results demonstrate that when the postexperiment profit can be expressed as the sum of profits from N homogeneous units, the optimal sample size is on the order of O(sqrt{N}). Finally, we showcase how our framework can be applied to different business setups, such as the demand-learning newsvendor problem and the pricing problem.","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117320921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach","authors":"Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu","doi":"10.2139/ssrn.3900139","DOIUrl":"https://doi.org/10.2139/ssrn.3900139","url":null,"abstract":"One of the challenges for multi-agent reinforcement learning (MARL) is designing efficient learning algorithms for a large system in which each agent has only limited or partial information of the entire system. In this system, it is desirable to learn policies of a decentralized type. A recent and promising paradigm to analyze such decentralized MARL is to take network structures into consideration. While exciting progress has been made to analyze decentralized MARL with the network of agents, often found in social networks and team video games, little is known theoretically for decentralized MARL with the network of states, frequently used for modeling self-driving vehicles, ride-sharing, and data and traffic routing. \u0000This paper proposes a framework called localized training and decentralized execution to study MARL with network of states, with homogeneous (a.k.a. mean-field type) agents. Localized training means that agents only need to collect local information in their neighboring states during the training phase; decentralized execution implies that, after the training stage, agents can execute the learned decentralized policies, which only requires knowledge of the agents' current states. The key idea is to utilize the homogeneity of agents and regroup them according to their states, thus the formulation of a networked Markov decision process with teams of agents, enabling the update of the Q-function in a localized fashion. In order to design an efficient and scalable reinforcement learning algorithm under such a framework, we adopt the actor-critic approach with over-parameterized neural networks, and establish the convergence and sample complexity for our algorithm, shown to be scalable with respect to the size of both agents and states.","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133324699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Engineering Social Learning: Information Design of Time-Locked Sales Campaigns for Online Platforms","authors":"Can Küçükgül, Ö. Özer, Shouqiang Wang","doi":"10.2139/ssrn.3493744","DOIUrl":"https://doi.org/10.2139/ssrn.3493744","url":null,"abstract":"Many online platforms offer time-locked sales campaigns, whereby products are sold at fixed prices for prespecified lengths of time. Platforms often display some information about previous customers’ purchase decisions during campaigns. Using a dynamic Bayesian persuasion framework, we study how a revenue-maximizing platform should optimize its information policy for such a setting. We reformulate the platform’s problem equivalently by reducing the dimensionality of its message space and proprietary history. Specifically, three messages suffice: a neutral recommendation that induces a customer to make her purchase decision according to her private signal about the product and a positive (respectively (resp.), negative) recommendation that induces her to purchase (resp., not purchase) by ignoring her signal. The platform’s proprietary history can be represented by the net purchase position, a single-dimensional summary statistic that computes the cumulative difference between purchases and nonpurchases made by customers having received the neutral recommendation. Subsequently, we establish structural properties of the optimal policy and uncover the platform’s fundamental trade-off: long-term information (and revenue) generation versus short-term revenue extraction. Further, we propose and optimize over a class of heuristic policies. The optimal heuristic policy provides only neutral recommendations up to a cutoff customer and provides only positive or negative recommendations afterward, with the recommendation being positive if and only if the net purchase position after the cutoff customer exceeds a threshold. This policy is easy to implement and numerically shown to perform well. Finally, we demonstrate the generality of our methodology and the robustness of our findings by relaxing some informational assumptions. This paper was accepted by Gabriel Weintraub, revenue management and market analytics.","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128825295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analytical Solution to A Discrete-Time Model for Dynamic Learning and Decision-Making","authors":"Hao Zhang","doi":"10.2139/ssrn.3847049","DOIUrl":"https://doi.org/10.2139/ssrn.3847049","url":null,"abstract":"Problems concerning dynamic learning and decision making are difficult to solve analytically. We study an infinite-horizon discrete-time model with a constant unknown state that may take two possible values. As a special partially observable Markov decision process (POMDP), this model unifies several types of learning-and-doing problems such as sequential hypothesis testing, dynamic pricing with demand learning, and multiarmed bandits. We adopt a relatively new solution framework from the POMDP literature based on the backward construction of the efficient frontier(s) of continuation-value vectors. This framework accommodates different optimality criteria simultaneously. In the infinite-horizon setting, with the aid of a set of signal quality indices, the extreme points on the efficient frontier can be linked through a set of difference equations and solved analytically. The solution carries structural properties analogous to those obtained under continuous-time models, and it provides a useful tool for making new discoveries through discrete-time models. This paper was accepted by Baris Ata, stochastic models and simulation.","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116173717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Solution of Stochastic Time-Dependent First Order Delay Differential Equations Using Block Simpson’s Methods","authors":"B. Osu, C. Chibuisi, G. Egbe, V. C. Egenkonye","doi":"10.24247/IJMCARJUN20211","DOIUrl":"https://doi.org/10.24247/IJMCARJUN20211","url":null,"abstract":"discrete schemes was worked-out in block forms to solve some stochastic time-dependent first order delay differential equations. It was observed that the scheme for step number k = 4 performed better and faster in terms of accuracy than the schemes for step number k = 3 and 2 respectively after the comparisons with their exact solutions and other existing methods","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121100814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chain Error as a function of Seasonal Variation","authors":"Yrjö Vartia, Antti Suoperä, K. Nieminen, Hannele Markkanen","doi":"10.2139/ssrn.3800882","DOIUrl":"https://doi.org/10.2139/ssrn.3800882","url":null,"abstract":"In this study, we examine statistically the dependence between Seasonal Variation of consumed values and the ChainErrors of corresponding excellent indices in different subgroups Ak. <br><br>First, cyclic seasonal variation of values is calculated by simple regression analysis and the ChainError is calculated by the Multi Period Identity Test. Secondly, Quadratic Means QM of these two variables (or dimensions) are used in our analysis. Question is: Does the largeness of the seasonal components in the value series, as measured by its Quadratic Mean (QM) per month during the observation period, reflect itself in the largeness of ChainErrors (CE) derived by Multi Period Identity Test? <br><br>The Quadratic Means of cyclic seasonal variation of values and ChainError (difference between base and chain strategies) both show variation found in typical average months. The dependence between these two quadratic means is shown in the paper by simple regression analysis. We show that there is a very strong statistically significant dependency between Quadratic Means of Chain Errors and Quadratic Means of values in the seasonal index. Our main empirical findings are following: Do not use any construction strategy that is somehow connected with the chain strategy. <br><br>Our test data is a scanner data from one of Finnish retail trade chains including monthly information on unit prices, quantities and values from January 2014 to December 2018, and has more than 20 000 homogeneous commodities that are comparable in quality.","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134069239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Performance Measure for Machine Learning Classification","authors":"Mingxing Gong","doi":"10.5121/IJMIT.2021.13101","DOIUrl":"https://doi.org/10.5121/IJMIT.2021.13101","url":null,"abstract":"Machine learning models have been widely used in numerous classification problems and performance measures play a critical role in machine learning model development, selection, and evaluation. This paper covers a comprehensive overview of performance measures in machine learning classification. Besides, we proposed a framework to construct a novel evaluation metric that is based on the voting results of three performance measures, each of which has strengths and limitations. The new metric can be proved better than accuracy in terms of consistency and discriminancy.","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131848644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bounds and Heuristics for Multi-Product Personalized Pricing","authors":"G. Gallego, Gerardo Berbeglia","doi":"10.2139/ssrn.3778409","DOIUrl":"https://doi.org/10.2139/ssrn.3778409","url":null,"abstract":"We present tight bounds and heuristics for personalized, multi-product pricing problems. Under mild conditions we show that offering a non-personalize price in the direction of a positive vector (factor) has a tight profit guarantee relative to optimal personalized pricing. An optimal non-personalized price is the choice factor, when known. Using a factor vector with equal components results in uniform pricing and has exceedingly mild sufficient conditions for the bound to hold. A robust factor is presented that achieves the best possible performance guarantee. As an application, our model yields a tight lower-bound on the performance of linear pricing relative to personalized non-linear pricing, and suggests effective non-linear price heuristics relative to personalized solutions. Additionally, our model provides guarantees for simple strategies such as bundle-size pricing and component-pricing with respect to personalized mixed bundling policies. Heuristics to cluster customer types are also developed with the goal of improving performance by allowing each cluster to price along its own factor. Numerical results are presented for a variety of demand models, factors and clustering heuristics. In our experiments, economically motivated factors coupled with machine learning clustering heuristics performed best.","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131112413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Similarity of Plane Pulsed Magnetic Fields Continued From Different Coordinate Axes","authors":"V. M. Mikhailov","doi":"10.20998/2074-272x.2020.5.07","DOIUrl":"https://doi.org/10.20998/2074-272x.2020.5.07","url":null,"abstract":"Purpose. The purpose of this work is formulation of similarity conditions for plane magnetic fields at a sharp skin-effect continued in non-conducting and non-magnetic medium from different axes bounding plane surfaces of conductors. Methodology. Classic formulation of Cauchy problem for magnetic vector potential Laplace equations, mathematic physics methods and basics similarity theory are used. Two problems are considered: the problem of initial field continuation from one axis and the problem of similar field continuation form other axis on which magnetic flux density or electrical field strength in unknown. Results. Necessary and sufficient similarity conditions of plane pulsed or high-frequency magnetic fields continued from different axes of rectangular coordinates are formulated. For the given odd and even magnetic flux density distributions on axis of initial field corresponding the distributions on axis and solution of continued similar field problem are obtained. Originality. It is proved that for similarity of examined fields the proportion of corresponding vector field projections represented by dimensionless numbers in similar points of axes is necessary and sufficient.","PeriodicalId":376757,"journal":{"name":"Decision-Making in Operations Research eJournal","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133937542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}