{"title":"Mechanism design for data science","authors":"Shuchi Chawla, Jason D. Hartline, Denis Nekipelov","doi":"10.1145/2600057.2602881","DOIUrl":null,"url":null,"abstract":"The promise of data science is that if data from a system can be recorded and understood then this understanding can potentially be utilized to improve the system. Behavioral and economic data, however, is different from scientific data in that it is subjective to the system. Behavior changes when the system changes, and to predict behavior for any given system change or to optimize over system changes, the behavioral model that generates the data must be inferred from the data. The ease with which this inference can be performed generally also depends on the system. Trivially, a system that ignores behavior does not admit any inference of a behavior generating model that can be used to predict behavior in a system that is responsive to behavior. To realize the promise of data science in economic systems, a theory for the design of such systems must also incorporate the desired inference properties. Consider as an example the revenue-maximizing auctioneer. If the auctioneer has knowledge of the distribution of bidder values then she can run the first-price auction with a reserve price that is tuned to the distribution. Under some mild distributional assumptions, with the appropriate reserve price the first-price auction is revenue optimal [Myerson 1981]. Notice that the historical bid data for the first-price auction with a reserve price will in most cases not have bids for bidders whose values are below the reserve. Therefore, there is no data analysis that the auctioneer can perform that will enable properties of the distribution of bidder values below the reserve price to be inferred. It could be, nonetheless, that over time the population of potential bidders evolves and the optimal reserve price lowers. This change could go completely unnoticed in the auctioneer's data. The two main tools for optimizing revenue in an auction are reserve prices (as above) and ironing. Both of these tools cause pooling behavior (i.e., bidders with distinct values take the same action) and economic inference cannot thereafter differentiate these pooled bidders. In order to maintain the distributional knowledge necessary to be able to run a good auction in the long term, the auctioneer must sacrifice the short-term revenue by running a non-revenue-optimal auction.","PeriodicalId":203155,"journal":{"name":"Proceedings of the fifteenth ACM conference on Economics and computation","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"48","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the fifteenth ACM conference on Economics and computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2600057.2602881","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 48
Abstract
The promise of data science is that if data from a system can be recorded and understood then this understanding can potentially be utilized to improve the system. Behavioral and economic data, however, is different from scientific data in that it is subjective to the system. Behavior changes when the system changes, and to predict behavior for any given system change or to optimize over system changes, the behavioral model that generates the data must be inferred from the data. The ease with which this inference can be performed generally also depends on the system. Trivially, a system that ignores behavior does not admit any inference of a behavior generating model that can be used to predict behavior in a system that is responsive to behavior. To realize the promise of data science in economic systems, a theory for the design of such systems must also incorporate the desired inference properties. Consider as an example the revenue-maximizing auctioneer. If the auctioneer has knowledge of the distribution of bidder values then she can run the first-price auction with a reserve price that is tuned to the distribution. Under some mild distributional assumptions, with the appropriate reserve price the first-price auction is revenue optimal [Myerson 1981]. Notice that the historical bid data for the first-price auction with a reserve price will in most cases not have bids for bidders whose values are below the reserve. Therefore, there is no data analysis that the auctioneer can perform that will enable properties of the distribution of bidder values below the reserve price to be inferred. It could be, nonetheless, that over time the population of potential bidders evolves and the optimal reserve price lowers. This change could go completely unnoticed in the auctioneer's data. The two main tools for optimizing revenue in an auction are reserve prices (as above) and ironing. Both of these tools cause pooling behavior (i.e., bidders with distinct values take the same action) and economic inference cannot thereafter differentiate these pooled bidders. In order to maintain the distributional knowledge necessary to be able to run a good auction in the long term, the auctioneer must sacrifice the short-term revenue by running a non-revenue-optimal auction.