Crowd Prediction Systems: Markets, Polls, and Elite Forecasters
P. Atanasov, Jens Witkowski, B. Mellers, P. Tetlock
Proceedings of the 23rd ACM Conference on Economics and Computation, published 2022-07-12
DOI: https://doi.org/10.1145/3490486.3538265
Citations: 3
Abstract
Crowd prediction systems, such as prediction markets, provide the infrastructure to elicit and combine the predictions from a group (“crowd”) of forecasters. In contrast to data-driven approaches, crowd predictions are especially useful in settings with little historical data, such as new product development, vaccine trials, pandemics, or geopolitical events. Our contributions in this area are threefold. First, we provide an experimental evaluation of two popular types of prediction market architectures: continuous double auction (CDA) markets and logarithmic market scoring rule (LMSR) markets. To the best of our knowledge, we are the first to study these methods in a large, randomized experiment. Prior research reporting on CDA and LMSR market performance did not compare the two designs directly but had separate sets of questions for each [2]. Using data from over 1300 forecasters and 147 forecasting questions, we find that the LMSR market achieves higher accuracy than the CDA market. The LMSR market achieves 14% lower Brier scores (M_CDA = 0.245, SD_CDA = 0.327 versus M_LMSR = 0.211, SD_LMSR = 0.280; t(146) = 2.28, p = 0.024). In exploratory analyses, we find that the better performance of the LMSR market appears particularly pronounced for questions that attracted few traders, as well as early in a question's lifetime, when only a few traders had placed orders. Relative to LMSR, the CDA market underperformed in thin-market settings, consistent with Robin Hanson’s motivation for the LMSR market mechanism. Second, we quantify the impact of prediction system architecture and individual forecaster track record on aggregate performance. Previous research studied how the performance of CDA prediction markets and prediction polls compares when populated by sub-elite forecasters [1], while most previous work on elite forecasters has only examined their individual performance [3].
We are the first to compare the aggregate performance of small, elite forecaster crowds across two prediction systems: LMSR prediction markets and team prediction polls. Moreover, we compare the aggregate accuracy of elite forecaster crowds to larger, sub-elite crowds using the same prediction
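
The accuracy comparison reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code. The function names are illustrative, and the Brier variant (squared errors summed over all answer options, so scores range from 0 to 2) is an assumption, since the abstract does not say which variant was used. The paired t-statistic is included because t(146) is consistent with a paired comparison across the 147 questions (147 - 1 = 146 degrees of freedom); that reading is also an assumption.

```python
import math
import statistics
from typing import Sequence


def brier_score(probs: Sequence[float], outcome_index: int) -> float:
    """Sum of squared errors over all answer options (0 is perfect, 2 is worst).

    ASSUMPTION: the abstract does not state which Brier variant was used.
    """
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(probs)
    )


def relative_reduction(baseline: float, candidate: float) -> float:
    """Fractional reduction of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Paired t-statistic for per-question score differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Mean Brier scores as reported in the abstract.
M_CDA, M_LMSR = 0.245, 0.211
print(f"{relative_reduction(M_CDA, M_LMSR):.1%}")  # prints 13.9%
```

The 13.9% reduction rounds to the 14% stated in the abstract. With per-question scores for each market, which the abstract does not provide, `paired_t_statistic(cda_scores, lmsr_scores)` would be the corresponding test statistic under the paired-design assumption.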