{"title":"《反应:评吕斯蒂克和泰特洛克》(2021)","authors":"Jan Kwakkel, Willem Auping","doi":"10.1002/ffo2.84","DOIUrl":null,"url":null,"abstract":"<p>Lustick and Tetlock (<span>2021</span>) argue for the use of theory-guided simulation for aiding geopolitical planning and decision-making. This is a very welcome contribution, as in our experience, geopolitical analyses are often qualitative in nature and prone to group think. The complexities of geopolitical issues, however, really call for the use of theory-guided simulations. These issues are too complex for mental simulation (Atkins et al., <span>2002</span>; Sterman, <span>1989</span>, <span>1994</span>). Dynamic simulations, if properly grounded in appropriate theories and well-motivated assumptions, can derive the possible dynamics from interacting nonlinear processes, and thus aid human reasoning about system behavior (Sterman, <span>2002</span>). Moreover, since geopolitical issues are subject to uncertainty, scarce data, and conflicting information, using an ensemble modeling approach is appropriate. An ensemble of simulations enables reasoning across alternative assumptions consistent with the available data and information. Such an ensemble can capture much more of the available theories, information, and educated guesses than any single model in isolation (Bankes, <span>2002</span>). With the rising computational power, ensemble modeling is increasingly a feasible research strategy.</p><p>Despite our broad agreement with Lustick and Tetlock (<span>2021</span>), we have three major comments on their work. First, from the broader perspective of modeling and simulation, they offer little that is truly novel or surprising. The envisioned approach of ensemble simulations is already well established under the label of exploratory modeling. By not engaging with this literature, the authors have deprived themselves from a rich set of analytical techniques that could have substantially strengthened their case study, as well as relevant theories and concepts which would have strengthened the appeal of the manifesto. Second, we content that validating simulation models of complex systems with partially open system boundaries should focus of perceived usefulness, not supposedly predictive accuracy captured through brier scores. Third, in interpreting results from simulation models it is possible and useful to try and increase understanding between the system's structural characteristics and theories, instead of using simulations as point predictions.</p><p>There is ample literature that has emerged over the last 30 years on the use of computational experimentation with simulation models to aid planning and decision-making. Hodges (<span>1991</span>) identified six things that could be done with simulation models in the absence of good data. Hodges and Dewar (<span>1992</span>) identified a seventh use case. Bankes (<span>1993</span>) moved away from enumerating the number of use cases, and simply spoke of exploratory modeling. Since these formative ideas from the early 1990s, a large body of literature on exploratory modeling has emerged (see, e.g., Kwakkel, <span>2017</span>; Moallemi et al., <span>2020</span> for detailed overviews). Perhaps most importantly, exploratory modeling has become one of the cornerstones of the field of decision-making under deep uncertainty (Marchau et al., <span>2019</span>).</p><p>It is quite unfortunately that the authors seem to be utterly unaware, with the exception of Davis et al. 
(<span>2007</span>), of this rich existing body of literature on exploratory modeling. The reported case study could have benefitted from various techniques for model analysis like scenario discovery (Bryant & Lempert, <span>2010</span>; Kwakkel & Jaxa-Rozen, <span>2016</span>) and global sensitivity analysis (Razavi et al., <span>2021</span>). In particular, dynamic scenario discovery (Kwakkel et al., <span>2013</span>; Steinmann et al., <span>2020</span>) could have been useful in identifying characteristics dynamics over time, as well as their conditions for occurring. Dynamics scenario discovery and global sensitivity analysis could substantially enhance the mapping of the state space of possibilities by identifying the conditions under which the different possibilities would occur. So, in their Bangladeshi case study, these techniques could have helped in answering questions about the representativeness of the three scenarios for the entire ensemble, and the regions of the model input space from which these three scenarios originate. This, in turn, would enable grounding the resulting narratives much more strongly in the underlying model (Greeven et al., <span>2016</span>).</p><p>Lustick and Tetlock (<span>2021</span>) strongly emphasize the importance of theoretically grounding the simulation models. Unfortunately, for many social phenomena, we have a variety of possible theoretical explanations. Rather than narrowing down on a preferred theory, exploratory modeling is increasingly being used for opening up the conversation by exploring over a range of alternative theories. For example, Pruyt and Kwakkel (<span>2014</span>) explore the rise of home-grown terrorism following two rival theories and use this to identify policy interventions that work in either case. Both Mitchell (<span>2003</span>) and Page (<span>2018</span>) argue more generally for using a plurality of structurally different simulation models because such an ensemble sheds a rich light on the phenomenon under study, which is particularly appropriate in case of complex systems.</p><p>Next to the potential of various techniques that have emerged over the last two decades for exploratory modeling, there have also been ample relevant theoretical developments that are directly relevant to the manifesto. Most importantly, what exactly is the status of individual computational experiments, and what is the status of the ensemble? A key question in this context is whether one can take the frequencies of occurrence of different classes of model results within an ensemble, and directly translate these frequencies into probabilities that hold in the real world. For obvious reasons, you cannot. The frequencies found in model land are critically dependent on the way in which one samples the model input space. Are the various dimensions that characterize the model input space taken as independent, or are correlations assumed? What prior is assumed for each of the dimensions? Modeling choices related to these questions strongly alter the observed frequencies (Quinn et al., <span>2020</span>). So, instead of treating frequencies of possibilities in model land as probabilities that also hold in the real world, Shortridge and Zaitchik (<span>2018</span>) and Taner et al. (<span>2019</span>) explore ways of using scenario discovery results as input to an <i>a posteriori</i> Bayesian probabilistic assessment. 
This might offer an alternative way of synthesizing the inside and outside view as discussed by Lustick and Tetlock (<span>2021</span>)</p><p>The authors use phrases like validation and prediction. These phrases are deeply problematic and should be used with care. Hodges and Dewar (<span>1992</span>), adopting a very narrow understanding of what proper validation is, argue that most simulation models used for policymaking cannot be validated. A similar, more modest, position is adopted by Oreskes et al. (<span>1994</span>), according to whom validation is only possible for closed systems. However, state stability and basically all other geopolitical issues, do not take place within closed systems. Historical replication, and thus also trying to establish predictive accuracy, is problematic as a source of evidence for validation because of equifinality. Building on this, Oreskes (<span>1998</span>) argues that one cannot proof the predictive reliability of models of complex systems prior to their actual use. Like with natural systems, in case of geopolitical systems one is confronted with limits to measurability, accessibility and a lack of spatiotemporal stationarity. The yardstick for model quality is thus not to be found in its purported predictive quality (e.g., brier scores), but rather in its perceived usefulness for guiding planning and decision-making.</p><p>Ensembles of simulation experiments can meaningfully inform decision-making beyond merely the summary frequencies of types of outcomes. For example, the absence of a particular type of outcome in an ensemble can be highly decision relevant. An example of this is offered by Auping et al. (<span>2014</span>) who analyzed the geopolitical implications of the US’ shale revolution. In short, they first generated an ensemble of energy price dynamics at a global scale. A set of exemplars from this ensemble was subsequently used as input to a state stability model for stress testing a wide variety of rentier states to identify which of those were most vulnerable to price swings. The global energy model, across the ensemble showed low oil prices for the decade of 2010–2020. Under no condition explored by the model would there be high prices. This ran counter to conventional wisdom at, for example, NATO, where the results were presented. However, because of the underlying theory-informed causal model, there was a very good explanation for the absence of such high price simulation runs. This example highlights the value of analyzing the ensemble in more detail, and carefully explaining how the different types of dynamics arise out of the structural assumptions within the simulation models.</p><p>We are in broad agreement with the plea of Lustick and Tetlock (<span>2021</span>) for the use of theory-guided simulations in aiding decision-making on complex issues, and content that this appeal has a reach well beyond geopolitical questions. Saltelli et al. (<span>2019</span>) lament the fact that simulation and modeling is not its own field (c.f., Padilla et al., <span>2017</span>), yet is being practiced across many fields, because it hampers the development of shared best practices. The simulation manifesto of Lustick and Tetlock (<span>2021</span>) unfortunately exemplifies this regrettable state of affairs because it seems that they are trying to reinvent the wheel. 
We have highlighted here primarily the rich body of literature on exploratory modeling which could have meaningfully informed and strengthened the manifesto.</p><p>Other examples where Lustick and Tetlock (<span>2021</span>) bypass much relevant literature could easily have been given as well. For example, both the shale gas study and the homegrown terrorism study relied on system dynamics models, rather than agent-based models, as advocated by Lustick and Tetlock (<span>2021</span>). Rather than fruitlessly debating which is the better approach, we content that both can be useful for representing key causal mechanisms, and both produce emergent aggregate dynamics. The primary difference is the level at which causal mechanisms are described. System dynamics models use a lumped, mesoscopic representation where the aggregate dynamics are an emergent property of interacting feedback loops involving accumulation and delay. If there is no obvious way of lumping, or compartmentalizing, your agents, a microscopic representation where the aggregate dynamics arise out of local interactions among heterogeneous agents as used in agent-based models is more suitable (c.f., Rahmandad & Sterman, <span>2008</span>). Similar remarks could be made with respect to federating models, which under labels such as multimodeling, multimodel ecologies, and co-simulation is being practices in various scientific fields (Nikolic et al., <span>2019</span>).</p>","PeriodicalId":100567,"journal":{"name":"FUTURES & FORESIGHT SCIENCE","volume":"3 2","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/ffo2.84","citationCount":"0","resultStr":"{\"title\":\"Reaction: A commentary on Lustick and Tetlock (2021)\",\"authors\":\"Jan Kwakkel, Willem Auping\",\"doi\":\"10.1002/ffo2.84\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Lustick and Tetlock (<span>2021</span>) argue for the use of theory-guided simulation for aiding geopolitical planning and decision-making. This is a very welcome contribution, as in our experience, geopolitical analyses are often qualitative in nature and prone to group think. The complexities of geopolitical issues, however, really call for the use of theory-guided simulations. These issues are too complex for mental simulation (Atkins et al., <span>2002</span>; Sterman, <span>1989</span>, <span>1994</span>). Dynamic simulations, if properly grounded in appropriate theories and well-motivated assumptions, can derive the possible dynamics from interacting nonlinear processes, and thus aid human reasoning about system behavior (Sterman, <span>2002</span>). Moreover, since geopolitical issues are subject to uncertainty, scarce data, and conflicting information, using an ensemble modeling approach is appropriate. An ensemble of simulations enables reasoning across alternative assumptions consistent with the available data and information. Such an ensemble can capture much more of the available theories, information, and educated guesses than any single model in isolation (Bankes, <span>2002</span>). With the rising computational power, ensemble modeling is increasingly a feasible research strategy.</p><p>Despite our broad agreement with Lustick and Tetlock (<span>2021</span>), we have three major comments on their work. First, from the broader perspective of modeling and simulation, they offer little that is truly novel or surprising. 
The envisioned approach of ensemble simulations is already well established under the label of exploratory modeling. By not engaging with this literature, the authors have deprived themselves from a rich set of analytical techniques that could have substantially strengthened their case study, as well as relevant theories and concepts which would have strengthened the appeal of the manifesto. Second, we content that validating simulation models of complex systems with partially open system boundaries should focus of perceived usefulness, not supposedly predictive accuracy captured through brier scores. Third, in interpreting results from simulation models it is possible and useful to try and increase understanding between the system's structural characteristics and theories, instead of using simulations as point predictions.</p><p>There is ample literature that has emerged over the last 30 years on the use of computational experimentation with simulation models to aid planning and decision-making. Hodges (<span>1991</span>) identified six things that could be done with simulation models in the absence of good data. Hodges and Dewar (<span>1992</span>) identified a seventh use case. Bankes (<span>1993</span>) moved away from enumerating the number of use cases, and simply spoke of exploratory modeling. Since these formative ideas from the early 1990s, a large body of literature on exploratory modeling has emerged (see, e.g., Kwakkel, <span>2017</span>; Moallemi et al., <span>2020</span> for detailed overviews). Perhaps most importantly, exploratory modeling has become one of the cornerstones of the field of decision-making under deep uncertainty (Marchau et al., <span>2019</span>).</p><p>It is quite unfortunately that the authors seem to be utterly unaware, with the exception of Davis et al. (<span>2007</span>), of this rich existing body of literature on exploratory modeling. The reported case study could have benefitted from various techniques for model analysis like scenario discovery (Bryant & Lempert, <span>2010</span>; Kwakkel & Jaxa-Rozen, <span>2016</span>) and global sensitivity analysis (Razavi et al., <span>2021</span>). In particular, dynamic scenario discovery (Kwakkel et al., <span>2013</span>; Steinmann et al., <span>2020</span>) could have been useful in identifying characteristics dynamics over time, as well as their conditions for occurring. Dynamics scenario discovery and global sensitivity analysis could substantially enhance the mapping of the state space of possibilities by identifying the conditions under which the different possibilities would occur. So, in their Bangladeshi case study, these techniques could have helped in answering questions about the representativeness of the three scenarios for the entire ensemble, and the regions of the model input space from which these three scenarios originate. This, in turn, would enable grounding the resulting narratives much more strongly in the underlying model (Greeven et al., <span>2016</span>).</p><p>Lustick and Tetlock (<span>2021</span>) strongly emphasize the importance of theoretically grounding the simulation models. Unfortunately, for many social phenomena, we have a variety of possible theoretical explanations. Rather than narrowing down on a preferred theory, exploratory modeling is increasingly being used for opening up the conversation by exploring over a range of alternative theories. 
For example, Pruyt and Kwakkel (<span>2014</span>) explore the rise of home-grown terrorism following two rival theories and use this to identify policy interventions that work in either case. Both Mitchell (<span>2003</span>) and Page (<span>2018</span>) argue more generally for using a plurality of structurally different simulation models because such an ensemble sheds a rich light on the phenomenon under study, which is particularly appropriate in case of complex systems.</p><p>Next to the potential of various techniques that have emerged over the last two decades for exploratory modeling, there have also been ample relevant theoretical developments that are directly relevant to the manifesto. Most importantly, what exactly is the status of individual computational experiments, and what is the status of the ensemble? A key question in this context is whether one can take the frequencies of occurrence of different classes of model results within an ensemble, and directly translate these frequencies into probabilities that hold in the real world. For obvious reasons, you cannot. The frequencies found in model land are critically dependent on the way in which one samples the model input space. Are the various dimensions that characterize the model input space taken as independent, or are correlations assumed? What prior is assumed for each of the dimensions? Modeling choices related to these questions strongly alter the observed frequencies (Quinn et al., <span>2020</span>). So, instead of treating frequencies of possibilities in model land as probabilities that also hold in the real world, Shortridge and Zaitchik (<span>2018</span>) and Taner et al. (<span>2019</span>) explore ways of using scenario discovery results as input to an <i>a posteriori</i> Bayesian probabilistic assessment. This might offer an alternative way of synthesizing the inside and outside view as discussed by Lustick and Tetlock (<span>2021</span>)</p><p>The authors use phrases like validation and prediction. These phrases are deeply problematic and should be used with care. Hodges and Dewar (<span>1992</span>), adopting a very narrow understanding of what proper validation is, argue that most simulation models used for policymaking cannot be validated. A similar, more modest, position is adopted by Oreskes et al. (<span>1994</span>), according to whom validation is only possible for closed systems. However, state stability and basically all other geopolitical issues, do not take place within closed systems. Historical replication, and thus also trying to establish predictive accuracy, is problematic as a source of evidence for validation because of equifinality. Building on this, Oreskes (<span>1998</span>) argues that one cannot proof the predictive reliability of models of complex systems prior to their actual use. Like with natural systems, in case of geopolitical systems one is confronted with limits to measurability, accessibility and a lack of spatiotemporal stationarity. The yardstick for model quality is thus not to be found in its purported predictive quality (e.g., brier scores), but rather in its perceived usefulness for guiding planning and decision-making.</p><p>Ensembles of simulation experiments can meaningfully inform decision-making beyond merely the summary frequencies of types of outcomes. For example, the absence of a particular type of outcome in an ensemble can be highly decision relevant. An example of this is offered by Auping et al. 
(<span>2014</span>) who analyzed the geopolitical implications of the US’ shale revolution. In short, they first generated an ensemble of energy price dynamics at a global scale. A set of exemplars from this ensemble was subsequently used as input to a state stability model for stress testing a wide variety of rentier states to identify which of those were most vulnerable to price swings. The global energy model, across the ensemble showed low oil prices for the decade of 2010–2020. Under no condition explored by the model would there be high prices. This ran counter to conventional wisdom at, for example, NATO, where the results were presented. However, because of the underlying theory-informed causal model, there was a very good explanation for the absence of such high price simulation runs. This example highlights the value of analyzing the ensemble in more detail, and carefully explaining how the different types of dynamics arise out of the structural assumptions within the simulation models.</p><p>We are in broad agreement with the plea of Lustick and Tetlock (<span>2021</span>) for the use of theory-guided simulations in aiding decision-making on complex issues, and content that this appeal has a reach well beyond geopolitical questions. Saltelli et al. (<span>2019</span>) lament the fact that simulation and modeling is not its own field (c.f., Padilla et al., <span>2017</span>), yet is being practiced across many fields, because it hampers the development of shared best practices. The simulation manifesto of Lustick and Tetlock (<span>2021</span>) unfortunately exemplifies this regrettable state of affairs because it seems that they are trying to reinvent the wheel. We have highlighted here primarily the rich body of literature on exploratory modeling which could have meaningfully informed and strengthened the manifesto.</p><p>Other examples where Lustick and Tetlock (<span>2021</span>) bypass much relevant literature could easily have been given as well. For example, both the shale gas study and the homegrown terrorism study relied on system dynamics models, rather than agent-based models, as advocated by Lustick and Tetlock (<span>2021</span>). Rather than fruitlessly debating which is the better approach, we content that both can be useful for representing key causal mechanisms, and both produce emergent aggregate dynamics. The primary difference is the level at which causal mechanisms are described. System dynamics models use a lumped, mesoscopic representation where the aggregate dynamics are an emergent property of interacting feedback loops involving accumulation and delay. If there is no obvious way of lumping, or compartmentalizing, your agents, a microscopic representation where the aggregate dynamics arise out of local interactions among heterogeneous agents as used in agent-based models is more suitable (c.f., Rahmandad & Sterman, <span>2008</span>). 
Similar remarks could be made with respect to federating models, which under labels such as multimodeling, multimodel ecologies, and co-simulation is being practices in various scientific fields (Nikolic et al., <span>2019</span>).</p>\",\"PeriodicalId\":100567,\"journal\":{\"name\":\"FUTURES & FORESIGHT SCIENCE\",\"volume\":\"3 2\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1002/ffo2.84\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"FUTURES & FORESIGHT SCIENCE\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/ffo2.84\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"FUTURES & FORESIGHT SCIENCE","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ffo2.84","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
摘要
Mitchell(2003)和Page(2018)都更普遍地主张使用多个结构不同的模拟模型,因为这样的集合可以丰富地揭示所研究的现象,这在复杂系统的情况下特别合适。除了在过去二十年中出现的用于探索性建模的各种技术的潜力之外,也有大量与宣言直接相关的相关理论发展。最重要的是,个体计算实验的具体状态是什么,整体的状态是什么?在这种情况下的一个关键问题是,是否可以在一个集合中取不同类别的模型结果出现的频率,并直接将这些频率转化为在现实世界中保持不变的概率。很明显,你做不到。在模型中发现的频率严重依赖于对模型输入空间进行采样的方式。表征模型输入空间的各种维度是独立的,还是假设相关性?每个维度的先验假设是什么?与这些问题相关的建模选择强烈地改变了观测到的频率(Quinn et al., 2020)。因此,Shortridge和Zaitchik(2018)以及Taner等人(2019)并没有将模型中的可能性频率视为在现实世界中也存在的概率,而是探索了使用场景发现结果作为后验贝叶斯概率评估输入的方法。正如Lustick和Tetlock(2021)所讨论的那样,这可能提供一种综合内部和外部视图的替代方法。作者使用验证和预测等短语。这些短语很有问题,应该谨慎使用。Hodges和Dewar(1992)对什么是适当的验证的理解非常狭隘,他们认为大多数用于政策制定的模拟模型都无法得到验证。Oreskes等人(1994)也采取了类似的、更温和的立场,根据他们的观点,验证只可能用于封闭系统。然而,国家稳定和基本上所有其他地缘政治问题都不会在封闭的系统内发生。历史复制,因此也试图建立预测的准确性,是有问题的证据来源验证,因为等性。在此基础上,Oreskes(1998)认为,在实际使用复杂系统模型之前,人们无法证明其预测可靠性。与自然系统一样,在地缘政治系统中,人们面临着可测量性、可及性和缺乏时空平稳性的限制。因此,模型质量的标准不是在其预期的预测质量(例如,简单的分数)中找到的,而是在其对指导计划和决策的感知有用性中找到的。模拟实验的集合可以为决策提供有意义的信息,而不仅仅是结果类型的总结频率。例如,在集成中缺少特定类型的结果可能与决策高度相关。Auping等人(2014)提供了一个例子,他们分析了美国页岩革命的地缘政治影响。简而言之,它们首先产生了全球范围内能源价格动态的集合。随后,从该集合中获得的一组示例被用作状态稳定性模型的输入,用于对各种食利者状态进行压力测试,以确定哪些状态最容易受到价格波动的影响。整体的全球能源模型显示,2010-2020年这十年油价处于低位。在模型所探索的任何条件下都不会出现高价格。这与北约等国的传统观念背道而驰,而北约正是在那里公布了选举结果。然而,由于基于理论的因果模型,对于没有出现如此高的价格模拟运行有一个很好的解释。这个例子强调了更详细地分析集成的价值,并仔细解释了不同类型的动力学是如何从模拟模型中的结构假设中产生的。我们广泛同意Lustick和Tetlock(2021)的请求,即使用理论指导的模拟来帮助复杂问题的决策,并认为这种呼吁的范围远远超出了地缘政治问题。Saltelli等人(2019)哀叹,模拟和建模不是它自己的领域(cf, Padilla等人,2017),但正在许多领域进行实践,因为它阻碍了共享最佳实践的发展。Lustick和Tetlock(2021)的模拟宣言不幸地体现了这种令人遗憾的状况,因为他们似乎在试图重新发明轮子。我们在这里主要强调了关于探索性建模的丰富文献,这些文献可以有意义地告知和加强宣言。 Lustick和Tetlock(2021)绕过许多相关文献的其他例子也很容易给出。例如,页岩气研究和本土恐怖主义研究都依赖于系统动力学模型,而不是Lustick和Tetlock(2021)所倡导的基于主体的模型。与其毫无结果地争论哪一种方法更好,我们认为两者都可以用于表示关键的因果机制,并且都可以产生紧急的总体动态。主要区别在于描述因果机制的层次。系统动力学模型使用集中的介观表示,其中聚合动力学是涉及积累和延迟的交互反馈回路的紧急属性。如果没有明显的集中或划分代理的方法,那么在基于代理的模型中使用的由异质代理之间的局部相互作用产生聚合动态的微观表示更合适(cf, Rahmandad &斯特曼,2008)。关于联邦模型也可以发表类似的评论,在多建模、多模型生态学和联合模拟等标签下,联邦模型正在各个科学领域得到实践(Nikolic等人,2019)。
Reaction: A commentary on Lustick and Tetlock (2021)
Jan Kwakkel, Willem Auping
FUTURES & FORESIGHT SCIENCE, 3(2), 2021. DOI: 10.1002/ffo2.84
Lustick and Tetlock (2021) argue for the use of theory-guided simulation to aid geopolitical planning and decision-making. This is a very welcome contribution, as in our experience, geopolitical analyses are often qualitative in nature and prone to groupthink. The complexities of geopolitical issues, however, genuinely call for the use of theory-guided simulations: these issues are too complex for mental simulation (Atkins et al., 2002; Sterman, 1989, 1994). Dynamic simulations, if properly grounded in appropriate theories and well-motivated assumptions, can derive the possible dynamics from interacting nonlinear processes and thus aid human reasoning about system behavior (Sterman, 2002). Moreover, since geopolitical issues are subject to uncertainty, scarce data, and conflicting information, an ensemble modeling approach is appropriate. An ensemble of simulations enables reasoning across alternative assumptions consistent with the available data and information, and can capture far more of the available theories, information, and educated guesses than any single model in isolation (Bankes, 2002). With rising computational power, ensemble modeling is increasingly a feasible research strategy.
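To make the idea of ensemble simulation concrete, the following is a minimal, hypothetical sketch using the EMA Workbench (Kwakkel, 2017), an open-source Python toolkit for exploratory modeling. The stand-in model, its parameters, and their ranges are invented for illustration and carry no substantive claim about state stability.

```python
# Minimal sketch of an ensemble ("exploratory modeling") run with the
# EMA Workbench. The toy model and all parameter ranges are illustrative
# stand-ins for a theory-guided geopolitical simulation model.
from ema_workbench import Model, RealParameter, ScalarOutcome, perform_experiments

def state_stability(grievance=0.5, repression=0.5, growth=0.02):
    """Toy stand-in: stability eroded by grievance, shored up by growth."""
    return {"stability": 1.0 - grievance * repression + 10 * growth}

model = Model("toy", function=state_stability)
model.uncertainties = [
    RealParameter("grievance", 0.0, 1.0),
    RealParameter("repression", 0.0, 1.0),
    RealParameter("growth", -0.05, 0.05),
]
model.outcomes = [ScalarOutcome("stability")]

# 1,000 computational experiments sampled across the uncertainty space
experiments, outcomes = perform_experiments(model, scenarios=1000)
print(outcomes["stability"].mean())
```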
Despite our broad agreement with Lustick and Tetlock (2021), we have three major comments on their work. First, from the broader perspective of modeling and simulation, they offer little that is truly novel or surprising. The envisioned approach of ensemble simulations is already well established under the label of exploratory modeling. By not engaging with this literature, the authors have deprived themselves of a rich set of analytical techniques that could have substantially strengthened their case study, as well as of relevant theories and concepts that would have strengthened the appeal of the manifesto. Second, we contend that validating simulation models of complex systems with partially open system boundaries should focus on perceived usefulness, not on supposed predictive accuracy captured through Brier scores. Third, in interpreting results from simulation models, it is possible and useful to try to improve the understanding of how the system's structural characteristics and underlying theories give rise to the observed dynamics, instead of using simulations as point predictions.
Ample literature has emerged over the last 30 years on the use of computational experimentation with simulation models to aid planning and decision-making. Hodges (1991) identified six things that could be done with simulation models in the absence of good data. Hodges and Dewar (1992) identified a seventh use case. Bankes (1993) moved away from enumerating use cases and simply spoke of exploratory modeling. Since these formative ideas from the early 1990s, a large body of literature on exploratory modeling has emerged (see, e.g., Kwakkel, 2017; Moallemi et al., 2020 for detailed overviews). Perhaps most importantly, exploratory modeling has become one of the cornerstones of the field of decision-making under deep uncertainty (Marchau et al., 2019).
It is quite unfortunate that the authors, with the exception of Davis et al. (2007), seem to be utterly unaware of this rich existing body of literature on exploratory modeling. The reported case study could have benefited from various techniques for model analysis, such as scenario discovery (Bryant & Lempert, 2010; Kwakkel & Jaxa-Rozen, 2016) and global sensitivity analysis (Razavi et al., 2021). In particular, dynamic scenario discovery (Kwakkel et al., 2013; Steinmann et al., 2020) could have been useful for identifying characteristic dynamics over time, as well as the conditions under which they occur. Dynamic scenario discovery and global sensitivity analysis could substantially enhance the mapping of the state space of possibilities by identifying the conditions under which the different possibilities would occur. So, in their Bangladeshi case study, these techniques could have helped in answering questions about the representativeness of the three scenarios for the entire ensemble, and about the regions of the model input space from which these three scenarios originate. This, in turn, would ground the resulting narratives much more strongly in the underlying model (Greeven et al., 2016).
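As an illustration of what such techniques add, here is a hedged sketch of scenario discovery with the Patient Rule Induction Method (PRIM) as implemented in the EMA Workbench. The synthetic ensemble, mirroring the toy model above, and the 0.2 threshold defining the cases of interest are invented for illustration.

```python
# Sketch of scenario discovery: PRIM searches for a hyper-rectangular
# region of the input space in which the cases of interest concentrate.
import numpy as np
import pandas as pd
from ema_workbench.analysis import prim

# Synthetic stand-in for an ensemble of 1,000 computational experiments
rng = np.random.default_rng(42)
x = pd.DataFrame({
    "grievance": rng.uniform(0, 1, 1000),
    "repression": rng.uniform(0, 1, 1000),
    "growth": rng.uniform(-0.05, 0.05, 1000),
})
stability = 1.0 - x["grievance"] * x["repression"] + 10 * x["growth"]
y = (stability < 0.2).to_numpy()  # binary: the class of runs we want to explain

prim_alg = prim.Prim(x, y, threshold=0.8)  # minimum density (precision) of a box
box = prim_alg.find_box()
box.inspect()  # reports coverage, density, and the restricted uncertainty ranges
```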
Lustick and Tetlock (2021) strongly emphasize the importance of theoretically grounding simulation models. Unfortunately, for many social phenomena we have a variety of possible theoretical explanations. Rather than narrowing down to a preferred theory, exploratory modeling is increasingly used to open up the conversation by exploring a range of alternative theories. For example, Pruyt and Kwakkel (2014) explore the rise of home-grown terrorism under two rival theories and use this to identify policy interventions that work in either case. Both Mitchell (2003) and Page (2018) argue more generally for using a plurality of structurally different simulation models, because such an ensemble sheds rich light on the phenomenon under study, which is particularly appropriate in the case of complex systems.
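A stylized sketch of what exploring across rival theories can look like in practice follows. The two toy "theories" and the policy lever below are invented and are emphatically not the models of Pruyt and Kwakkel (2014); they only illustrate the logic of checking a lever against both structures.

```python
# Exploring across two rival causal structures for the same phenomenon
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
deprivation = rng.uniform(0, 1, n)  # shared uncertain driver
outreach = rng.uniform(0, 1, n)     # candidate policy lever

# Rival theory A: radicalization driven by relative deprivation,
# which community outreach can dampen
risk_a = deprivation * (1.0 - 0.6 * outreach)
# Rival theory B: radicalization driven by network contagion,
# on which outreach has only a weak effect
risk_b = 0.4 + 0.4 * deprivation - 0.1 * outreach

# A robust intervention is one that lowers risk under *both* structures
high, low = outreach > 0.8, outreach < 0.2
for name, risk in (("theory A", risk_a), ("theory B", risk_b)):
    print(name, "low outreach:", risk[low].mean().round(2),
          "high outreach:", risk[high].mean().round(2))
```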
Beyond the various techniques for exploratory modeling that have emerged over the last two decades, there have also been ample theoretical developments directly relevant to the manifesto. Most importantly, what exactly is the status of an individual computational experiment, and what is the status of the ensemble? A key question in this context is whether one can take the frequencies of occurrence of different classes of model results within an ensemble and directly translate these frequencies into probabilities that hold in the real world. For obvious reasons, one cannot. The frequencies found in model land depend critically on the way in which one samples the model input space. Are the various dimensions that characterize the model input space taken as independent, or are correlations assumed? What prior is assumed for each of the dimensions? Modeling choices related to these questions strongly alter the observed frequencies (Quinn et al., 2020). So, instead of treating frequencies of possibilities in model land as probabilities that also hold in the real world, Shortridge and Zaitchik (2018) and Taner et al. (2019) explore ways of using scenario discovery results as input to an a posteriori Bayesian probabilistic assessment. This might offer an alternative way of synthesizing the inside and outside view as discussed by Lustick and Tetlock (2021).
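The sampling dependence of model-land frequencies is easy to demonstrate. In the illustrative sketch below, the same toy outcome condition yields markedly different frequencies under independent uniform priors, skewed priors, and correlated inputs; all numbers and distributions are invented.

```python
# Why model-land frequencies are not real-world probabilities: the frequency
# of the same "low stability" condition shifts with the assumed prior and
# correlation structure of the inputs.
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

def low_stability_freq(grievance, repression):
    # Low stability if 1 - grievance * repression < 0.2
    return np.mean(1.0 - grievance * repression < 0.2)

# Independent uniform priors
g_u, r_u = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
print("uniform, independent: ", low_stability_freq(g_u, r_u))

# Beta(5, 2) priors: mass shifted toward high grievance and repression
g_b, r_b = rng.beta(5, 2, n), rng.beta(5, 2, n)
print("beta(5,2), independent:", low_stability_freq(g_b, r_b))

# Perfectly correlated inputs: repression tracks grievance
print("uniform, correlated:  ", low_stability_freq(g_u, g_u))
```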
The authors use phrases like validation and prediction. These phrases are deeply problematic and should be used with care. Hodges and Dewar (1992), adopting a very narrow understanding of what proper validation is, argue that most simulation models used for policymaking cannot be validated. A similar, more modest position is adopted by Oreskes et al. (1994), according to whom validation is only possible for closed systems. However, state stability, like basically all other geopolitical issues, does not play out within a closed system. Historical replication, and thus also trying to establish predictive accuracy, is problematic as a source of evidence for validation because of equifinality: many different model structures can reproduce the same observed history. Building on this, Oreskes (1998) argues that one cannot prove the predictive reliability of models of complex systems prior to their actual use. As with natural systems, in the case of geopolitical systems one is confronted with limits to measurability and accessibility, and with a lack of spatiotemporal stationarity. The yardstick for model quality is thus not to be found in its purported predictive quality (e.g., Brier scores), but rather in its perceived usefulness for guiding planning and decision-making.
Ensembles of simulation experiments can meaningfully inform decision-making beyond merely the summary frequencies of types of outcomes. For example, the absence of a particular type of outcome in an ensemble can be highly decision relevant. An example is offered by Auping et al. (2014), who analyzed the geopolitical implications of the US shale revolution. In short, they first generated an ensemble of energy price dynamics at a global scale. A set of exemplars from this ensemble was subsequently used as input to a state stability model for stress testing a wide variety of rentier states, to identify which of them were most vulnerable to price swings. Across the ensemble, the global energy model showed low oil prices for the decade 2010–2020; under no condition explored by the model would there be high prices. This ran counter to conventional wisdom at, for example, NATO, where the results were presented. However, because of the underlying theory-informed causal model, there was a very good explanation for the absence of high-price simulation runs. This example highlights the value of analyzing the ensemble in more detail and of carefully explaining how the different types of dynamics arise out of the structural assumptions within the simulation models.
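A hedged sketch of this two-stage workflow is given below: exemplar trajectories are selected from a synthetic price ensemble (here simply via k-means clustering) and fed into a downstream toy solvency model. All dynamics, prices, and parameters are invented and bear no relation to the actual models of Auping et al. (2014).

```python
# Two-stage setup: (1) pick exemplar price paths from an ensemble,
# (2) stress-test a downstream toy "rentier state" budget model with them.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# Ensemble of 500 synthetic ten-year oil price paths (USD/barrel, monthly)
prices = (60 + np.cumsum(rng.normal(0, 2, size=(500, 120)), axis=1)).clip(min=10)

# Stage 1: three exemplars, one per cluster of similar dynamics
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(prices)
exemplars = prices[[np.argmin(np.linalg.norm(prices - c, axis=1))
                    for c in km.cluster_centers_]]

# Stage 2: months a state with a fixed breakeven price and finite reserves
# stays solvent on oil revenue under each exemplar path
def months_solvent(path, breakeven=70.0, reserves=200.0):
    balance = reserves + np.cumsum(path - breakeven)
    below = np.nonzero(balance < 0)[0]
    return below[0] if below.size else len(path)

for i, path in enumerate(exemplars):
    print(f"exemplar {i}: solvent for {months_solvent(path)} months")
```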
We are in broad agreement with the plea of Lustick and Tetlock (2021) for the use of theory-guided simulations in aiding decision-making on complex issues, and contend that this appeal has a reach well beyond geopolitical questions. Saltelli et al. (2019) lament the fact that simulation and modeling is not a field of its own (cf. Padilla et al., 2017), yet is practiced across many fields, because this state of affairs hampers the development of shared best practices. The simulation manifesto of Lustick and Tetlock (2021) unfortunately exemplifies this regrettable state of affairs, because it seems that they are trying to reinvent the wheel. We have highlighted here primarily the rich body of literature on exploratory modeling, which could have meaningfully informed and strengthened the manifesto.
Other examples where Lustick and Tetlock (2021) bypass much relevant literature could easily be given as well. For example, both the shale gas study and the homegrown terrorism study relied on system dynamics models, rather than on the agent-based models advocated by Lustick and Tetlock (2021). Rather than fruitlessly debating which is the better approach, we contend that both can be useful for representing key causal mechanisms, and both produce emergent aggregate dynamics. The primary difference is the level at which causal mechanisms are described. System dynamics models use a lumped, mesoscopic representation in which the aggregate dynamics are an emergent property of interacting feedback loops involving accumulation and delay. If there is no obvious way of lumping, or compartmentalizing, the agents, a microscopic representation in which the aggregate dynamics arise out of local interactions among heterogeneous agents, as used in agent-based models, is more suitable (cf. Rahmandad & Sterman, 2008); a toy illustration of the two representations is sketched below. Similar remarks apply to the federation of models, which is practiced in various scientific fields under labels such as multimodeling, multimodel ecologies, and co-simulation (Nikolic et al., 2019).
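To make the contrast concrete, here is a toy sketch of the same contagion process (cf. Rahmandad & Sterman, 2008) in both representations: a lumped stock-and-flow integration and an agent-based loop with random pairwise contacts. All parameter values are illustrative.

```python
# Same contagion process, two levels of description
import numpy as np

N, BETA, GAMMA, T = 1000, 0.3, 0.1, 200

# System dynamics: aggregate stocks S and I, Euler-integrated feedback loops
s, i = N - 1.0, 1.0
for _ in range(T):
    infections = BETA * s * i / N  # reinforcing contagion loop
    recoveries = GAMMA * i         # balancing loop (first-order delay)
    s, i = s - infections, i + infections - recoveries
print("SD:  final susceptible share", round(s / N, 3))

# Agent-based: the same rates acting through local random contacts
rng = np.random.default_rng(11)
state = np.zeros(N, dtype=int)  # 0 = susceptible, 1 = infected, 2 = recovered
state[0] = 1
for _ in range(T):
    contacts = rng.integers(0, N, size=N)  # each agent meets one random other
    transmit = (state == 0) & (state[contacts] == 1) & (rng.random(N) < BETA)
    recover = (state == 1) & (rng.random(N) < GAMMA)
    state[transmit], state[recover] = 1, 2
print("ABM: final susceptible share", round((state == 0).mean(), 3))
```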