Validation of models: statistical techniques and data availability
J. Kleijnen
DOI: 10.1145/324138.324450 (https://doi.org/10.1145/324138.324450)
Venue as listed: Online World Conference on Soft Computing in Industrial Applications
Published: 1999-12-01
Citations: 168
Abstract
This paper shows which statistical techniques can be used to validate simulation models, depending on which real-life data are available. Concerning this availability, three situations are distinguished: (i) no data; (ii) only output data; and (iii) both input and output data. In case (i), with no real data, the analysts can still experiment with the simulation model to obtain simulated data; such an experiment should be guided by the statistical theory on the design of experiments. In case (ii), with only output data, real and simulated output data can be compared through the well-known two-sample Student t statistic or certain other statistics. In case (iii), with both input and output data, trace-driven simulation becomes possible, but validation should not proceed in the popular way (make a scatter plot with real and simulated outputs, fit a line, and test whether that line has unit slope and passes through the origin); alternative regression and bootstrap procedures are presented. Several case studies are summarized to illustrate the three types of situations.
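As an illustration of case (ii), the comparison of real and simulated outputs with a two-sample Student t statistic might be sketched as follows. This is a minimal sketch, not the paper's own procedure: it assumes independent, roughly normal observations with a common variance (the classical pooled-variance form), the function name `two_sample_t` is hypothetical, and the numbers are made up for illustration only.

```python
import math
from statistics import mean, variance


def two_sample_t(real, simulated):
    """Return the pooled-variance two-sample t statistic and its degrees of freedom.

    Assumes (not stated in the abstract) independent observations and
    equal variances in the real and simulated outputs.
    """
    n, m = len(real), len(simulated)
    if n < 2 or m < 2:
        raise ValueError("each sample needs at least two observations")
    # Pooled estimate of the common variance of the two samples.
    pooled_var = ((n - 1) * variance(real) + (m - 1) * variance(simulated)) / (n + m - 2)
    std_err = math.sqrt(pooled_var * (1.0 / n + 1.0 / m))
    if std_err == 0.0:
        raise ValueError("both samples are constant; the statistic is undefined")
    t_stat = (mean(real) - mean(simulated)) / std_err
    return t_stat, n + m - 2


# Hypothetical outputs (e.g. average waiting times), not taken from the paper.
real_output = [10.2, 9.8, 11.1, 10.5, 9.9]
simulated_output = [10.0, 10.4, 10.9, 9.7, 10.6]

t_stat, dof = two_sample_t(real_output, simulated_output)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

The statistic would then be compared with a critical value of the t distribution with the reported degrees of freedom; a large absolute value suggests that the mean simulated output differs from the mean real output. When the equal-variance assumption is doubtful, an unequal-variance variant such as Welch's test may be a more suitable choice.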