{"title":"Preliminary Causal Discovery Results with Software Effort Estimation Data","authors":"Anandi Hira, B. Boehm, R. Stoddard, M. Konrad","doi":"10.1145/3172871.3172876","DOIUrl":null,"url":null,"abstract":"Correlation does not imply causation. Though this is a well-known fact, most analyses depend on correlation as proof of relationships that are often treated as causal. Causal discovery, also referred to as causal model search, involves the application of statistical methods to identify causal relationships from conditional independences (and/or other statistical relationships) in the data. Though software cost estimation models use both domain knowledge and statistics, to date, there has yet to be a published report describing the evaluation of a software dataset using causal discovery. Two of the authors have previously used regression analysis to evaluate the effectiveness of the International Function Points User Group (IFPUG)'s and the Common Software Measurement International Consortium (COSMIC)'s functional size measurement methods for analyzing the Unified Code Count (UCC)1's dataset of maintenance tasks. Using the same dataset, the authors will report in this paper on what types of information causal discovery provides, and how they differ from correlation tests. This paper will introduce causal discovery to software engineering research, and its use in the future may impact how software effort models are built.","PeriodicalId":199550,"journal":{"name":"Proceedings of the 11th Innovations in Software Engineering Conference","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 11th Innovations in Software Engineering Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3172871.3172876","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5
Abstract
Correlation does not imply causation. Though this is a well-known fact, most analyses depend on correlation as proof of relationships that are often treated as causal. Causal discovery, also referred to as causal model search, involves the application of statistical methods to identify causal relationships from conditional independences (and/or other statistical relationships) in the data. Though software cost estimation models use both domain knowledge and statistics, to date, there has yet to be a published report describing the evaluation of a software dataset using causal discovery. Two of the authors have previously used regression analysis to evaluate the effectiveness of the International Function Points User Group (IFPUG)'s and the Common Software Measurement International Consortium (COSMIC)'s functional size measurement methods for analyzing the Unified Code Count (UCC)1's dataset of maintenance tasks. Using the same dataset, the authors will report in this paper on what types of information causal discovery provides, and how they differ from correlation tests. This paper will introduce causal discovery to software engineering research, and its use in the future may impact how software effort models are built.