Some limitations of the concordance correlation coefficient to characterise model accuracy
Alexandre M.J.-C. Wadoux, Budiman Minasny
Ecological Informatics, volume 83, Article 102820 (published 2024-09-11)
DOI: 10.1016/j.ecoinf.2024.102820
URL: https://www.sciencedirect.com/science/article/pii/S1574954124003625
Citation count: 0
Abstract
A perusal of the environmental modelling literature reveals that Lin's concordance correlation coefficient is a popular validation statistic for characterising model or map quality. In this communication, we use synthetic examples to illustrate three undesirable statistical properties of this coefficient. We argue that ignorance of these properties has led to frequent misuse of the coefficient in modelling and mapping studies. The stand-alone use of the concordance correlation coefficient is insufficient because i) it does not indicate the relative contributions of bias and correlation, ii) its values cannot be compared across different datasets or studies, and iii) it is prone to the same problems as other linear correlation statistics. The concordance correlation coefficient was, in fact, originally conceived for evaluating reproducibility over repeated trials of the same variable, not for characterising model accuracy. For the validation of models and maps, we recommend calculating statistics that, combined with the concordance correlation coefficient, represent various aspects of model or map quality and that can be visualised together in a single figure with a Taylor or solar diagram.
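Point i) can be made concrete with a small sketch. It assumes the standard definition of Lin's coefficient with population (1/n) moments, under which the coefficient factors into the Pearson correlation multiplied by a bias correction factor. The function name `ccc_components` and the toy data are hypothetical illustrations and are not taken from the paper's synthetic examples.

```python
from math import sqrt


def ccc_components(observed, predicted):
    """Return (ccc, pearson, bias_factor) for two equal-length sequences.

    Assumes population (1/n) moments, so that ccc == pearson * bias_factor.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n

    # Shared denominator: both variances plus the squared mean difference.
    denom = var_o + var_p + (mean_o - mean_p) ** 2

    pearson = cov / sqrt(var_o * var_p)          # precision (linear association)
    bias_factor = 2 * sqrt(var_o * var_p) / denom  # accuracy (shift and scale)
    ccc = 2 * cov / denom
    return ccc, pearson, bias_factor


observed = [1, 2, 3, 4, 5]

# Case A: perfectly correlated predictions with a constant bias of +2.
shifted = [o + 2 for o in observed]

# Case B: unbiased predictions (same mean and variance) that are only
# moderately correlated with the observations.
shuffled = [3, 1, 4, 2, 5]

ccc_a, r_a, cb_a = ccc_components(observed, shifted)
ccc_b, r_b, cb_b = ccc_components(observed, shuffled)

print(f"A: ccc={ccc_a:.2f} pearson={r_a:.2f} bias_factor={cb_a:.2f}")
print(f"B: ccc={ccc_b:.2f} pearson={r_b:.2f} bias_factor={cb_b:.2f}")
```

In this toy case both prediction sets give a coefficient of 0.5, although the first is purely biased and the second purely imprecise. The coefficient alone cannot distinguish the two situations, so reporting the two components alongside it seems the more informative choice.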
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need to inform sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.