The relationship between Precision-Recall and ROC curves

Jesse Davis, Mark H. Goadrich
{"title":"The relationship between Precision-Recall and ROC curves","authors":"Jesse Davis, Mark H. Goadrich","doi":"10.1145/1143844.1143874","DOIUrl":null,"url":null,"abstract":"Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance. We show that a deep connection exists between ROC space and PR space, such that a curve dominates in ROC space if and only if it dominates in PR space. A corollary is the notion of an achievable PR curve, which has properties much like the convex hull in ROC space; we show an efficient algorithm for computing this curve. Finally, we also note differences in the two types of curves are significant for algorithm design. For example, in PR space it is incorrect to linearly interpolate between points. Furthermore, algorithms that optimize the area under the ROC curve are not guaranteed to optimize the area under the PR curve.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"270 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5171","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 23rd international conference on Machine learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1143844.1143874","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5171

Abstract

Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance. We show that a deep connection exists between ROC space and PR space, such that a curve dominates in ROC space if and only if it dominates in PR space. A corollary is the notion of an achievable PR curve, which has properties much like the convex hull in ROC space; we show an efficient algorithm for computing this curve. Finally, we also note differences in the two types of curves are significant for algorithm design. For example, in PR space it is incorrect to linearly interpolate between points. Furthermore, algorithms that optimize the area under the ROC curve are not guaranteed to optimize the area under the PR curve.
精密度-召回率与ROC曲线的关系
在机器学习中,接收算子特征(ROC)曲线通常用于表示二元决策问题的结果。然而,当处理高度倾斜的数据集时,Precision-Recall (PR)曲线提供了算法性能的更多信息。我们证明了ROC空间和PR空间之间存在着深刻的联系,这样一条曲线在ROC空间中占主导地位当且仅当它在PR空间中占主导地位。推论是可实现PR曲线的概念,它具有与ROC空间中的凸包非常相似的属性;我们给出了计算这条曲线的有效算法。最后,我们还注意到两种曲线的差异对于算法设计是重要的。例如,在PR空间中,在点之间进行线性插值是不正确的。此外,优化ROC曲线下面积的算法并不能保证优化PR曲线下面积。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信