{"title":"使用非对称分布改进文本分类器概率估计","authors":"Paul N. Bennett","doi":"10.1145/860435.860457","DOIUrl":null,"url":null,"abstract":"Text classifiers that give probability estimates are more readily applicable in a variety of scenarios. For example, rather than choosing one set decision threshold, they can be used in a Bayesian risk model to issue a run-time decision which minimizes a user-specified cost function dynamically chosen at prediction time. However, the quality of the probability estimates is crucial. We review a variety of standard approaches to converting scores (and poor probability estimates) from text classifiers to high quality estimates and introduce new models motivated by the intuition that the empirical score distribution for the \"extremely irrelevant\", \"hard to discriminate\", and \"obviously relevant\" items are often significantly different. Finally, we analyze the experimental performance of these models over the outputs of two text classifiers. The analysis demonstrates that one of these models is theoretically attractive (introducing few new parameters while increasing flexibility), computationally efficient, and empirically preferable.","PeriodicalId":209809,"journal":{"name":"Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2003-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"63","resultStr":"{\"title\":\"Using asymmetric distributions to improve text classifier probability estimates\",\"authors\":\"Paul N. Bennett\",\"doi\":\"10.1145/860435.860457\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Text classifiers that give probability estimates are more readily applicable in a variety of scenarios. For example, rather than choosing one set decision threshold, they can be used in a Bayesian risk model to issue a run-time decision which minimizes a user-specified cost function dynamically chosen at prediction time. However, the quality of the probability estimates is crucial. We review a variety of standard approaches to converting scores (and poor probability estimates) from text classifiers to high quality estimates and introduce new models motivated by the intuition that the empirical score distribution for the \\\"extremely irrelevant\\\", \\\"hard to discriminate\\\", and \\\"obviously relevant\\\" items are often significantly different. Finally, we analyze the experimental performance of these models over the outputs of two text classifiers. The analysis demonstrates that one of these models is theoretically attractive (introducing few new parameters while increasing flexibility), computationally efficient, and empirically preferable.\",\"PeriodicalId\":209809,\"journal\":{\"name\":\"Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2003-07-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"63\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/860435.860457\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/860435.860457","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Using asymmetric distributions to improve text classifier probability estimates
Text classifiers that give probability estimates are more readily applicable in a variety of scenarios. For example, rather than choosing one set decision threshold, they can be used in a Bayesian risk model to issue a run-time decision which minimizes a user-specified cost function dynamically chosen at prediction time. However, the quality of the probability estimates is crucial. We review a variety of standard approaches to converting scores (and poor probability estimates) from text classifiers to high quality estimates and introduce new models motivated by the intuition that the empirical score distribution for the "extremely irrelevant", "hard to discriminate", and "obviously relevant" items are often significantly different. Finally, we analyze the experimental performance of these models over the outputs of two text classifiers. The analysis demonstrates that one of these models is theoretically attractive (introducing few new parameters while increasing flexibility), computationally efficient, and empirically preferable.