Zijian Yang, Vahe Eminyan, Ralf Schlüter, Hermann Ney
{"title":"受约束贝叶斯误差分类误差不匹配的精炼统计边界","authors":"Zijian Yang, Vahe Eminyan, Ralf Schlüter, Hermann Ney","doi":"arxiv-2409.01309","DOIUrl":null,"url":null,"abstract":"In statistical classification/multiple hypothesis testing and machine\nlearning, a model distribution estimated from the training data is usually\napplied to replace the unknown true distribution in the Bayes decision rule,\nwhich introduces a mismatch between the Bayes error and the model-based\nclassification error. In this work, we derive the classification error bound to\nstudy the relationship between the Kullback-Leibler divergence and the\nclassification error mismatch. We first reconsider the statistical bounds based\non classification error mismatch derived in previous works, employing a\ndifferent method of derivation. Then, motivated by the observation that the\nBayes error is typically low in machine learning tasks like speech recognition\nand pattern recognition, we derive a refined Kullback-Leibler-divergence-based\nbound on the error mismatch with the constraint that the Bayes error is lower\nthan a threshold.","PeriodicalId":501082,"journal":{"name":"arXiv - MATH - Information Theory","volume":"56 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Refined Statistical Bounds for Classification Error Mismatches with Constrained Bayes Error\",\"authors\":\"Zijian Yang, Vahe Eminyan, Ralf Schlüter, Hermann Ney\",\"doi\":\"arxiv-2409.01309\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In statistical classification/multiple hypothesis testing and machine\\nlearning, a model distribution estimated from the training data is usually\\napplied to replace the unknown true distribution in the Bayes decision rule,\\nwhich introduces a mismatch between the Bayes error and the model-based\\nclassification error. In this work, we derive the classification error bound to\\nstudy the relationship between the Kullback-Leibler divergence and the\\nclassification error mismatch. We first reconsider the statistical bounds based\\non classification error mismatch derived in previous works, employing a\\ndifferent method of derivation. 
Then, motivated by the observation that the\\nBayes error is typically low in machine learning tasks like speech recognition\\nand pattern recognition, we derive a refined Kullback-Leibler-divergence-based\\nbound on the error mismatch with the constraint that the Bayes error is lower\\nthan a threshold.\",\"PeriodicalId\":501082,\"journal\":{\"name\":\"arXiv - MATH - Information Theory\",\"volume\":\"56 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - MATH - Information Theory\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.01309\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - MATH - Information Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.01309","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Refined Statistical Bounds for Classification Error Mismatches with Constrained Bayes Error
In statistical classification/multiple hypothesis testing and machine learning, a model distribution estimated from the training data is usually applied in place of the unknown true distribution in the Bayes decision rule, which introduces a mismatch between the Bayes error and the model-based classification error. In this work, we derive classification error bounds to study the relationship between the Kullback-Leibler divergence and this classification error mismatch. We first reconsider the statistical bounds on the classification error mismatch derived in previous works, employing a different method of derivation. Then, motivated by the observation that the Bayes error is typically low in machine learning tasks such as speech recognition and pattern recognition, we derive a refined Kullback-Leibler-divergence-based bound on the error mismatch under the constraint that the Bayes error is lower than a threshold.
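
To make the quantities in the abstract concrete, the following Python sketch computes, for toy finite joint distributions p (true) and q (model), the Bayes error under p, the error under p of the decision rule derived from q, and their mismatch, and compares the mismatch against the classical unconstrained total-variation/Pinsker-style KL bound (mismatch <= 2*TV(p, q) <= sqrt(2*KL(p||q))). This is an illustrative example of the kind of bound the earlier works provide, not the refined constrained bound derived in this paper; the toy distributions and function names are assumptions chosen for the example.

# Minimal sketch: error mismatch vs. a classical KL-divergence-based bound
# (illustrative only; not the paper's refined constrained bound).
import numpy as np

def bayes_error(p):
    # Bayes error of the true joint distribution p[x, c]: 1 - sum_x max_c p(x, c)
    return 1.0 - p.max(axis=1).sum()

def model_based_error(p, q):
    # Error under p of the decision rule that picks argmax_c q(x, c)
    decisions = q.argmax(axis=1)
    return 1.0 - p[np.arange(p.shape[0]), decisions].sum()

def kl_divergence(p, q):
    # KL(p || q) in nats, assuming q > 0 wherever p > 0
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def total_variation(p, q):
    return 0.5 * float(np.abs(p - q).sum())

# Toy joint distributions over 3 observations x (rows) and 2 classes c (columns);
# all entries of each matrix sum to 1.
p = np.array([[0.30, 0.05],
              [0.10, 0.25],
              [0.20, 0.10]])
q = np.array([[0.25, 0.10],
              [0.20, 0.15],
              [0.10, 0.20]])

mismatch = model_based_error(p, q) - bayes_error(p)
tv = total_variation(p, q)
kl = kl_divergence(p, q)

print(f"Bayes error        : {bayes_error(p):.4f}")     # 0.2500
print(f"Model-based error  : {model_based_error(p, q):.4f}")  # 0.5000
print(f"Error mismatch     : {mismatch:.4f}")            # 0.2500
print(f"2 * TV bound       : {2 * tv:.4f}")              # 0.5000
print(f"sqrt(2 * KL) bound : {np.sqrt(2 * kl):.4f}")     # ~0.54

In this toy case the mismatch (0.25) is dominated first by the total-variation bound (0.50) and then by the looser Pinsker-style KL bound (about 0.54); the paper's contribution is a tighter KL-based bound obtained by additionally constraining the Bayes error to lie below a threshold.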