Hiroki Watanabe, Masashi Hyodo, Yuki Yamada, T. Seo
Hiroshima Mathematical Journal, published 2019-01-01. DOI: https://doi.org/10.32917/HMJ/1564106544
Estimation of misclassification probability for a distance-based classifier in high-dimensional data
The Euclidean distance-based classifier is often used to classify an observation into one of several populations in high-dimensional data. One of the most important problems in discriminant analysis is estimating the probability of misclassification. In this paper, we propose a consistent estimator of misclassification probabilities when the dimension of the vector, p, may exceed the sample size, N, and the underlying distribution need not necessarily be normal. A new estimator of quadratic form is also obtained as a by-product. Finally, we numerically verify the high accuracy of our proposed estimator in finite sample applications, inclusive of high-dimensional scenarios. AMS 2000 subject classification: 62H30, 41A60.
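To make the setting concrete, the following is a minimal sketch of a two-population Euclidean distance-based rule together with a simple Monte Carlo estimate of one misclassification probability when p exceeds the training sample size. It is not the estimator proposed in the paper: the function names, the Gaussian data, the identity covariance, and the mean shift are all assumptions chosen only for illustration.

```python
import numpy as np


def euclidean_classify(x, mean1, mean2):
    """Assign each row of x to population 1 or 2, whichever sample mean is closer."""
    d1 = np.sum((x - mean1) ** 2, axis=-1)  # squared distance to mean of population 1
    d2 = np.sum((x - mean2) ** 2, axis=-1)  # squared distance to mean of population 2
    return np.where(d1 <= d2, 1, 2)


def simulate_error_rate(p=200, n_train=20, n_test=2000, shift=0.3, seed=0):
    """Monte Carlo estimate of P(classified as 2 | truly from population 1).

    Assumption: both populations are N(mu, I_p), with mu1 = 0 and mu2 = shift * 1.
    """
    rng = np.random.default_rng(seed)
    mu1 = np.zeros(p)
    mu2 = np.full(p, shift)

    # Training samples with p much larger than n_train (high-dimensional setting).
    train1 = rng.standard_normal((n_train, p)) + mu1
    train2 = rng.standard_normal((n_train, p)) + mu2
    m1, m2 = train1.mean(axis=0), train2.mean(axis=0)

    # Fresh observations from population 1, classified with the estimated means.
    test1 = rng.standard_normal((n_test, p)) + mu1
    labels = euclidean_classify(test1, m1, m2)
    return float(np.mean(labels == 2))


if __name__ == "__main__":
    print(simulate_error_rate())
```

A simulation like this needs fresh draws from a known population, which is unavailable in practice; that gap is what motivates estimating the misclassification probability from the training data alone, as the abstract describes.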
About the journal:
Hiroshima Mathematical Journal (HMJ) is a continuation of Journal of Science of the Hiroshima University, Series A, Vol. 1-24 (1930-1960), and Journal of Science of the Hiroshima University, Series A-I, Vol. 25-34 (1961-1970).
Starting with Volume 4 (1974), each volume of HMJ consists of three numbers annually. This journal publishes original papers in pure and applied mathematics. HMJ is an (electronically) open access journal from Volume 36, Number 1.