The k-nearest Neighbor Classification of Histogram- and Trapezoid-Valued Data
M. Razmkhah, Fathimah al-Ma’shumah, S. Effati
Statistics, Optimization & Information Computing, published 2022-09-29
https://doi.org/10.19139/soic-2310-5070-1451
A histogram-valued observation is a type of symbolic object whose value is represented by a list of bins (intervals) together with their corresponding relative frequencies or probabilities.
In the literature, the raw data within each bin of a histogram-valued observation are assumed to be uniformly distributed. This paper proposes a new representation in which the raw data in each bin are assumed to be linearly distributed; such observations are called trapezoid-valued data.
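The contrast between the two within-bin assumptions can be illustrated with a minimal sketch. It is only one possible reading of "linearly distributed", not the paper's definition: the class name `Bin`, the `slope` parameter, and the choice to anchor the line at the bin midpoint are all assumptions made for illustration. With `slope = 0` the bin reduces to the usual uniform case; with a non-zero slope the density over the bin has a trapezoidal shape while the bin's mass stays equal to its relative frequency.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    lo: float            # left endpoint of the bin
    hi: float            # right endpoint of the bin
    p: float             # relative frequency (probability mass) of the bin
    slope: float = 0.0   # hypothetical parameter: slope of the linear density (0 = uniform)

    def density(self, x: float) -> float:
        """Density at x: a line with height p / width at the bin midpoint."""
        if not (self.lo <= x <= self.hi):
            return 0.0
        width = self.hi - self.lo
        mid = 0.5 * (self.lo + self.hi)
        return self.p / width + self.slope * (x - mid)

    def is_valid(self) -> bool:
        """A linear density is non-negative iff it is non-negative at both endpoints."""
        return self.density(self.lo) >= 0.0 and self.density(self.hi) >= 0.0

    def mass(self) -> float:
        """Area under the density; the trapezoid formula is exact for a linear function."""
        return 0.5 * (self.density(self.lo) + self.density(self.hi)) * (self.hi - self.lo)


uniform_bin = Bin(lo=0.0, hi=2.0, p=0.4)               # flat density inside the bin
trapezoid_bin = Bin(lo=0.0, hi=2.0, p=0.4, slope=0.1)  # density rises linearly across the bin
```

In this sketch both bins carry the same mass; only the way that mass is spread inside the interval differs.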
Moreover, new definitions of the union and intersection of trapezoid-valued observations are given.
This study proposes the k-nearest neighbor technique for classifying histogram-valued data using various dissimilarity measures.
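As a rough illustration of k-nearest neighbor classification over histogram-valued observations, the sketch below uses a single stand-in dissimilarity: the L1 distance between cumulative distribution functions, approximated on a grid under the uniform-within-bin assumption. This measure is chosen here only for simplicity and is not claimed to be one of the measures studied in the paper. The names `Hist`, `cdf`, `cdf_l1`, and `knn_predict` are hypothetical, and each histogram's relative frequencies are assumed to sum to 1.

```python
from bisect import bisect_right
from collections import Counter

# A histogram-valued observation: (bin edges of length m + 1, relative frequencies of length m).
Hist = tuple[list[float], list[float]]


def cdf(h: Hist, x: float) -> float:
    """CDF at x, assuming data are uniform within each bin (piecewise-linear CDF)."""
    edges, probs = h
    if x <= edges[0]:
        return 0.0
    if x >= edges[-1]:
        return 1.0
    i = bisect_right(edges, x) - 1  # index of the bin containing x
    frac = (x - edges[i]) / (edges[i + 1] - edges[i])
    return sum(probs[:i]) + probs[i] * frac


def cdf_l1(h1: Hist, h2: Hist, n_grid: int = 200) -> float:
    """Midpoint-rule approximation of the integral of |F1(x) - F2(x)| over the joint support."""
    lo = min(h1[0][0], h2[0][0])
    hi = max(h1[0][-1], h2[0][-1])
    step = (hi - lo) / n_grid
    return step * sum(
        abs(cdf(h1, lo + (j + 0.5) * step) - cdf(h2, lo + (j + 0.5) * step))
        for j in range(n_grid)
    )


def knn_predict(train: list[Hist], labels: list[str], query: Hist,
                k: int = 3, dissim=cdf_l1) -> str:
    """Majority vote among the k training observations least dissimilar to the query."""
    ranked = sorted(range(len(train)), key=lambda i: dissim(train[i], query))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]
```

Passing the dissimilarity as the `dissim` argument keeps the classifier independent of the measure, so other measures can be swapped in for comparison. Vote ties are resolved in favor of the label encountered first among the nearest neighbors, which is a simplification.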
Further, the limiting behavior of the computational complexity of each of the dissimilarity measures considered is compared.
Simulations are carried out to study the performance of the proposed procedures, and the methods are also applied to three real data sets.
Finally, some conclusions are stated.