{"title":"基于降维的多标签k最近邻研究","authors":"Song Gao, Xiaodan Yang, Lihua Zhou, Shaowen Yao","doi":"10.1109/SERA.2018.8477210","DOIUrl":null,"url":null,"abstract":"With the in-depth research of data classification, multi-label classification has become a hot issue of research. Multi-label $\\boldsymbol{k}$-nearest neighbor (ML-$\\boldsymbol{k}$ NN) is a classification method which predicts the unclassified instances' labels by learning the classified instances. However, this method doesn't consider the interrelationships between attributes and labels. Considering the relationships between properties and labels can improve accuracy of classification methods, but the diversities of properties and labels will present the curse of dimensionality. This problem make such methods can not be expanded under the background of big data. To solve this problem, this paper proposes three methods, called multi-label $\\boldsymbol{k}$-nearest neighbor based on principal component analysis(PML-$\\boldsymbol{k}\\mathbf{NN}$), coupled similarity multi-label k-nearest neighbor based on principal component analysis(PCSML-$\\boldsymbol{k}\\mathbf{NN}$) and coupled similarity multi-label k-nearest neighbor classification based on feature selection (FCSML-$\\boldsymbol{k}\\mathbf{NN}$), which use feature extraction and feature selection to reduce the dimensions of labels' properties. We test the ML-$\\boldsymbol{k}\\mathbf{NN}$ and the three methods we proposed with two real data, the experimental results show that reduce the dimensions of labels' properties can improve the efficiency of classification methods.","PeriodicalId":161568,"journal":{"name":"2018 IEEE 16th International Conference on Software Engineering Research, Management and Applications (SERA)","volume":"395 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The Research of Multi-Label $k$-Nearest Neighbor Based on Descending Dimension\",\"authors\":\"Song Gao, Xiaodan Yang, Lihua Zhou, Shaowen Yao\",\"doi\":\"10.1109/SERA.2018.8477210\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the in-depth research of data classification, multi-label classification has become a hot issue of research. Multi-label $\\\\boldsymbol{k}$-nearest neighbor (ML-$\\\\boldsymbol{k}$ NN) is a classification method which predicts the unclassified instances' labels by learning the classified instances. However, this method doesn't consider the interrelationships between attributes and labels. Considering the relationships between properties and labels can improve accuracy of classification methods, but the diversities of properties and labels will present the curse of dimensionality. This problem make such methods can not be expanded under the background of big data. To solve this problem, this paper proposes three methods, called multi-label $\\\\boldsymbol{k}$-nearest neighbor based on principal component analysis(PML-$\\\\boldsymbol{k}\\\\mathbf{NN}$), coupled similarity multi-label k-nearest neighbor based on principal component analysis(PCSML-$\\\\boldsymbol{k}\\\\mathbf{NN}$) and coupled similarity multi-label k-nearest neighbor classification based on feature selection (FCSML-$\\\\boldsymbol{k}\\\\mathbf{NN}$), which use feature extraction and feature selection to reduce the dimensions of labels' properties. 
We test the ML-$\\\\boldsymbol{k}\\\\mathbf{NN}$ and the three methods we proposed with two real data, the experimental results show that reduce the dimensions of labels' properties can improve the efficiency of classification methods.\",\"PeriodicalId\":161568,\"journal\":{\"name\":\"2018 IEEE 16th International Conference on Software Engineering Research, Management and Applications (SERA)\",\"volume\":\"395 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE 16th International Conference on Software Engineering Research, Management and Applications (SERA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SERA.2018.8477210\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 16th International Conference on Software Engineering Research, Management and Applications (SERA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SERA.2018.8477210","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The Research of Multi-Label $k$-Nearest Neighbor Based on Descending Dimension
With the deepening of research on data classification, multi-label classification has become a hot research topic. Multi-label $k$-nearest neighbor (ML-$k$NN) is a classification method that predicts the labels of unclassified instances by learning from classified instances. However, this method does not consider the interrelationships between attributes and labels. Taking the relationships between attributes and labels into account can improve the accuracy of classification methods, but the diversity of attributes and labels introduces the curse of dimensionality, which prevents such methods from scaling in the context of big data. To solve this problem, this paper proposes three methods: multi-label $k$-nearest neighbor based on principal component analysis (PML-$k$NN), coupled-similarity multi-label $k$-nearest neighbor based on principal component analysis (PCSML-$k$NN), and coupled-similarity multi-label $k$-nearest neighbor based on feature selection (FCSML-$k$NN), which use feature extraction and feature selection to reduce the dimensionality of the labels' attributes. We test ML-$k$NN and the three proposed methods on two real-world datasets; the experimental results show that reducing the dimensionality of the labels' attributes can improve the efficiency of classification methods.
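To make the overall pipeline concrete, the sketch below shows the general pattern the abstract describes: reduce the attribute space with PCA (the feature-extraction step behind PML-$k$NN / PCSML-$k$NN) and then run a multi-label $k$-nearest-neighbor classifier on the reduced representation. This is a minimal illustration, not the authors' implementation: scikit-learn's KNeighborsClassifier with a binary indicator matrix stands in for ML-$k$NN, the synthetic dataset stands in for the two real datasets, and the parameter values (n_components=30, n_neighbors=10) are illustrative assumptions.

```python
# Illustrative sketch only: PCA-based dimensionality reduction followed by a
# multi-label k-NN classifier. Not the paper's PML-kNN/PCSML-kNN/FCSML-kNN code.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import hamming_loss

# Synthetic multi-label data standing in for the two real datasets used in the paper.
X, Y = make_multilabel_classification(n_samples=1000, n_features=100,
                                      n_classes=5, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# Feature-extraction step: project the high-dimensional attributes onto their
# leading principal components (n_components=30 is an assumed setting).
pca = PCA(n_components=30).fit(X_train)
X_train_r, X_test_r = pca.transform(X_train), pca.transform(X_test)

# Multi-label k-NN on the reduced representation (KNeighborsClassifier accepts
# a binary label-indicator matrix and predicts all labels jointly).
clf = KNeighborsClassifier(n_neighbors=10).fit(X_train_r, Y_train)
print("Hamming loss:", hamming_loss(Y_test, clf.predict(X_test_r)))
```

The feature-selection variant (the FCSML-$k$NN analogue) would follow the same pattern, swapping the PCA step for a selector that keeps a subset of the original attributes rather than projecting them.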