Jie Chen, Yifan Hu, Hailin Liu, Bin Lv, Lin Cao, Hui Li
Research on Matching Method of Ocean Observation Data Based on DC-WKNN Algorithm
2020 5th International Conference on Automation, Control and Robotics Engineering (CACRE), 2020-09-01
DOI: 10.1109/CACRE50138.2020.9230223
https://doi.org/10.1109/CACRE50138.2020.9230223
Abstract
Ocean observation data are massive, heterogeneous, multi-source, multi-class, and multi-dimensional, which makes it difficult for traditional KNN to classify and match them quickly and accurately for large-scale integration. This paper proposes a matching method for ocean observation data based on density clipping and weighted KNN (DC-WKNN). First, a clipping rule is set according to the distribution density of training samples between different classes. The rule selects representative samples as the new training set, which reduces the computation required by traditional KNN and improves efficiency. Then, a weight assignment model is established according to the distribution of the training samples within each class. The model assigns a weight to each training sample, which reduces the misjudgment of boundary points far from the class center and improves accuracy. Extensive experiments on a data set from the seafloor observatory network show that the computational complexity is reduced by about 20% and that the accuracy of the algorithm is better than that of traditional KNN and other improved algorithms. The method performs well for big data classification, especially for classifying the characteristics of ocean observation data.
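
The abstract does not give the clipping rule or the weight model, so the two-stage idea can only be illustrated under assumptions. The following is a minimal Python sketch, not the authors' implementation. The function names (`density_clip`, `centroid_weights`, `wknn_predict`), the rule of dropping the densest fraction of each class, the centroid-distance weight formula, and the parameters `clip_ratio`, `n_neighbors`, and `k` are all hypothetical choices made for illustration.

```python
import numpy as np


def density_clip(X, y, clip_ratio=0.25, n_neighbors=3):
    """Assumed clipping rule: per class, drop the `clip_ratio` densest samples."""
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        pts = X[idx]
        # Pairwise Euclidean distances within this class, sorted per row.
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        d.sort(axis=1)
        # Mean distance to the nearest same-class neighbours (column 0 is self).
        m = min(n_neighbors, len(idx) - 1)
        sparsity = d[:, 1:m + 1].mean(axis=1) if m > 0 else np.zeros(len(idx))
        n_drop = int(clip_ratio * len(idx))
        # Ascending sort puts the densest samples first; those are dropped.
        order = np.argsort(sparsity)
        keep.extend(idx[order[n_drop:]])
    keep = np.sort(np.array(keep))
    return X[keep], y[keep]


def centroid_weights(X, y):
    """Assumed weight model: samples far from their class centroid weigh less."""
    w = np.empty(len(X))
    for label in np.unique(y):
        mask = y == label
        dist = np.linalg.norm(X[mask] - X[mask].mean(axis=0), axis=1)
        scale = dist.mean() or 1.0  # avoid division by zero for degenerate classes
        w[mask] = 1.0 / (1.0 + dist / scale)
    return w


def wknn_predict(X, y, w, query, k=3):
    """Weighted vote among the k training samples nearest to `query`."""
    d = np.linalg.norm(X - query, axis=1)
    nearest = np.argsort(d)[:k]
    labels = np.unique(y[nearest])
    votes = [w[nearest][y[nearest] == c].sum() for c in labels]
    return labels[int(np.argmax(votes))]
```

A typical call sequence would clip the training set once, compute the weights once on the clipped set, and then call `wknn_predict` per query. Clipping shrinks the set that every query must be compared against, which is where a saving in computation would come from. The figure of about 20% reported in the abstract belongs to the authors' own rule and is not reproduced by this sketch.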