Title: Feature Selection Based on Distance Measurement (基于距离测量的特征选择)
Authors: Mingming Yang, Junchuan Yang
Journal: Journal of New Media (新媒体杂志(英文))
DOI: https://doi.org/10.32604/JNM.2021.018267
Publication date: 2021-01-01
Publication type: Journal Article
Abstract: Every day we receive large amounts of information through social media and software, and data mining methods make it possible to put this data to use. In data mining, high-dimensional problems are often addressed by performing feature selection on limited training samples so that only the effective features are retained. This paper focuses on two Relief-family feature selection algorithms, Relief and ReliefF, and analyzes the differences between them and their respective scopes of application. Starting from the Relief algorithm, a subset of high-weight features is obtained; the correlation between features is then computed with a mutual information distance measure, and highly redundant features are removed to yield a higher-quality feature subset. Experimental results on six datasets show the effectiveness of the method.
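The two-stage idea in the abstract (rank features by Relief weight, then drop features that are redundant under a mutual information distance) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the authors' method: the basic two-class Relief update, equal-width binning, the normalized distance 1 - I(X;Y)/H(X,Y), and the hypothetical names `relief_weights`, `mi_distance`, `select_features`, `top_k`, and `min_dist`.

```python
import numpy as np

def relief_weights(X, y, n_iter=100, seed=0):
    """Basic two-class Relief (assumes every class has at least two samples)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Scale each feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    n, d = Xs.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to sample i
        dist[i] = np.inf                       # never pick the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        # Reward features that differ on the miss, penalize those that differ on the hit.
        w += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / n_iter
    return w

def entropy(values):
    """Shannon entropy (bits) of a discrete vector, or of the rows of a 2-D array."""
    _, counts = np.unique(values, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mi_distance(a, b):
    """Assumed distance 1 - I(a;b)/H(a,b): 0 for fully redundant, 1 for independent."""
    h_joint = entropy(np.column_stack([a, b]))
    if h_joint == 0:  # both vectors are constant, treat as redundant
        return 0.0
    mutual_info = entropy(a) + entropy(b) - h_joint
    return 1.0 - mutual_info / h_joint

def discretize(col, n_bins=5):
    """Equal-width binning so entropies can be estimated from counts."""
    edges = np.histogram_bin_edges(col, bins=n_bins)
    return np.digitize(col, edges[1:-1])

def select_features(X, y, top_k=10, min_dist=0.5, n_bins=5):
    """Keep the top_k Relief features, then greedily drop redundant ones."""
    X = np.asarray(X, dtype=float)
    w = relief_weights(X, y)
    order = np.argsort(w)[::-1][:top_k]  # highest weight first
    binned = np.column_stack([discretize(X[:, j], n_bins) for j in range(X.shape[1])])
    kept = []
    for f in order:
        # Keep f only if it is far enough from every feature already kept.
        if all(mi_distance(binned[:, f], binned[:, g]) >= min_dist for g in kept):
            kept.append(int(f))
    return kept, w
```

Calling `select_features(X, y, top_k=10, min_dist=0.5)` returns the indices of the retained features together with the Relief weights. ReliefF, which the abstract also discusses, extends this scheme to multi-class and noisy data by averaging over several nearest hits and misses; that extension is not shown here.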