Xiaoyu Zhang, Shupeng Wang, Lei Zhang, Chunjie Zhang, Changsheng Li
Published in: 2016 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2016-04-10
DOI: 10.1109/INFCOMW.2016.7562161 (https://doi.org/10.1109/INFCOMW.2016.7562161)
Ensemble feature selection with discriminative and representative properties for malware detection
Malware data are typically described by extremely high-dimensional features, which places an excessive computational burden on detection methods. For both effectiveness and efficiency, feature selection is an indispensable part of malware detection. In this paper, we propose an ensemble feature selection method that integrates discriminative and representative properties for malware detection. The most discriminative features are selected from the labeled data, and the most representative features from the unlabeled data. The former are the features most distinctive with respect to the classes, and the latter are the features that best represent the data. A comprehensive metric combining the two is subsequently obtained, which retains the most informative features.
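The abstract does not state which scores or combination rule the method uses, so the following is only a minimal sketch of the general idea under stated assumptions. The Fisher score stands in for the discriminative criterion (labeled data), per-feature variance stands in for the representative criterion (unlabeled data), and the "comprehensive metric" is assumed to be a weighted sum of the two min-max normalized scores. The function names, the weight `alpha`, and the toy data are all hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def columns(rows):
    """Transpose a list of samples into a list of feature columns."""
    return [list(col) for col in zip(*rows)]


def fisher_scores(rows, labels):
    """Assumed discriminative metric: Fisher score per feature (labeled data)."""
    scores = []
    for col in columns(rows):
        overall = fmean(col)
        between = within = 0.0
        for c in set(labels):
            vals = [v for v, lab in zip(col, labels) if lab == c]
            between += len(vals) * (fmean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        # Small constant avoids division by zero for constant features.
        scores.append(between / (within + 1e-12))
    return scores


def variance_scores(rows):
    """Assumed representative metric: variance per feature (unlabeled data)."""
    return [pvariance(col) for col in columns(rows)]


def min_max(scores):
    """Rescale scores to [0, 1] so the two metrics are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def select_features(labeled, labels, unlabeled, k, alpha=0.5):
    """Rank features by an assumed weighted sum of both scores; keep the top k."""
    d = min_max(fisher_scores(labeled, labels))
    r = min_max(variance_scores(unlabeled))
    combined = [alpha * di + (1 - alpha) * ri for di, ri in zip(d, r)]
    ranked = sorted(range(len(combined)), key=lambda j: combined[j], reverse=True)
    return ranked[:k]


# Toy data (hypothetical): feature 0 separates the classes and varies widely,
# feature 1 is constant, feature 2 varies slightly and ignores the classes.
labeled = [[0.0, 1.0, 5.0], [0.1, 1.0, 5.1], [10.0, 1.0, 5.0], [10.1, 1.0, 5.1]]
labels = [0, 0, 1, 1]
unlabeled = [[0.0, 1.0, 5.0], [10.0, 1.0, 5.2], [5.0, 1.0, 5.1]]

selected = select_features(labeled, labels, unlabeled, k=2)
print(selected)
```

In this sketch `alpha` trades off the two properties: `alpha=1.0` ranks features by the labeled data alone, and `alpha=0.0` by the unlabeled data alone.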