Dynamic mutual information-based feature selection for multi-label learning
Authors: Kyung-jun Kim, C. Jun
Journal: Intelligent Data Analysis, volume 153 1, pages 891-909
Published: 2023-05-25
DOI: https://doi.org/10.3233/ida-226666
Impact factor: 0.9, JCR: Q4 (Computer Science, Artificial Intelligence)
Citations: 1
Abstract
In classification problems, feature selection identifies important input features, reducing the dimensionality of the input space while improving or maintaining classification performance. Traditional feature selection algorithms are designed for single-label learning, but classification problems have recently emerged in the multi-label domain. In this study, we propose a novel feature selection algorithm for classifying multi-label data. The proposed method is based on dynamic mutual information, which can handle redundancy among features controlling the input space. We compare the proposed method with several existing problem transformation and algorithm adaptation methods on real multi-label datasets, using multi-label accuracy and Hamming loss as metrics. The results show that the proposed method achieves more stable and better performance on nearly all of the multi-label datasets.
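The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the common definitions: Hamming loss as the fraction of instance-label pairs predicted incorrectly, and multi-label accuracy as the per-instance ratio of correctly predicted labels to the union of true and predicted labels, averaged over instances. The function names, the binary indicator-matrix input format, and the convention of scoring an instance with no true and no predicted labels as 1.0 are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence

LabelMatrix = Sequence[Sequence[int]]  # rows = instances, columns = labels (0 or 1)


def hamming_loss(y_true: LabelMatrix, y_pred: LabelMatrix) -> float:
    """Fraction of instance-label pairs where prediction and truth differ."""
    mismatches = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            mismatches += int(t != p)
            total += 1
    return mismatches / total


def multilabel_accuracy(y_true: LabelMatrix, y_pred: LabelMatrix) -> float:
    """Mean over instances of |true AND pred| / |true OR pred|."""
    scores = []
    for true_row, pred_row in zip(y_true, y_pred):
        inter = sum(1 for t, p in zip(true_row, pred_row) if t == 1 and p == 1)
        union = sum(1 for t, p in zip(true_row, pred_row) if t == 1 or p == 1)
        # Assumed convention: an instance with no true and no predicted
        # labels counts as a perfect match.
        scores.append(1.0 if union == 0 else inter / union)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    y_true = [[1, 0, 1], [0, 1, 0]]
    y_pred = [[1, 0, 0], [0, 1, 0]]
    print(hamming_loss(y_true, y_pred))         # 1 wrong pair out of 6
    print(multilabel_accuracy(y_true, y_pred))  # mean of 1/2 and 1/1
```

Lower Hamming loss and higher multi-label accuracy both indicate better predictions, so the two metrics move in opposite directions when a method improves.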
About the journal:
Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss the development of new AI-related data analysis architectures, methodologies, and techniques and their applications to various domains.