Text classification based on a novel ensemble multi-label learning method
Authors: Zhang Tao, Jiansheng Wu, Haifeng Hu
Venue: The 2014 2nd International Conference on Systems and Informatics (ICSAI 2014)
Published: 2014-11-01
DOI: 10.1109/ICSAI.2014.7009425 (https://doi.org/10.1109/ICSAI.2014.7009425)
Text classification is one of the most important topics in natural language processing research. In most real-world cases, text classification is a multi-label learning task. Three mainstream attribute measures (information gain, document frequency, and chi-square test values) are commonly used to describe documents. Each of these measures has been applied successfully to some text classification tasks, but each captures different information. It is therefore worthwhile to design ensemble methods that combine these measures to improve prediction performance. In this paper, we propose En-MLKNN, a novel ensemble multi-label learning method built on the state-of-the-art multi-label learning method MLKNN. To make better use of our approach, we also construct a complete framework for text classification. Experiments on two classic datasets show that En-MLKNN outperforms most state-of-the-art multi-label learning algorithms.
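To illustrate why the three attribute measures can disagree, the following minimal sketch scores a single term against a single binary label using the standard textbook definitions of document frequency, chi-square, and information gain. It is not the paper's implementation: the function names `entropy` and `term_scores`, the toy documents, and the choice of one binary label are assumptions made for illustration. In a multi-label setting, such scores would typically be computed once per label.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def term_scores(docs, labels, term):
    """Score one term against one binary label (hypothetical helper).

    docs   -- list of token sets, one per document
    labels -- list of 0/1 values, one per document
    """
    # 2x2 contingency table: term present/absent vs. label positive/negative.
    a = b = c = d = 0
    for tokens, y in zip(docs, labels):
        present = term in tokens
        if present and y:
            a += 1
        elif present:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    n = a + b + c + d

    # Document frequency: number of documents containing the term.
    df = a + b

    # Chi-square statistic for independence of term and label.
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0

    # Information gain: label entropy minus entropy conditioned on the term.
    ig = (
        entropy([a + c, b + d])
        - (df / n) * entropy([a, b])
        - ((n - df) / n) * entropy([c, d])
    )
    return {"df": df, "chi2": chi2, "ig": ig}


# Toy corpus (made up for illustration): label 1 = sports, 0 = finance.
docs = [
    {"goal", "match"},
    {"goal", "team"},
    {"stock", "market"},
    {"market", "team"},
]
labels = [1, 1, 0, 0]

goal = term_scores(docs, labels, "goal")
team = term_scores(docs, labels, "team")
print(goal)
print(team)
```

In this toy corpus, "goal" and "team" have the same document frequency, yet only "goal" separates the two classes, so chi-square and information gain rank them differently. This is the kind of disagreement between measures that motivates combining them in an ensemble.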