Mohammad Salim Ahmed, Sourabh Jain, F. B. Muhaya, L. Khan
2013 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS). Published 2013-04-16. DOI: 10.1109/EAIS.2013.6604107 (https://doi.org/10.1109/EAIS.2013.6604107)
Predicted probability enhancement for multi-label text classification using class label pair association
To extract knowledge from the growing volume of information available over the Internet, we must first classify that information. Classification is a widely researched topic in data mining, and text data, which represents a significant portion of that information, has naturally attracted significant research interest. However, text classification presents its own problems: high and sparse dimensionality, because the attributes span a huge set of natural-language words, and the multi-label property, because each document may belong to more than one class simultaneously. Any solution that classifies such data without considering these facts cannot yield optimal results. In this paper, we discuss an approach that uses fuzzy clustering to handle the high dimensionality of the data and, as a post-processing step, uses inter-class correlation information in the form of class label pairs to enhance the prediction probabilities in multi-label classification. We use correlation information in both positive (rewarding) and negative (penalizing) terms to enhance the probability metrics for multi-label classification. We have tested the proposed algorithm on a number of benchmark data sets and achieved better performance than existing approaches.
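The abstract does not give the formula used in the post-processing step, so the following is only a minimal sketch of the general idea of rewarding and penalizing predicted probabilities with label-pair associations. The association measure (P(j | i) minus P(j)), the function names `pair_association` and `enhance`, and the `weight` parameter are all assumptions made for illustration, not the authors' method.

```python
from typing import List


def pair_association(labels: List[List[int]]) -> List[List[float]]:
    """Estimate label-pair association from a 0/1 training label matrix.

    Entry [i][j] is P(j | i) - P(j): positive when label j is more likely
    given label i (a rewarding pair), negative when it is less likely
    (a penalizing pair). This measure is an assumption, not from the paper.
    """
    n = len(labels)
    k = len(labels[0])
    prior = [sum(row[j] for row in labels) / n for j in range(k)]
    assoc = [[0.0] * k for _ in range(k)]
    for i in range(k):
        rows_i = [row for row in labels if row[i] == 1]
        if not rows_i:
            continue  # label i never occurs; leave its row at zero
        for j in range(k):
            if i != j:
                cond = sum(row[j] for row in rows_i) / len(rows_i)
                assoc[i][j] = cond - prior[j]
    return assoc


def enhance(probs: List[float], assoc: List[List[float]],
            weight: float = 0.5) -> List[float]:
    """Post-process one document's predicted label probabilities.

    Each probability is shifted by the association-weighted evidence from
    the other labels, then clamped to [0, 1]. `weight` is a hypothetical
    tuning parameter.
    """
    k = len(probs)
    out = []
    for j in range(k):
        shift = sum(probs[i] * assoc[i][j] for i in range(k) if i != j)
        out.append(min(1.0, max(0.0, probs[j] + weight * shift)))
    return out


# Toy data: labels 0 and 1 always co-occur, label 2 never occurs with them.
train = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]]
assoc = pair_association(train)
adjusted = enhance([0.9, 0.4, 0.3], assoc)
```

In this toy setting, a confident prediction for label 0 raises the probability of label 1 and lowers that of label 2. Any base classifier that outputs per-label probabilities could feed `enhance`.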