On semi-supervised learning of Dirichlet Mixture Models for Web content classification
J. Bai, XiaoPing Li, Xiaoxian Zhang
2010 International Conference on Educational and Information Technology, 2010-10-25
DOI: 10.1109/ICEIT.2010.5607590 (https://doi.org/10.1109/ICEIT.2010.5607590)
This paper presents a method for designing a semi-supervised classifier trained on both labeled and unlabeled instances. We explore the trade-off between maximizing a discriminative likelihood of the labeled data and a generative likelihood of the labeled and unlabeled data. Mixture models are a flexible model family whose uses include generative modeling and density estimation. We investigate semi-supervised learning of mixture models using a unified objective function that takes both labeled and unlabeled data into account. Experiments on the WebKB and 20 Newsgroups datasets show that adding unlabeled data improves classification accuracy over the supervised model.
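The abstract does not state the unified objective itself, so the following is only a minimal sketch of one common way such a trade-off can be written: a weighted sum of a discriminative log-likelihood on the labeled documents and a generative log-likelihood on all documents. Everything here is an assumption for illustration, not the authors' formulation: the weight `lam`, the function names, the toy word counts, and the use of one multinomial component per class in place of the Dirichlet mixture components named in the title.

```python
import numpy as np

def log_joint(X, log_prior, log_theta):
    # log p(x, y=k) for every document (row) and class (column).
    # Assumption: one multinomial component per class; the multinomial
    # coefficient is dropped because it does not depend on the parameters.
    return X @ log_theta.T + log_prior

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def hybrid_objective(X_lab, y_lab, X_unl, log_prior, log_theta, lam):
    # lam = 1 keeps only the discriminative term, lam = 0 only the generative one.
    lj_lab = log_joint(X_lab, log_prior, log_theta)
    lj_unl = log_joint(X_unl, log_prior, log_theta)
    rows = np.arange(len(y_lab))
    # Generative part: log p(x, y) on labeled data, log p(x) on unlabeled data.
    gen_lab = lj_lab[rows, y_lab].sum()
    gen_unl = logsumexp(lj_unl, axis=1).sum()
    # Discriminative part: log p(y | x) = log p(x, y) - log p(x) on labeled data.
    disc = (lj_lab[rows, y_lab] - logsumexp(lj_lab, axis=1)).sum()
    return lam * disc + (1.0 - lam) * (gen_lab + gen_unl)

# Toy word-count data (hypothetical): 3-word vocabulary, 2 classes.
X_lab = np.array([[3, 0, 1], [0, 4, 1]])
y_lab = np.array([0, 1])
X_unl = np.array([[2, 1, 0], [0, 3, 2]])
log_prior = np.log(np.array([0.5, 0.5]))
log_theta = np.log(np.array([[0.7, 0.1, 0.2], [0.1, 0.7, 0.2]]))

j_disc = hybrid_objective(X_lab, y_lab, X_unl, log_prior, log_theta, lam=1.0)
j_gen = hybrid_objective(X_lab, y_lab, X_unl, log_prior, log_theta, lam=0.0)
j_mix = hybrid_objective(X_lab, y_lab, X_unl, log_prior, log_theta, lam=0.5)
print(j_disc, j_gen, j_mix)
```

In a sketch like this, training would maximize the objective over `log_prior` and `log_theta` for a chosen `lam`; the unlabeled documents influence the parameters only through the generative term.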