Semi-supervised topic learning and representation method based on association rules and metadata
Authors: Zhao Huiru, Lin Min
Published in: 2017 2nd IEEE International Conference on Computational Intelligence and Applications (ICCIA), 2017-09-01
DOI: 10.1109/CIAPP.2017.8167059
To address the poor semantic interpretability and low accuracy of existing topic models, we propose a semi-supervised topic learning and representation method based on association rules and metadata. First, metadata is used as prior knowledge to guide topic learning, which yields the probability distribution of terms in the documents. Next, weighted association rules are used to obtain the frequent three-item sets of each topic. The metadata of the experimental documents is then used to improve semantic similarity through an improved vector space model algorithm. The result is a set of topic semantics that match the actual content more closely and are easier to interpret. The proposed method and the LDA topic model representation method were compared experimentally on the same data set. The results show that the proposed method outperforms the LDA topic model representation in both topic extraction accuracy and topic granularity, validating its effectiveness.
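The two middle steps of the pipeline (mining frequent three-item sets with weights, then ranking them against document metadata by vector space similarity) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not define the weighting scheme or the "improved" vector space model, so the sketch assumes a weighted support equal to the document share of an itemset scaled by the mean weight of its terms, and uses plain cosine similarity. All function names, the `term_weight` mapping, and the `min_support` threshold are hypothetical. The metadata-guided topic learning step is not covered here.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def weighted_frequent_triples(docs, term_weight, min_support):
    """Return {three-term itemset: weighted support} for itemsets meeting min_support.

    Assumption: weighted support = (share of documents containing the itemset)
    * (mean weight of its three terms). The paper may define it differently.
    """
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc))  # each term counts once per document
        for triple in combinations(terms, 3):
            counts[triple] += 1

    result = {}
    for triple, n in counts.items():
        mean_weight = sum(term_weight.get(t, 0.0) for t in triple) / 3
        support = mean_weight * n / len(docs)
        if support >= min_support:
            result[triple] = support
    return result


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two bags of terms (plain vector space model)."""
    a, b = Counter(terms_a), Counter(terms_b)
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_topic_label(docs, term_weight, metadata_terms, min_support=0.3):
    """Pick the frequent triple most similar to the metadata terms.

    Ties on similarity are broken by weighted support, then alphabetically.
    """
    triples = weighted_frequent_triples(docs, term_weight, min_support)
    if not triples:
        return None
    return max(
        triples,
        key=lambda t: (cosine_similarity(t, metadata_terms), triples[t], t),
    )


if __name__ == "__main__":
    # Toy corpus; in the described method the term weights would come from
    # the learned topic-term distribution rather than being set by hand.
    docs = [
        ["topic", "model", "lda", "rule"],
        ["topic", "model", "lda"],
        ["topic", "model", "metadata"],
    ]
    weights = {t: 1.0 for doc in docs for t in doc}
    print(best_topic_label(docs, weights, ["lda", "topic"]))
```

In this sketch the metadata acts only as a ranking signal over candidate itemsets. A fuller version would also feed it into the topic learning step, as the abstract describes.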