Topic Extraction and Classification for Questions Posted in Community-Based Question Answering Services
Q. Ma, M. Murata
2019 International Conference on Computational Science and Computational Intelligence (CSCI), 2019-12-01
DOI: 10.1109/CSCI49370.2019.00253 (https://doi.org/10.1109/CSCI49370.2019.00253)
This paper presents methods for simultaneously performing topic/keyword extraction and unsupervised classification of questions posted to community-based question answering (CQA) services, or Q&A websites, using topic models and hybrid models. Large-scale experiments on two kinds of data, called category data and subtyping data, show the effectiveness of our methods. Purity and correct-rate scores show that the topic models outperform clustering methods, that hybrid models outperform topic models in question classification, and that adopting term frequency-inverse document frequency (TF-IDF) is effective for the subtyping data. Manual evaluation of the extracted keywords shows the effectiveness of the topic models in topic extraction.
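The abstract reports purity as one of its evaluation measures but does not define it. As a rough illustration only, the sketch below computes purity under the standard textbook definition (each cluster is credited with its most frequent true label, and the credited counts are summed and divided by the number of items). The paper may use a different formulation. The function name `purity` and the toy cluster assignments and labels are hypothetical and are not taken from the paper's data.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Return the fraction of items that carry the majority true label of their cluster.

    Assumption: this is the standard definition of clustering purity,
    not necessarily the exact variant used in the paper.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count how often each true label occurs inside each cluster.
    label_counts_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts_by_cluster[cluster][label] += 1

    # Credit each cluster with the size of its largest label group.
    majority_total = sum(
        max(counts.values()) for counts in label_counts_by_cluster.values()
    )
    return majority_total / len(cluster_ids)


# Hypothetical toy data: eight questions assigned to three clusters.
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
labels = ["travel", "travel", "health", "health", "health", "pc", "pc", "travel"]
print(purity(clusters, labels))
```

In this toy case each cluster contributes two correctly grouped questions, so six of the eight items count toward purity. Note that purity rises trivially as the number of clusters grows, which is one reason it is usually reported alongside a second measure, as the abstract does with the correct rate.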