{"title":"CQA网站中使用监督学习的问答帖子的主题权威答疑人识别","authors":"T. P. Sahu, N. K. Nagwani, Shrish Verma","doi":"10.1145/2998476.2998490","DOIUrl":null,"url":null,"abstract":"Community Question Answering (CQA) site is an online platform for hosting information in question-answer form by collaborative users worldwide. There are basically two types of user in this CQA sites: Asker -- who post their query as questions and Answerer -- who provide the answers to these questions. The semi-structured and growing size of contents in CQA sites is posing several challenges. As there is no restriction in posting the number of answers to a question, so the common challenge is to identify the authoritative answerers of a question in order to evaluate the answer quality for selecting the best answer. In this paper, we use latent dirichlet allocation (LDA) the statistical topic modelling on textual data and statistical computing on metadata to identify the features that would reflect the topical authoritative of answerer. Then these features are represented as vector for each answerer of the dataset under investigation for learning the classifier model. The various baseline classifier model are used to identify the topical authoritative answerer on Q&A posts of two real dataset extracted from StackOverflow and AskUbuntu. The correctness and effectiveness of classifier models are evaluated using various parameters like accuracy, precision, recall, and kappa statistic. The experimental result shows that Random Forest classifier outperforms over each evaluation parameter than other classification algorithms.","PeriodicalId":171399,"journal":{"name":"Proceedings of the 9th Annual ACM India Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Topical Authoritative Answerer Identification on Q&A Posts using Supervised Learning in CQA Sites\",\"authors\":\"T. P. Sahu, N. K. Nagwani, Shrish Verma\",\"doi\":\"10.1145/2998476.2998490\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Community Question Answering (CQA) site is an online platform for hosting information in question-answer form by collaborative users worldwide. There are basically two types of user in this CQA sites: Asker -- who post their query as questions and Answerer -- who provide the answers to these questions. The semi-structured and growing size of contents in CQA sites is posing several challenges. As there is no restriction in posting the number of answers to a question, so the common challenge is to identify the authoritative answerers of a question in order to evaluate the answer quality for selecting the best answer. In this paper, we use latent dirichlet allocation (LDA) the statistical topic modelling on textual data and statistical computing on metadata to identify the features that would reflect the topical authoritative of answerer. Then these features are represented as vector for each answerer of the dataset under investigation for learning the classifier model. The various baseline classifier model are used to identify the topical authoritative answerer on Q&A posts of two real dataset extracted from StackOverflow and AskUbuntu. The correctness and effectiveness of classifier models are evaluated using various parameters like accuracy, precision, recall, and kappa statistic. The experimental result shows that Random Forest classifier outperforms over each evaluation parameter than other classification algorithms.\",\"PeriodicalId\":171399,\"journal\":{\"name\":\"Proceedings of the 9th Annual ACM India Conference\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-10-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 9th Annual ACM India Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2998476.2998490\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 9th Annual ACM India Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2998476.2998490","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Topical Authoritative Answerer Identification on Q&A Posts using Supervised Learning in CQA Sites
Community Question Answering (CQA) site is an online platform for hosting information in question-answer form by collaborative users worldwide. There are basically two types of user in this CQA sites: Asker -- who post their query as questions and Answerer -- who provide the answers to these questions. The semi-structured and growing size of contents in CQA sites is posing several challenges. As there is no restriction in posting the number of answers to a question, so the common challenge is to identify the authoritative answerers of a question in order to evaluate the answer quality for selecting the best answer. In this paper, we use latent dirichlet allocation (LDA) the statistical topic modelling on textual data and statistical computing on metadata to identify the features that would reflect the topical authoritative of answerer. Then these features are represented as vector for each answerer of the dataset under investigation for learning the classifier model. The various baseline classifier model are used to identify the topical authoritative answerer on Q&A posts of two real dataset extracted from StackOverflow and AskUbuntu. The correctness and effectiveness of classifier models are evaluated using various parameters like accuracy, precision, recall, and kappa statistic. The experimental result shows that Random Forest classifier outperforms over each evaluation parameter than other classification algorithms.