Expert as a Service: Software Expert Recommendation via Knowledge Domain Embeddings in Stack Overflow

2017 IEEE International Conference on Web Services (ICWS). Pub Date: 2017-06-01. DOI: 10.1109/ICWS.2017.122

Chaoran Huang, Lina Yao, Xianzhi Wang, B. Benatallah, Quan Z. Sheng


Citation count: 16

Abstract

Question answering (Q&A) communities have recently gained momentum as an effective means of crowd-based knowledge sharing: many of their users are real-world experts who can make quality contributions in particular domains or technologies. Although the massive volume of user-generated Q&A data is a valuable source of human knowledge, finding those expert users effectively remains a challenge. In this paper, we propose a framework for finding such experts in a collaborative network. Building on recent work on distributed word representations, we summarize text chunks from a semantic perspective and infer knowledge domains by clustering pre-trained word vectors. In particular, we use a graph-based clustering method for knowledge domain extraction and discern the shared latent factors using matrix factorization techniques. The proposed clustering method requires no post-processing of clustering indicators, and the matrix factorization method is combined with the semantic similarity of historical answers to rank users by expertise for a given query. We evaluate the proposed approach on Stack Overflow, a website with a large user base and a large number of posts on topics related to computer programming, and conduct extensive experiments to show its effectiveness.
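
The ranking idea in the abstract (a latent-factor score combined with the semantic similarity between a query and a user's historical answers) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the 3-dimensional word vectors, the two knowledge domains and their centroids, the hard-coded user and domain factors (which stand in for factors that would be learned by matrix factorization), the nearest-centroid domain lookup (which stands in for the graph-based clustering), and the weight `alpha` are all hypothetical.

```python
import math

# Hypothetical 3-d "pre-trained" word vectors; real ones would come from
# a distributed word representation model trained on a large corpus.
WORD_VECS = {
    "python": [0.9, 0.1, 0.0],
    "pandas": [0.8, 0.2, 0.0],
    "list":   [0.7, 0.3, 0.1],
    "css":    [0.0, 0.9, 0.1],
    "html":   [0.1, 0.8, 0.1],
    "layout": [0.1, 0.9, 0.2],
}

# Hypothetical knowledge-domain centroids (assumed output of a clustering step).
DOMAIN_CENTROIDS = {"python": [0.8, 0.2, 0.0], "web": [0.1, 0.9, 0.1]}

# Hypothetical latent factors (assumed output of a matrix factorization step).
USER_FACTORS = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
DOMAIN_FACTORS = {"python": [1.0, 0.0], "web": [0.0, 1.0]}

# Toy historical answers per user.
ANSWERS = {"alice": ["python pandas list"], "bob": ["css html layout"]}


def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [WORD_VECS[w] for w in text.lower().split() if w in WORD_VECS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rank_experts(query, alpha=0.5):
    """Rank users by alpha * latent score + (1 - alpha) * semantic similarity."""
    q = embed(query)
    # Assign the query to the domain whose centroid is most similar to it.
    domain = max(DOMAIN_CENTROIDS, key=lambda d: cosine(q, DOMAIN_CENTROIDS[d]))
    scores = {}
    for user, texts in ANSWERS.items():
        # Semantic part: best match between the query and any past answer.
        semantic = max((cosine(q, embed(t)) for t in texts), default=0.0)
        # Latent part: dot product of user factors and domain factors.
        latent = sum(u * v for u, v in zip(USER_FACTORS[user], DOMAIN_FACTORS[domain]))
        scores[user] = alpha * latent + (1 - alpha) * semantic
    return sorted(scores, key=scores.get, reverse=True)


print(rank_experts("python list"))
print(rank_experts("css layout"))
```

With these toy inputs, a Python-related query ranks the user whose past answers and latent factors align with the Python domain first, and a CSS-related query does the opposite. The linear blend controlled by `alpha` is only one plausible way to combine the two signals; the abstract does not state how the combination is done.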


