Expert as a Service: Software Expert Recommendation via Knowledge Domain Embeddings in Stack Overflow

Chaoran Huang, Lina Yao, Xianzhi Wang, B. Benatallah, Quan Z. Sheng
Published in: 2017 IEEE International Conference on Web Services (ICWS), 2017-06-01
DOI: 10.1109/ICWS.2017.122 (https://doi.org/10.1109/ICWS.2017.122)
Citations: 16

Abstract

Question answering (Q&A) communities have recently gained momentum as an effective means of crowd-based knowledge sharing; many of their users are real-world experts who can make quality contributions in particular domains or technologies. Although the massive volume of user-generated Q&A data is a valuable source of human knowledge, finding those expert users effectively remains a challenge. In this paper, we propose a framework for finding such experts in a collaborative network. Building on recent work on distributed word representations, we summarize text chunks from a semantic perspective and infer knowledge domains by clustering pre-trained word vectors. In particular, we use a graph-based clustering method for knowledge domain extraction and uncover the shared latent factors using matrix factorization techniques. The proposed clustering method requires no post-processing of clustering indicators, and the matrix factorization method is combined with the semantic similarity of historical answers to rank users by expertise for a given query. We evaluate the proposed approach on Stack Overflow, a website with a large user base and a large number of posts on topics related to computer programming, and conduct extensive experiments to show its effectiveness.
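The abstract does not give the scoring formula, so the following is only a minimal sketch of the ranking idea it describes: a latent-factor score from matrix factorization combined with the semantic similarity between a query and a user's historical answers. Everything named here is an assumption made for illustration: the functions `embed_text`, `cosine`, `factorize`, and `rank_experts`, the mixing weight `alpha`, the averaging of word vectors to embed text, and the plain gradient-descent factorization. The paper's graph-based clustering is not reproduced; the knowledge domain of the query is assumed to be known already.

```python
import numpy as np

def embed_text(tokens, word_vectors):
    """Embed a text as the mean of its known word vectors (assumed scheme)."""
    dim = len(next(iter(word_vectors.values())))
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(known, axis=0) if known else np.zeros(dim)

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def factorize(R, rank=2, steps=500, lr=0.01, reg=0.02, seed=0):
    """Approximate a user-by-domain matrix as R ~ U @ V.T.

    Only nonzero entries of R are treated as observed. This is generic
    regularized gradient descent, not the paper's specific method.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(R.shape[0], rank))
    V = rng.normal(scale=0.1, size=(R.shape[1], rank))
    mask = R > 0
    for _ in range(steps):
        E = mask * (R - U @ V.T)  # error on observed entries only
        U, V = U + lr * (E @ V - reg * U), V + lr * (E.T @ U - reg * V)
    return U, V

def rank_experts(query_tokens, domain_id, R_hat, answer_vecs, word_vectors, alpha=0.5):
    """Rank users by alpha * latent score + (1 - alpha) * answer similarity.

    R_hat: predicted user-by-domain scores (e.g. U @ V.T, rescaled as needed).
    answer_vecs: maps a user's row index to the embedding of their past answers.
    """
    q = embed_text(query_tokens, word_vectors)
    scores = {
        user: alpha * R_hat[user, domain_id] + (1 - alpha) * cosine(q, vec)
        for user, vec in answer_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

In this sketch the two terms are on different scales unless `R_hat` is normalized, so in practice `alpha` and any rescaling would need tuning against held-out data.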