Zhixing Li, Tao Wang, Yang Zhang, Y. Zhan, Gang Yin
{"title":"Query reformulation by leveraging crowd wisdom for scenario-based software search","authors":"Zhixing Li, Tao Wang, Yang Zhang, Y. Zhan, Gang Yin","doi":"10.1145/2993717.2993723","DOIUrl":null,"url":null,"abstract":"The Internet-scale open source software (OSS) production in various communities are generating abundant reusable resources for software developers. However, how to retrieve and reuse the desired and mature software from huge amounts of candidates is a great challenge: there are usually big gaps between the user application contexts (that often used as queries) and the OSS key words (that often used to match the queries). In this paper, we define the scenario-based query problem for OSS retrieval, and then we propose a novel approach to reformulate the raw query by leveraging the crowd wisdom from millions of developers to improve the retrieval results. We build a software-specific domain lexical database based on the knowledge in open source communities, by which we can expand and optimize the input queries. The experiment results show that, our approach can reformulate the initial query effectively and outperforms other existing search engines significantly at finding mature software.","PeriodicalId":20631,"journal":{"name":"Proceedings of the 8th Asia-Pacific Symposium on Internetware","volume":"28 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 8th Asia-Pacific Symposium on Internetware","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2993717.2993723","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 22
Abstract
The Internet-scale open source software (OSS) production in various communities are generating abundant reusable resources for software developers. However, how to retrieve and reuse the desired and mature software from huge amounts of candidates is a great challenge: there are usually big gaps between the user application contexts (that often used as queries) and the OSS key words (that often used to match the queries). In this paper, we define the scenario-based query problem for OSS retrieval, and then we propose a novel approach to reformulate the raw query by leveraging the crowd wisdom from millions of developers to improve the retrieval results. We build a software-specific domain lexical database based on the knowledge in open source communities, by which we can expand and optimize the input queries. The experiment results show that, our approach can reformulate the initial query effectively and outperforms other existing search engines significantly at finding mature software.