Scalable Relevant Project Recommendation on GitHub

Wenyuan Xu, Xiaobing Sun, Xin Xia, Xiang Chen
Proceedings of the 9th Asia-Pacific Symposium on Internetware, published 2017-09-23
DOI: 10.1145/3131704.3131706 (https://doi.org/10.1145/3131704.3131706)
Citations: 10

Abstract

GitHub, one of the largest social coding platforms, fosters a flexible and collaborative development process. In practice, developers on this open source platform need to find projects relevant to their own work so that they can reuse functionality, explore ideas for possible features, or analyze the requirements for their projects. Recommending relevant projects to a developer is difficult: millions of projects are hosted on GitHub, and different developers may have different requirements for what counts as relevant. In this paper, we propose a scalable and personalized approach that recommends projects by leveraging both developers' behaviors and project features. Based on the features of the projects a developer has created and the developer's behaviors toward other projects, our approach automatically recommends the top N most relevant software projects to that developer. Moreover, to improve scalability, we implement our approach on a parallel processing framework (i.e., Apache Spark) so that large-scale GitHub data can be analyzed for efficient recommendation. We perform an empirical study on data crawled from GitHub, and the results show that our approach can efficiently recommend relevant software projects that fit developers' interests with relatively high precision.
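The abstract does not spell out how project features and developer behaviors are combined, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that each project is described by a bag of terms, that behaviors are (project, action) pairs, and that a developer profile is matched to candidate projects by cosine similarity. The names `BEHAVIOR_WEIGHTS`, `developer_profile`, and `recommend_top_n`, the action labels, and the weights are all hypothetical. It runs in a single process in plain Python, whereas the paper reports an implementation on Apache Spark.

```python
import heapq
import math
from collections import Counter

# Hypothetical weights: the abstract does not say which behaviors are
# used or how strongly each one counts.
BEHAVIOR_WEIGHTS = {"star": 1.0, "fork": 2.0}


def cosine(a, b):
    """Cosine similarity between two term-weight mappings."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def developer_profile(own_projects, behaviors, features):
    """Merge the features of created projects with behavior-weighted
    features of the projects the developer interacted with."""
    profile = Counter()
    for project in own_projects:
        profile.update(features[project])
    for project, action in behaviors:
        weight = BEHAVIOR_WEIGHTS.get(action, 0.0)
        for term, count in features[project].items():
            profile[term] += weight * count
    return profile


def recommend_top_n(own_projects, behaviors, features, n):
    """Return the names of the n unseen projects closest to the profile."""
    seen = set(own_projects) | {project for project, _ in behaviors}
    profile = developer_profile(own_projects, behaviors, features)
    scored = (
        (cosine(profile, terms), name)
        for name, terms in features.items()
        if name not in seen
    )
    return [name for _, name in heapq.nlargest(n, scored)]
```

Calling `recommend_top_n(own_projects, behaviors, features, n)` with a dictionary that maps project names to term counts returns up to `n` project names, excluding projects the developer created or already interacted with. Scoring each candidate is independent of the others, which is the kind of per-item work that a parallel framework can distribute.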