2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR): Latest Papers, Page 8

Import2vec: Learning Embeddings for Software Libraries

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-03-27 DOI: 10.1109/MSR.2019.00014

B. Theeten, Frederik Vandeputte, T. V. Cutsem

Citations: 30

git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-03-25 DOI: 10.1109/MSR.2019.00070

Christoph Gote, Ingo Scholtes, F. Schweitzer

Abstract: Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks capturing e.g. collaboration, coordination, or communication, from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts defined at the level of files, modules, or packages. While this approach has led to insights into the social aspects of software development, it neglects detailed information on code changes and code ownership, e.g. which exact lines of code have been authored by which developers, that is contained in the commit log of software projects. Addressing this issue, we introduce git2net, a scalable python software that facilitates the extraction of fine-grained co-editing networks in large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. This information allows us to construct directed, weighted, and time-stamped networks, where a link signifies that one developer has edited a block of source code originally written by another developer. Our tool is applied in case studies of an Open Source and a commercial software project. We argue that it opens up a massive new source of high-resolution data on human collaboration patterns.

Pages: 433-444
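The kind of directed, weighted, time-stamped co-editing network described in the git2net abstract can be illustrated with a minimal Python sketch. This is not git2net's API or its algorithm: the edit records, the function name `build_coediting_network`, and the decision to skip self-edits are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical edit records: (editor, original_author, ISO timestamp).
# In practice these would be derived from line-level diffs of a repository.
edits = [
    ("alice", "bob", "2019-01-05T10:00:00"),
    ("alice", "bob", "2019-01-07T09:30:00"),
    ("carol", "alice", "2019-01-08T14:15:00"),
    ("bob", "bob", "2019-01-09T08:00:00"),  # self-edit, skipped below (assumption)
]


def build_coediting_network(records):
    """Map each directed link (editor, original_author) to its sorted timestamps."""
    links = defaultdict(list)
    for editor, author, stamp in records:
        if editor == author:
            continue  # only edits to someone else's code create a link
        links[(editor, author)].append(datetime.fromisoformat(stamp))
    return {edge: sorted(stamps) for edge, stamps in links.items()}


network = build_coediting_network(edits)

# Link weight here is simply the number of recorded edits along that link.
weights = {edge: len(stamps) for edge, stamps in network.items()}
```

Keeping the timestamps per link, rather than only a count, is what lets the same structure be sliced into time windows later.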

Citations: 19

Identifying Experts in Software Libraries and Frameworks Among GitHub Users

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-03-19 DOI: 10.1109/MSR.2019.00054

João Eduardo Montandon, L. L. Silva, M. T. Valente

Abstract: Software development increasingly depends on libraries and frameworks to increase productivity and reduce time-to-market. Despite this fact, we still lack techniques to assess developers' expertise in widely popular libraries and frameworks. In this paper, we evaluate the performance of unsupervised (based on clustering) and supervised machine learning classifiers (Random Forest and SVM) to identify experts in three popular JavaScript libraries: facebook/react, mongodb/node-mongodb, and socketio/socket.io. First, we collect 13 features about developers' activity on GitHub projects, including commits on source code files that depend on these libraries. We also build a ground truth including the expertise of 575 developers on the studied libraries, as self-reported by them in a survey. Based on our findings, we document the challenges of using machine learning classifiers to predict expertise in software libraries, using features extracted from GitHub. Then, we propose a method to identify library experts based on clustering feature data from GitHub; by triangulating the results of this method with information available on LinkedIn profiles, we show that it is able to recommend dozens of GitHub users with evidence of being experts in the studied JavaScript libraries. We also provide a public dataset with the expertise of 575 developers on the studied libraries.

Pages: 276-287

Citations: 38

Automatically Generating Documentation for Lambda Expressions in Java

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-03-15 DOI: 10.1109/MSR.2019.00057

Anwar Alqaimi, Patanamon Thongtanunam, Christoph Treude

Abstract: When lambda expressions were introduced to the Java programming language as part of the release of Java 8 in 2014, they were the language's first step into functional programming. Since lambda expressions are still relatively new, not all developers use or understand them. In this paper, we first present the results of an empirical study to determine how frequently developers of GitHub repositories make use of lambda expressions and how they are documented. We find that 11% of Java GitHub repositories use lambda expressions, and that only 6% of the lambda expressions are accompanied by source code comments. We then present a tool called LambdaDoc which can automatically detect lambda expressions in a Java repository and generate natural language documentation for them. Our evaluation of LambdaDoc with 23 professional developers shows that they perceive the generated documentation to be complete, concise, and expressive, while the majority of the documentation produced by our participants without tool support was inadequate. Our contribution builds an important step towards automatically generating documentation for functional programming constructs in an object-oriented language.

Pages: 310-320

Citations: 13

The Emergence of Software Diversity in Maven Central

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-03-13 DOI: 10.1109/MSR.2019.00059

César Soto-Valero, Amine Benelallam, Nicolas Harrand, Olivier Barais, B. Baudry

Abstract: Maven artifacts are immutable: an artifact that is uploaded on Maven Central cannot be removed nor modified. The only way for developers to upgrade their library is to release a new version. Consequently, Maven Central accumulates all the versions of all the libraries that are published there, and applications that declare a dependency towards a library can pick any version. In this work, we hypothesize that the immutability of Maven artifacts and the ability to choose any version naturally support the emergence of software diversity within Maven Central. We analyze 1,487,956 artifacts that represent all the versions of 73,653 libraries. We observe that more than 30% of libraries have multiple versions that are actively used by latest artifacts. In the case of popular libraries, more than 50% of their versions are used. We also observe that more than 17% of libraries have several versions that are significantly more used than the other versions. Our results indicate that the immutability of artifacts in Maven Central does support a sustained level of diversity among versions of libraries in the repository.

Pages: 333-343
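To make the measure in the Maven Central abstract concrete, the following Python sketch computes the share of libraries for which more than one version is in use. It is an illustration only, not the authors' analysis pipeline: the dependency records, the library coordinates, and the function name `share_with_multiple_used_versions` are hypothetical, and "actively used" is simplified to "appears at least once in the records".

```python
from collections import defaultdict

# Hypothetical dependency declarations from recent artifacts: (library, version used).
usages = [
    ("org.example:json", "1.0"),
    ("org.example:json", "2.1"),
    ("org.example:http", "3.0"),
    ("org.example:http", "3.0"),
    ("org.example:log", "1.2"),
    ("org.example:log", "1.3"),
    ("org.example:cli", "0.9"),
]


def share_with_multiple_used_versions(records):
    """Return the fraction of libraries that have more than one distinct used version."""
    used_versions = defaultdict(set)
    for library, version in records:
        used_versions[library].add(version)
    multi = sum(1 for versions in used_versions.values() if len(versions) > 1)
    return multi / len(used_versions)


share = share_with_multiple_used_versions(usages)
```

With the sample records, two of the four libraries have two versions in use, so `share` evaluates to 0.5. The figures reported in the abstract come from the full dataset, not from toy data like this.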

Citations: 26

A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-02-07 DOI: 10.1109/MSR.2019.00064

Serena Elisa Ponta, H. Plate, A. Sabetta, M. Bezzi, Cédric Dangremont

Abstract: Advancing our understanding of software vulnerabilities, automating their identification, the analysis of their impact, and ultimately their mitigation is necessary to enable the development of software that is more secure. While operating a vulnerability assessment tool, which we developed, and that is currently used by hundreds of development units at SAP, we manually collected and curated a dataset of vulnerabilities of open-source software, and the commits fixing them. The data were obtained both from the National Vulnerability Database (NVD), and from project-specific web resources, which we monitor on a continuous basis. From that data, we extracted a dataset that maps 624 publicly disclosed vulnerabilities affecting 205 distinct open-source Java projects, used in SAP products or internal tools, onto the 1282 commits that fix them. Out of 624 vulnerabilities, 29 do not have a CVE (Common Vulnerability and Exposure) identifier at all, and 46, which do have such identifier assigned by a numbering authority, are not available in the NVD yet. The dataset is released under an open-source license, together with supporting scripts that allow researchers to automatically retrieve the actual content of the commits from the corresponding repositories, and to augment the attributes available for each instance. Moreover, these scripts make it possible to complement the dataset with additional instances that are not security fixes (which is useful, for example, in machine learning applications). Our dataset has been successfully used to train classifiers that could automatically identify security-relevant commits in code repositories. The release of this dataset and the supporting code as open-source will allow future research to be based on data of industrial relevance; it also represents a concrete step towards making the maintenance of this dataset a shared effort involving open-source communities, academia, and the industry.

Pages: 383-387

Citations: 82

The Maven Dependency Graph: A Temporal Graph-Based Representation of Maven Central

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-01-16 DOI: 10.1109/MSR.2019.00060

Amine Benelallam, Nicolas Harrand, César Soto-Valero, B. Baudry, Olivier Barais

Citations: 30

Recommending Energy-Efficient Java Collections

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-01-01 DOI: 10.1109/MSR.2019.00033

Wellington de Oliveira Júnior, R. Oliveira dos Santos, Fernando José Castor de Lima Filho, Benito Fernandes de Araújo Neto, Gustavo Henrique Lima Pinto

Abstract: Over the last years, increasing attention has been given to creating energy-efficient software systems. However, developers still lack the knowledge and the tools to support them in that task. In this work, we explore our vision that energy consumption non-specialists can build software that consumes less energy by alternating, at development time, between third-party, readily available, diversely-designed pieces of software, without increasing the development complexity. To support our vision, we propose an approach for energy-aware development that combines the construction of application-independent energy profiles of Java collections and static analysis to produce an estimate of in which ways and how intensively a system employs these collections. By combining these two pieces of information, it is possible to produce energy-saving recommendations for alternative collection implementations to be used in different parts of the system. We implement this approach in a tool named CT+ that works with both desktop and mobile Java systems, and is capable of analyzing 40 different collection implementations of lists, maps, and sets. We applied CT+ to twelve software systems: two mobile-based, seven desktop-based, and three that can run in both environments. Our evaluation infrastructure involved a high-end server, a notebook, and three mobile devices. When applying the (mostly trivial) recommendations, we achieved up to 17.34% reduction in energy consumption just by replacing collection implementations. Even for a real world, mature, highly-optimized system such as Xalan, CT+ could achieve a 5.81% reduction in energy consumption. Our results indicate that some widely used collections, e.g., ArrayList, HashMap, and HashTable, are not energy-efficient and sometimes should be avoided when energy consumption is a major concern.

Pages: 160-170
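The idea of combining per-operation energy profiles with statically estimated usage counts can be sketched in a few lines of Python. This is not the CT+ implementation: the cost values, the operation counts, and the function names `estimate_cost` and `recommend` are hypothetical, and the numbers are made up and say nothing about the real energy behavior of any Java collection.

```python
# Hypothetical energy cost per operation, in arbitrary integer units.
profiles = {
    "ArrayList": {"add": 10, "get": 2, "remove": 30},
    "LinkedList": {"add": 8, "get": 25, "remove": 9},
}

# Hypothetical operation counts for one usage site, as a static analysis might estimate.
usage = {"add": 100, "get": 10, "remove": 80}


def estimate_cost(profile, counts):
    """Weight each operation's unit cost by how often the site performs it."""
    return sum(profile[op] * n for op, n in counts.items())


def recommend(candidates, counts):
    """Return the cheapest implementation and the estimated cost of each candidate."""
    costs = {name: estimate_cost(profile, counts) for name, profile in candidates.items()}
    return min(costs, key=costs.get), costs


best, costs = recommend(profiles, usage)
```

For this made-up remove-heavy workload the sketch picks `LinkedList`; a different mix of operations could flip the recommendation, which is why the estimate is computed per usage site rather than once per program.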

Citations: 0

Splitting APIs: An Exploratory Study of Software Unbundling

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-01-01 DOI: 10.1109/MSR.2019.00062

Anderson Severo de Matos, João Bosco Ferreira Filho, Lincoln Souza Rocha

Citations: 0

Program Committee

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2018-12-01 DOI: 10.1109/eitt.2018.00007

Rui Abreu

Abstract: Rui Abreu, University of Lisbon, Portugal; Jun Ai, Beihang University, China; Domenico Amalfitano, University of Naples Federico II, Italy; Doo-Hwan Bae, Korea Advanced Institute of Science and Technology, Korea; Xiaoying Bai, Tsinghua University, China; Lingfeng Bao, Zhejiang University City College, China; David Benavides, University of Seville, Spain; Antonia Bertolino, Italian National Research Council, Italy; Mario Bravetti, Università di Bologna, Italy; Christof Budnik, Siemens, Germany; Yan Cai, Chinese Academy of Sciences, China; Emilia Cambronero, Universidad Castilla-La Mancha, Spain; Ana Cavalli, IT SudParis, France; Arun Chakrapani Rao, University of Warwick, UK; W.K. Chan, City University of Hong Kong, Hong Kong; Junjie Chen, Peking University, China; Yue Chen, Palo Alto Networks, USA; William Chu, Tunghai University, Taiwan; Sunita Chulani, Cisco, USA; Frederic Dadeau, University of Franche-Comté, France; Yuanshun Dai, University of Electronic Science and Technology of China, China; Junhua Ding, East Carolina University, USA; Tadashi Dohi, Hiroshima University, Japan; Wei Dong, National University of Defense Technology, China; Yunwei Dong, Northwestern Polytechnical University, China; Benedikt Eberhardinger, MHP — A Porsche Company, Germany; Khaled El-Fakih, American University of Sharjah, UAE; Sadik Esmelioglu, Middle East Technical University, Turkey; Hugues Evrard, Imperial College London, UK; Joao Pascoal Faria, University of Porto, Portugal; Thoshitha Gamage, Southern Illinois University Edwardsville, USA; Sudipto Ghosh, Colorado State University, USA; Arnaud Gotlieb, Simula Research Laboratory, Norway; Matthias Güdemann, Input Output Hong Kong, Hong Kong; Rajiv Gupta, University of California, Riverside, USA; Chin-Yu Huang, National Tsing-Hua University, Taiwan; Song Huang, Army Engineering University, China; Ali Hurson, Missouri University of Science and Technology, USA; Bo Jiang, Beihang University, China; He Jiang, Dalian University of Technology, China; Yu Jiang, Tsinghua University, China; Xiaoyuan Jing, Wuhan University, China; Roland Jochem, TU Berlin, Germany; Sun Jun, Singapore University of Technology and Design, Singapore; Jacky Keung, City University of Hong Kong, Hong Kong; Pavneet Kochhar, Microsoft, USA; Xuan-Bach Le, Carnegie Mellon University, USA

Citations: 0