2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR): Latest Publications, Page 5

Predicting Co-Changes between Functionality Specifications and Source Code in Behavior Driven Development

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-26 DOI: 10.1109/MSR.2019.00080

Aidan Z. H. Yang, D. A. D. Costa, Ying Zou

Pages: 534-544

Abstract: Behavior Driven Development (BDD) is an agile approach that uses .feature files to describe the functionalities of a software system using natural language constructs (English-like phrases). Because of the English-like structure of .feature files, BDD specifications become an evolving documentation that helps all (even non-technical) stakeholders to understand and contribute to a software project. After specifying a .feature file, developers can use a BDD tool (e.g., Cucumber) to automatically generate test cases and implement the code of the specified functionality. However, maintaining traceability between .feature files and source code requires human effort. Therefore, .feature files can be out-of-date, reducing the advantages of using BDD. Furthermore, existing research does not attempt to improve the traceability between .feature files and source code files. In this paper, we study the co-changes between .feature files and source code files to improve the traceability between them. Due to the English-like syntax of .feature files, we use natural language processing to identify co-changes, with an accuracy of 79%. We study the characteristics of BDD co-changes and build random forest models to predict when a .feature file should be modified before committing a code change. The random forest model obtains an AUC of 0.77. The model can assist developers in identifying when a .feature file should be modified in code commits. Once the traceability is up-to-date, BDD developers can write test code more efficiently and keep the software documentation up-to-date.

Citations: 12

Cleaning StackOverflow for Machine Translation

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-26 DOI: 10.1109/MSR.2019.00021

Musfiqur Rahman, Peter C. Rigby, Dharani Palani, T. Nguyen

Pages: 79-83

Abstract: Generating source code API sequences from an English query using Machine Translation (MT) has gained much interest in recent years. For any kind of MT, the model needs to be trained on a parallel corpus. In this paper we clean StackOverflow, one of the most popular online discussion forums for programmers, to generate a parallel English-Code corpus from Android posts. We contrast three data cleaning approaches: standard NLP, title only, and software task extraction. We evaluate the quality of each corpus for MT. To provide indicators of how useful each corpus will be for machine translation, we provide researchers with measurements of the corpus size, percentage of unique tokens, and per-word maximum likelihood alignment entropy. We have used these corpus cleaning approaches to translate between English and Code [22, 23], to compare existing SMT approaches from word mapping to neural networks [24], and to re-examine the "natural software" hypothesis [29]. After cleaning and aligning the data, we create a simple maximum likelihood MT model to show that English words in the corpus map to a small number of specific code elements. This model provides a basis for the success of using StackOverflow for search and other tasks in the software engineering literature and paves the way for MT. Our scripts and corpora are publicly available on GitHub [1] as well as at https://search.datacite.org/works/10.5281/zenodo.2558551.

Citations: 4

A Dataset of Non-Functional Bugs

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-26 DOI: 10.1109/MSR.2019.00066

Aida Radu, Sarah Nadi

Citations: 13

Striking Gold in Software Repositories? An Econometric Study of Cryptocurrencies on GitHub

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-26 DOI: 10.1109/MSR.2019.00036

Asher Trockman, R. V. Tonder, Bogdan Vasilescu

Citations: 12

Beyond GumTree: A Hybrid Approach to Generate Edit Scripts

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-26 DOI: 10.1109/MSR.2019.00082

Junnosuke Matsumoto, Yoshiki Higo, S. Kusumoto

Citations: 11

Keynote Abstract

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-01 DOI: 10.1109/msr.2019.00011

Citations: 0

Title Page iii

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-01 DOI: 10.1109/msr.2019.00002

Citations: 0

GreenHub Farmer: Real-World Data for Android Energy Mining

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-01 DOI: 10.1109/MSR.2019.00034

Hugo Matalonga, Bruno Cabral, F. C. Filho, Marco Couto, Rui Pereira, S. Sousa, J. Fernandes

Citations: 12

Investigating Next Steps in Static API-Misuse Detection

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-01 DOI: 10.1109/MSR.2019.00053

Sven Amann, H. Nguyen, Sarah Nadi, T. Nguyen, M. Mezini

Citations: 44

PathMiner: A Library for Mining of Path-Based Representations of Code

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-01 DOI: 10.1109/MSR.2019.00013

V. Kovalenko, Egor Bogomolov, T. Bryksin, Alberto Bacchelli

Citations: 39