2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)最新文献

筛选
英文 中文
Predicting Co-Changes between Functionality Specifications and Source Code in Behavior Driven Development 在行为驱动开发中预测功能规范和源代码之间的共同变更
Aidan Z. H. Yang, D. A. D. Costa, Ying Zou
{"title":"Predicting Co-Changes between Functionality Specifications and Source Code in Behavior Driven Development","authors":"Aidan Z. H. Yang, D. A. D. Costa, Ying Zou","doi":"10.1109/MSR.2019.00080","DOIUrl":"https://doi.org/10.1109/MSR.2019.00080","url":null,"abstract":"Behavior Driven Development (BDD) is an agile approach that uses. feature files to describe the functionalities of a software system using natural language constructs (English-like phrases). Because of the English-like structure of. feature files, BDD specifications become an evolving documentation that helps all (even non-technical) stakeholders to understand and contribute to a software project. After specifying a. feature files, developers can use a BDD tool (e.g., Cucumber) to automatically generate test cases and implement the code of the specified functionality. However, maintaining traceability between. feature files and source code requires human efforts. Therefore,. feature files can be out-of-date, reducing the advantages of using BDD. Furthermore, existing research do not attempt to improve the traceability between. feature files and source code files. In this paper, we study the co-changes between. feature files and source code files to improve the traceability between. feature files and source code files. Due to the English-like syntax of. feature files, we use natural language processing to identify co-changes, with an accuracy of 79%. We study the characteristics of BDD co-changes and build random forest models to predict when a. feature files should be modified before committing a code change. The random forest model obtains an AUC of 0.77. The model can assist developers in identifying when a. feature files should be modified in code commits. Once the traceability is up-to-date, BDD developers can write test code more efficiently and keep the software documentation up-to-date.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"15 1","pages":"534-544"},"PeriodicalIF":0.0,"publicationDate":"2019-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80100303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 12
Cleaning StackOverflow for Machine Translation 清理机器翻译的StackOverflow
Musfiqur Rahman, Peter C. Rigby, Dharani Palani, T. Nguyen
{"title":"Cleaning StackOverflow for Machine Translation","authors":"Musfiqur Rahman, Peter C. Rigby, Dharani Palani, T. Nguyen","doi":"10.1109/MSR.2019.00021","DOIUrl":"https://doi.org/10.1109/MSR.2019.00021","url":null,"abstract":"Generating source code API sequences from an English query using Machine Translation (MT) has gained much interest in recent years. For any kind of MT, the model needs to be trained on a parallel corpus. In this paper we clean StackOverflow, one of the most popular online discussion forums for programmers, to generate a parallel English-Code corpus from Android posts. We contrast three data cleaning approaches: standard NLP, title only, and software task extraction. We evaluate the quality of the each corpus for MT. To provide indicators of how useful each corpus will be for machine translation, we provide researchers with measurements of the corpus size, percentage of unique tokens, and per-word maximum likelihood alignment entropy. We have used these corpus cleaning approaches to translate between English and Code [22, 23], to compare existing SMT approaches from word mapping to neural networks [24], and to re-examine the \"natural software\" hypothesis [29]. After cleaning and aligning the data, we create a simple maximum likelihood MT model to show that English words in the corpus map to a small number of specific code elements. This model provides a basis for the success of using StackOverflow for search and other tasks in the software engineering literature and paves the way for MT. Our scripts and corpora are publicly available on GitHub [1] as well as at https://search.datacite.org/works/10.5281/zenodo.2558551.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"18 1","pages":"79-83"},"PeriodicalIF":0.0,"publicationDate":"2019-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87249080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
A Dataset of Non-Functional Bugs 非功能性bug的数据集
Aida Radu, Sarah Nadi
{"title":"A Dataset of Non-Functional Bugs","authors":"Aida Radu, Sarah Nadi","doi":"10.1109/MSR.2019.00066","DOIUrl":"https://doi.org/10.1109/MSR.2019.00066","url":null,"abstract":"While several researchers have published bug data sets in the past, there has been less focus on bugs related to non-functional requirements. Non-functional requirements describe the quality attributes of a program. In this work, we introduce NFBugs, a data set of 133 non-functional bug fixes collected from 65 open-source projects written in Java and Python. NFBugs can be used to support code recommender systems focusing on non-functional properties.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"4 1","pages":"399-403"},"PeriodicalIF":0.0,"publicationDate":"2019-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85591473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
Striking Gold in Software Repositories? An Econometric Study of Cryptocurrencies on GitHub 在软件存储库中找到金子?GitHub上加密货币的计量经济学研究
Asher Trockman, R. V. Tonder, Bogdan Vasilescu
{"title":"Striking Gold in Software Repositories? An Econometric Study of Cryptocurrencies on GitHub","authors":"Asher Trockman, R. V. Tonder, Bogdan Vasilescu","doi":"10.1109/MSR.2019.00036","DOIUrl":"https://doi.org/10.1109/MSR.2019.00036","url":null,"abstract":"Cryptocurrencies have a significant open source development presence on GitHub. This presents a unique opportunity to observe their related developer effort and software growth. Individual cryptocurrency prices are partly driven by attractiveness, and we hypothesize that high-quality, actively-developed software is one of its influences. Thus, we report on a study of a panel data set containing nearly a year of daily observations of development activity, popularity, and market capitalization for over two hundred open source cryptocurrencies. We find that open source project popularity is associated with higher market capitalization, though development activity and quality assurance practices are insignificant variables in our models. Using Granger causality tests, we find no compelling evidence for a dynamic relation between market capitalization and metrics such as daily stars, forks, watchers, commits, contributors, and lines of code changed.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"69 1","pages":"181-185"},"PeriodicalIF":0.0,"publicationDate":"2019-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87142071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 12
Beyond GumTree: A Hybrid Approach to Generate Edit Scripts 超越GumTree:生成编辑脚本的混合方法
Junnosuke Matsumoto, Yoshiki Higo, S. Kusumoto
{"title":"Beyond GumTree: A Hybrid Approach to Generate Edit Scripts","authors":"Junnosuke Matsumoto, Yoshiki Higo, S. Kusumoto","doi":"10.1109/MSR.2019.00082","DOIUrl":"https://doi.org/10.1109/MSR.2019.00082","url":null,"abstract":"On development using a version control system, understanding differences of source code is important. Edit scripts (in short, ES) represent differences between two versions of source code. One of the tools generating ESs is GumTree. GumTree takes two versions of source code as input and generates an ES consisting of insert, delete, update and move nodes of abstract syntax tree (in short, AST). However, the accuracy of move and update actions generated by GumTree is insufficient, which makes ESs more difficult to understand. A reason why the accuracy is insufficient is that GumTree generates ESs from only information of AST. Thus, in this research, we propose to generate easier-to-understand ESs by using not only structures of AST but also information of line differences. To evaluate our methodology, we applied it to some open source software, and we confirmed that ESs generated by our methodology are more helpful to understand the differences of source code than GumTree.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"73 1","pages":"550-554"},"PeriodicalIF":0.0,"publicationDate":"2019-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84022045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Keynote Abstract 主题抽象
{"title":"Keynote Abstract","authors":"","doi":"10.1109/msr.2019.00011","DOIUrl":"https://doi.org/10.1109/msr.2019.00011","url":null,"abstract":"","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"85 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84342801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Title Page iii 第三页标题
{"title":"Title Page iii","authors":"","doi":"10.1109/msr.2019.00002","DOIUrl":"https://doi.org/10.1109/msr.2019.00002","url":null,"abstract":"","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89546860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Investigating Next Steps in Static API-Misuse Detection 研究静态api误用检测的后续步骤
Sven Amann, H. Nguyen, Sarah Nadi, T. Nguyen, M. Mezini
{"title":"Investigating Next Steps in Static API-Misuse Detection","authors":"Sven Amann, H. Nguyen, Sarah Nadi, T. Nguyen, M. Mezini","doi":"10.1109/MSR.2019.00053","DOIUrl":"https://doi.org/10.1109/MSR.2019.00053","url":null,"abstract":"Application Programming Interfaces (APIs) often impose constraints such as call order or preconditions. API misuses, i.e., usages violating these constraints, may cause software crashes, data-loss, and vulnerabilities. Researchers developed several approaches to detect API misuses, typically still resulting in low recall and precision. In this work, we investigate ways to improve API-misuse detection. We design MUDetect, an API-misuse detector that builds on the strengths of existing detectors and tries to mitigate their weaknesses. MUDetect uses a new graph representation of API usages that captures different types of API misuses and a systematically designed ranking strategy that effectively improves precision. Evaluation shows that MUDetect identifies real-world API misuses with twice the recall of previous detectors and 2.5x higher precision. It even achieves almost 4x higher precision and recall, when mining patterns across projects, rather than from only the target project.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"55 1","pages":"265-275"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86570877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 44
GreenHub Farmer: Real-World Data for Android Energy Mining GreenHub Farmer: Android能源挖掘的真实世界数据
Hugo Matalonga, Bruno Cabral, F. C. Filho, Marco Couto, Rui Pereira, S. Sousa, J. Fernandes
{"title":"GreenHub Farmer: Real-World Data for Android Energy Mining","authors":"Hugo Matalonga, Bruno Cabral, F. C. Filho, Marco Couto, Rui Pereira, S. Sousa, J. Fernandes","doi":"10.1109/MSR.2019.00034","DOIUrl":"https://doi.org/10.1109/MSR.2019.00034","url":null,"abstract":"As mobile devices are supporting more and more of our daily activities, it is vital to widen their battery up-time as much as possible. In fact, according to the Wall Street Journal, 9/10 users suffer from low battery anxiety. The goal of our work is to understand how Android usage, apps, operating systems, hardware and user habits influence battery lifespan. Our strategy is to collect anonymous raw data from devices all over the world, through a mobile app, build and analyze a large-scale dataset containing real-world, day-to-day data, representative of user practices. So far, the dataset we collected includes 12 million+ (anonymous) data samples, across 900+ device brands and 5.000+ models. And, it keeps growing. The data we collect, which is publicly available and by different channels, is sufficiently heterogeneous for supporting studies with a wide range of focuses and research goals, thus opening the opportunity to inform and reshape user habits, and even influence the development of both hardware and software for mobile devices.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"23 1","pages":"171-175"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77837581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 12
A Large-Scale Study About Quality and Reproducibility of Jupyter Notebooks 木星笔记的质量和再现性的大规模研究
J. F. Pimentel, Leonardo Gresta Paulino Murta, V. Braganholo, J. Freire
{"title":"A Large-Scale Study About Quality and Reproducibility of Jupyter Notebooks","authors":"J. F. Pimentel, Leonardo Gresta Paulino Murta, V. Braganholo, J. Freire","doi":"10.1109/MSR.2019.00077","DOIUrl":"https://doi.org/10.1109/MSR.2019.00077","url":null,"abstract":"Jupyter Notebooks have been widely adopted by many different communities, both in science and industry. They support the creation of literate programming documents that combine code, text, and execution results with visualizations and all sorts of rich media. The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourage poor coding practices, and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we studied 1.4 million notebooks from GitHub. We present a detailed analysis of their characteristics that impact reproducibility. We also propose a set of best practices that can improve the rate of reproducibility and discuss open challenges that require further research and development.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"81 1","pages":"507-517"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78658011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 148
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信