2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)最新文献

筛选
英文 中文
PathMiner: A Library for Mining of Path-Based Representations of Code PathMiner:用于挖掘基于路径的代码表示的库
V. Kovalenko, Egor Bogomolov, T. Bryksin, Alberto Bacchelli
{"title":"PathMiner: A Library for Mining of Path-Based Representations of Code","authors":"V. Kovalenko, Egor Bogomolov, T. Bryksin, Alberto Bacchelli","doi":"10.1109/MSR.2019.00013","DOIUrl":"https://doi.org/10.1109/MSR.2019.00013","url":null,"abstract":"One recent, significant advance in modeling source code for machine learning algorithms has been the introduction of path-based representation – an approach consisting in representing a snippet of code as a collection of paths from its syntax tree. Such representation efficiently captures the structure of code, which, in turn, carries its semantics and other information. Building the path-based representation involves parsing the code and extracting the paths from its syntax tree; these steps build up to a substantial technical job. With no common reusable toolkit existing for this task, the burden of mining diverts the focus of researchers from the essential work and hinders newcomers in the field of machine learning on code. In this paper, we present PathMiner – an open-source library for mining path-based representations of code. PathMiner is fast, flexible, well-tested, and easily extensible to support input code in any common programming language. Preprint [https://doi.org/10.5281/zenodo.2595271]; released tool [https://doi.org/10.5281/zenodo.2595257].","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"31 1","pages":"13-17"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90108187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 39
An Empirical Study of Multiple Names and Email Addresses in OSS Version Control Repositories OSS版本控制库中多个名称和电子邮件地址的实证研究
Jiaxin Zhu, Jun Wei
{"title":"An Empirical Study of Multiple Names and Email Addresses in OSS Version Control Repositories","authors":"Jiaxin Zhu, Jun Wei","doi":"10.1109/MSR.2019.00068","DOIUrl":"https://doi.org/10.1109/MSR.2019.00068","url":null,"abstract":"Data produced by version control systems are widely used in software research and development. Version control data users always use the name or email address field to identify the committer or author of a modification. However, developers may use multiple names and email addresses, which brings difficulties for identification of distinct developers. In this paper, we sample 450 Git repositories from GitHub to study the multiple names and email addresses of developers. We conduct a conservative estimation of its prevalence and impact on related measurements. We merge the multiple names and email addresses of a developer through a method of high precision. With the merged identities, we obtain a number of interesting findings, e.g., about 6% of the developers used multiple names or email addresses in more than 60% of the repositories, and they contributed about half of all the commits. Our impact analysis shows that the multiple names and email addresses issue cannot be ignored for the basic related measurements, e.g., the number of developers in a repository. Our results could help researchers and practitioners have a more clear understanding of multiple names and email addresses in practice to improve the accuracy of related measurements.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"1 1","pages":"409-420"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89485463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
Challenges with Responding to Static Analysis Tool Alerts 响应静态分析工具警报的挑战
Nasif Imtiaz, A. Rahman, Effat Farhana, L. Williams
{"title":"Challenges with Responding to Static Analysis Tool Alerts","authors":"Nasif Imtiaz, A. Rahman, Effat Farhana, L. Williams","doi":"10.1109/MSR.2019.00049","DOIUrl":"https://doi.org/10.1109/MSR.2019.00049","url":null,"abstract":"Static analysis tool alerts can help developers detect potential defects in the code early in the development cycle. However, developers are not always able to respond to the alerts with their preferred action and may turn away from using the tool. In this paper, we qualitatively analyze 280 Stack Overflow (SO) questions regarding static analysis tool alerts to identify the challenges developers face in understanding and responding to these alerts. We find that the most prevalent question on SO is how to ignore and filter alerts, followed by validation of false positives. Our findings confirm prior researchers' findings related to notification communication theory as 44.6% of the SO questions that we analyzed indicate developers face communication challenges.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"55 1","pages":"245-249"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83747370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 32
Snoring: A Noise in Defect Prediction Datasets 打鼾:缺陷预测数据集中的噪声
A. Ahluwalia, D. Falessi, M. D. Penta
{"title":"Snoring: A Noise in Defect Prediction Datasets","authors":"A. Ahluwalia, D. Falessi, M. D. Penta","doi":"10.1109/MSR.2019.00019","DOIUrl":"https://doi.org/10.1109/MSR.2019.00019","url":null,"abstract":"In order to develop and train defect prediction models, researchers rely on datasets in which a defect is often attributed to a release where the defect itself is discovered. However, in many circumstances, it can happen that a defect is only discovered several releases after its introduction. This might introduce a bias in the dataset, i.e., treating the intermediate releases as defect-free and the latter as defect-prone. We call this phenomenon as \"sleeping defects\". We call \"snoring\" the phenomenon where classes are affected by sleeping defects only, that would be treated as defect-free until the defect is discovered. In this paper we analyze, on data from 282 releases of six open source projects from the Apache ecosystem, the magnitude of the sleeping defects and of the snoring classes. Our results indicate that 1) on all projects, most of the defects in a project slept for more than 20% of the existing releases, and 2) in the majority of the projects the missing rate is more than 25% even if we remove 50% of releases.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"1 1","pages":"63-67"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82795391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
[Title page i] [标题页i]
{"title":"[Title page i]","authors":"","doi":"10.1109/msr.2019.00001","DOIUrl":"https://doi.org/10.1109/msr.2019.00001","url":null,"abstract":"","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88240683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
ConPan: A Tool to Analyze Packages in Software Containers 分析软件容器中的包的工具
Ahmed Zerouali, Valerio Cosentino, G. Robles, Jesus M. Gonzalez-Barahona, T. Mens
{"title":"ConPan: A Tool to Analyze Packages in Software Containers","authors":"Ahmed Zerouali, Valerio Cosentino, G. Robles, Jesus M. Gonzalez-Barahona, T. Mens","doi":"10.1109/MSR.2019.00089","DOIUrl":"https://doi.org/10.1109/MSR.2019.00089","url":null,"abstract":"Deploying software packages and services into containers is a popular software engineering practice that increases portability and reusability. Docker, the most popular containerization technology, helps DevOps practitioners in their daily activities. Despite being successfully and increasingly employed, containers may include buggy and vulnerable packages that put at risk the environments in which the containers have been deployed. Existing quality and security monitoring tools provide only limited support to analyze Docker containers, thus forcing practitioners to perform additional manual work or develop adhoc scripts when the analysis goes beyond security purposes. This limitation also affects researchers desiring to empirically study the evolution dynamics of Docker containers and their contained packages. To overcome this limitation, we present ConPan, an automated tool to inspect the characteristics of packages in Docker containers, such as their outdatedness and other possible flaws (e.g., bugs and security vulnerabilities). ConPan comes with a CLI and API, and the analysis results can be presented to the user in a variety of formats.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"8 1","pages":"592-596"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81351279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Automated Software Vulnerability Assessment with Concept Drift 基于概念漂移的自动化软件漏洞评估
T. H. Le, Bushra Sabir, M. Babar
{"title":"Automated Software Vulnerability Assessment with Concept Drift","authors":"T. H. Le, Bushra Sabir, M. Babar","doi":"10.1109/MSR.2019.00063","DOIUrl":"https://doi.org/10.1109/MSR.2019.00063","url":null,"abstract":"Software Engineering researchers are increasingly using Natural Language Processing (NLP) techniques to automate Software Vulnerabilities (SVs) assessment using the descriptions in public repositories. However, the existing NLP-based approaches suffer from concept drift. This problem is caused by a lack of proper treatment of new (out-of-vocabulary) terms for the evaluation of unseen SVs over time. To perform automated SVs assessment with concept drift using SVs' descriptions, we propose a systematic approach that combines both character and word features. The proposed approach is used to predict seven Vulnerability Characteristics (VCs). The optimal model of each VC is selected using our customized time-based cross-validation method from a list of eight NLP representations and six well-known Machine Learning models. We have used the proposed approach to conduct large-scale experiments on more than 100,000 SVs in the National Vulnerability Database (NVD). The results show that our approach can effectively tackle the concept drift issue of the SVs' descriptions reported from 2000 to 2018 in NVD even without retraining the model. In addition, our approach performs competitively compared to the existing word-only method. We also investigate how to build compact concept-drift-aware models with much fewer features and give some recommendations on the choice of classifiers and NLP representations for SVs assessment.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"51 1","pages":"371-382"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90959753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 28
Organizing Committee for MSR 2019 MSR 2019组委会
{"title":"Organizing Committee for MSR 2019","authors":"","doi":"10.1109/msr.2019.00007","DOIUrl":"https://doi.org/10.1109/msr.2019.00007","url":null,"abstract":"","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86924283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Benchmark of Data Loss Bugs for Android Apps Android应用程序数据丢失bug的基准测试
O. Riganelli, M. Mobilio, D. Micucci, L. Mariani
{"title":"A Benchmark of Data Loss Bugs for Android Apps","authors":"O. Riganelli, M. Mobilio, D. Micucci, L. Mariani","doi":"10.1109/MSR.2019.00087","DOIUrl":"https://doi.org/10.1109/MSR.2019.00087","url":null,"abstract":"Android apps must be able to deal with both stop events, which require immediately stopping the execution of the app without losing state information, and start events, which require resuming the execution of the app at the same point it was stopped. Support to these kinds of events must be explicitly implemented by developers who unfortunately often fail to implement the proper logic for saving and restoring the state of an app. As a consequence apps can lose data when moved to background and then back to foreground (e.g., to answer a call) or when the screen is simply rotated. These faults can be the cause of annoying usability issues and unexpected crashes. This paper presents a public benchmark of 110 data loss faults in Android apps that we systematically collected to facilitate research and experimentation with these problems. The benchmark is available on GitLab and includes the faulty apps, the fixed apps (when available), the test cases to automatically reproduce the problems, and additional information that may help researchers in their tasks.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"73 1","pages":"582-586"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85844707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
We Need to Talk About Microservices: an Analysis from the Discussions on StackOverflow 我们需要谈论微服务:从StackOverflow的讨论分析
Alan Bandeira, Carlos Alberto Medeiros, M. Paixão, P. Maia
{"title":"We Need to Talk About Microservices: an Analysis from the Discussions on StackOverflow","authors":"Alan Bandeira, Carlos Alberto Medeiros, M. Paixão, P. Maia","doi":"10.1109/MSR.2019.00051","DOIUrl":"https://doi.org/10.1109/MSR.2019.00051","url":null,"abstract":"Microservices are a new and rapidly growing architectural model aimed at developing highly scalable software solutions based on independently deployable and evolvable components. Due to its novelty, microservice-related discussions are increasing in Q&A websites, such as StackOverflow (SO). In order to understand what is being discussed by the microservice community, this work has applied mining techniques and topic modelling to a manually-curated dataset of 1,043 microservice-related posts from StackOverflow. As a result, we found that 13.68% of microservice technical posts on SO discuss a single technology: Netflix Eureka. Moreover, buzzwords in the microservice ecosystem, e.g., blue/green deployment, were not identified as relevant subjects of discussion on SO. Finally, we show how a high discussion rate on SO may not reflect the popularity of a certain subject within the microservice community.","PeriodicalId":6706,"journal":{"name":"2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)","volume":"64 1","pages":"255-259"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86203535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 29
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信