2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR): Latest Papers

Leveraging Historical Versions of Android Apps for Efficient and Precise Taint Analysis
Haipeng Cai, John Jenkins
{"title":"Leveraging Historical Versions of Android Apps for Efficient and Precise Taint Analysis","authors":"Haipeng Cai, John Jenkins","doi":"10.1145/3196398.3196433","DOIUrl":"https://doi.org/10.1145/3196398.3196433","url":null,"abstract":"Today, computing on various Android devices is pervasive. However, growing security vulnerabilities and attacks in the Android ecosystem constitute various threats through user apps. Taint analysis is a common technique for defending against these threats, yet it suffers from challenges in attaining practical simultaneous scalability and effectiveness. This paper presents a novel approach to fast and precise taint checking, called incremental taint analysis, by exploiting the evolving nature of Android apps. The analysis narrows down the search space of taint checking from an entire app, as conventionally addressed, to the parts of the program that are different from its previous versions. This technique improves the overall efficiency of checking multiple versions of the app as it evolves. We have implemented the techniques as a tool prototype, EvoTaint, and evaluated our analysis by applying it to real-world evolving Android apps. 
Our preliminary results show that the incremental approach largely reduced the cost of taint analysis, by 78.6% on average, yet without sacrificing the analysis effectiveness, relative to a representative precise taint analysis as the baseline.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"58 1","pages":"265-269"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79763283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Studying the Impact of Adopting Continuous Integration on the Delivery Time of Pull Requests
João Helis Bernardo, D. A. D. Costa, U. Kulesza
{"title":"Studying the Impact of Adopting Continuous Integration on the Delivery Time of Pull Requests","authors":"João Helis Bernardo, D. A. D. Costa, U. Kulesza","doi":"10.1145/3196398.3196421","DOIUrl":"https://doi.org/10.1145/3196398.3196421","url":null,"abstract":"Continuous Integration (CI) is a software development practice that leads developers to integrate their work more frequently. Software projects have broadly adopted CI to ship new releases more frequently and to improve code integration. The adoption of CI is motivated by the allure of delivering new functionalities more quickly. However, there is little empirical evidence to support such a claim. Through the analysis of 162,653 pull requests (PRs) of 87 GitHub projects that are implemented in 5 different programming languages, we empirically investigate the impact of adopting CI on the time to deliver merged PRs. Surprisingly, only 51.3% of the projects deliver merged PRs more quickly after adopting CI. We also observe that the large increase of PR submissions after CI is a key reason as to why projects deliver PRs more slowly after adopting CI. To investigate the factors that are related to the time-to-delivery of merged PRs, we train regression models that obtain sound median R-squares of 0.64-0.67. Finally, a deeper analysis of our models indicates that, before the adoption of CI, the integration-load of the development team, i.e., the number of submitted PRs competing for being merged, is the most impactful metric on the time to deliver merged PRs before CI. 
Our models also reveal that PRs that are merged more recently in a release cycle experience a slower delivery time.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"702 1","pages":"131-141"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82816477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
What are Your Programming Language's Energy-Delay Implications?
Stefanos Georgiou, M. Kechagia, P. Louridas, D. Spinellis
{"title":"What are Your Programming Language's Energy-Delay Implications?","authors":"Stefanos Georgiou, M. Kechagia, P. Louridas, D. Spinellis","doi":"10.1145/3196398.3196414","DOIUrl":"https://doi.org/10.1145/3196398.3196414","url":null,"abstract":"Motivation: Even though many studies examine the energy efficiency of hardware and embedded systems, those that investigate the energy consumption of software applications are still limited, and mostly focused on mobile applications. As modern applications become even more complex and heterogeneous a need arises for methods that can accurately assess their energy consumption. Goal: Measure the energy consumption and run-time performance of commonly used programming tasks implemented in different programming languages and executed on a variety of platforms to help developers to choose appropriate implementation platforms. Method: Obtain measurements to calculate the Energy Delay Product, a weighted function that takes into account a task's energy consumption and run-time performance. We perform our tests by calculating the Energy Delay Product of 25 programming tasks, found in the Rosetta Code Repository, which are implemented in 14 programming languages and run on three different computer platforms, a server, a laptop, and an embedded system. Results: Compiled programming languages are outperforming the interpreted ones for most, but not for all tasks. C, C#, and JavaScript are on average the best performing compiled, semi-compiled, and interpreted programming languages for the Energy Delay Product, and Rust appears to be well-placed for i/o-intensive operations, such as file handling. 
We also find that a good behaviour, energy-wise, can be the result of clever optimizations and design choices in seemingly unexpected programming languages.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"32 1","pages":"303-313"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83143580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
How Swift Developers Handle Errors
Nathan Cassee, G. Pinto, F. C. Filho, Alexander Serebrenik
{"title":"How Swift Developers Handle Errors","authors":"Nathan Cassee, G. Pinto, F. C. Filho, Alexander Serebrenik","doi":"10.1145/3196398.3196428","DOIUrl":"https://doi.org/10.1145/3196398.3196428","url":null,"abstract":"Swift is a new programming language developed by Apple as a replacement to Objective-C. It features a sophisticated error handling (EH) mechanism that provides the kind of separation of concerns afforded by exception handling mechanisms in other languages, while also including constructs to improve safety and maintainability. However, Swift also inherits a software development culture stemming from Objective-C being the de-facto standard programming language for Apple platforms for the last 15 years. It is, therefore, a priori unclear whether Swift developers embrace the novel EH mechanisms of the programming language or still rely on the old EH culture of Objective-C even working in Swift. In this paper, we study to what extent developers adhere to good practices exemplified by EH guidelines and tutorials, and what are the common bad EH practices particularly relevant to Swift code. Furthermore, we investigate whether perception of these practices differs between novices and experienced Swift developers. To answer these questions we employ a mixed-methods approach and combine 10 semi-structured interviews with Swift developers and quantitative analysis of 78,760 Swift 4 files extracted from 2,733 open-source GitHub repositories. Our findings indicate that there is ample opportunity to improve the way Swift developers use error handling mechanisms. For instance, some recommendations derived in this work are not well spread in the corpus of studied Swift projects. 
For example, generic catch handlers are common in Swift (even though it is not uncommon for them to share space with their counterparts: non empty catch handlers), custom, developer-defined error types are rare, and developers are mostly reactive when it comes to error handling, using Swift's constructs mostly to handle errors thrown by libraries, instead of throwing and handling application-specific errors.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"70 1","pages":"292-302"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79570043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Large-Scale Analysis of the Co-commit Patterns of the Active Developers in GitHub's Top Repositories
Eldan Cohen, M. Consens
{"title":"Large-Scale Analysis of the Co-commit Patterns of the Active Developers in GitHub's Top Repositories","authors":"Eldan Cohen, M. Consens","doi":"10.1145/3196398.3196436","DOIUrl":"https://doi.org/10.1145/3196398.3196436","url":null,"abstract":"GitHub, the largest code hosting site (with 25 million public active repositories and contributions from 6 million active users), provides an unprecedented opportunity to observe the collaboration patterns of software developers. Understanding the patterns behind the social coding phenomena is an active research area where the insights gained can guide the design of better collaboration tools, and can also help to identify and select developer talent. In this paper, we present a large-scale analysis of the co-commit patterns in GitHub. We analyze 10 million commits made by 200 thousand developers to 16 thousand repositories, using 17 of the most popular programming languages over a period of 3 years. Although a large volume of data is included in our study, we pay close attention to the participation criteria for repositories and developers. We select repositories by reputation (based on star ranking), and we introduce the notion of active developer in GitHub (observing that a limited subset of developers is responsible for the vast majority of the commits). Using co-authorship networks, we analyze the co-commit patterns of the active developer network for each programming language. We observe that the active developer networks are less connected and more centralized than the general GitHub developer networks, and that the patterns vary significantly among languages. 
We compare our results to other collaborative environments (Wikipedia and scientific research networks), and we also describe the evolution of the co-commit patterns over time.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"35 1","pages":"426-436"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73747216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Structured Information on State and Evolution of Dockerfiles on GitHub
Gerald Schermann, Sali Zumberi, Jürgen Cito
{"title":"Structured Information on State and Evolution of Dockerfiles on GitHub","authors":"Gerald Schermann, Sali Zumberi, Jürgen Cito","doi":"10.1145/3196398.3196456","DOIUrl":"https://doi.org/10.1145/3196398.3196456","url":null,"abstract":"Docker containers are standardized, self-contained units of applications, packaged with their dependencies and execution environment. The environment is defined in a Dockerfile that specifies the steps to reach a certain system state as infrastructure code, with the aim of enabling reproducible builds of the container. To lay the groundwork for research on infrastructure code, we collected structured information about the state and the evolution of Dockerfiles on GitHub and release it as a PostgreSQL database archive (over 100,000 unique Dockerfiles in over 15,000 GitHub projects). Our dataset enables answering a multitude of interesting research questions related to different kinds of software evolution behavior in the Docker ecosystem.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"208 1","pages":"26-29"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73773778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
Who's This? Developer Identification Using IDE Event Data
J. Wilkie, Ziad Al Halabi, Alperen Karaoglu, Jiafeng Liao, George Ndungu, Chaiyong Ragkhitwetsagul, M. Paixão, J. Krinke
{"title":"Who's This? Developer Identification Using IDE Event Data","authors":"J. Wilkie, Ziad Al Halabi, Alperen Karaoglu, Jiafeng Liao, George Ndungu, Chaiyong Ragkhitwetsagul, M. Paixão, J. Krinke","doi":"10.1145/3196398.3196461","DOIUrl":"https://doi.org/10.1145/3196398.3196461","url":null,"abstract":"This paper presents a technique to identify a developer based on their IDE event data. We exploited the KaVE data set which recorded IDE activities from 85 developers with 11M events. We found that using an SVM with a linear kernel on raw event count outperformed k-NN in identifying developers with an accuracy of 0.52. Moreover, after setting the optimal number of events and sessions to train the classifier, we achieved a higher accuracy of 0.69 and 0.71 respectively. The findings show that we can identify developers based on their IDE event data. The technique can be expanded further to group similar developers for IDE feature recommendations.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"13 1","pages":"90-93"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74703839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
The Open-Closed Principle of Modern Machine Learning Frameworks
Houssem Ben Braiek, Foutse Khomh, Bram Adams
{"title":"The Open-Closed Principle of Modern Machine Learning Frameworks","authors":"Houssem Ben Braiek, Foutse Khomh, Bram Adams","doi":"10.1145/3196398.3196445","DOIUrl":"https://doi.org/10.1145/3196398.3196445","url":null,"abstract":"Recent advances in computing technologies and the availability of huge volumes of data have sparked a new machine learning (ML) revolution, where almost every day a new headline touts the demise of human experts by ML models on some task. Open source software development is rumoured to play a significant role in this revolution, with both academics and large corporations such as Google and Microsoft releasing their ML frameworks under an open source license. This paper takes a step back to examine and understand the role of open source development in modern ML, by examining the growth of the open source ML ecosystem on GitHub, its actors, and the adoption of frameworks over time. By mining LinkedIn and Google Scholar profiles, we also examine driving factors behind this growth (paid vs. voluntary contributors), as well as the major players who promote its democratization (companies vs. communities), and the composition of ML development teams (engineers vs. scientists). According to the technology adoption lifecycle, we find that ML is in between the stages of early adoption and early majority. Furthermore, companies are the main drivers behind open source ML, while the majority of development teams are hybrid teams comprising both engineers and professional scientists. The latter correspond to scientists employed by a company, and by far represent the most active profiles in the development of ML applications, which reflects the importance of a scientific background for the development of ML frameworks to complement coding skills. The large influence of cloud computing companies on the development of open source ML frameworks raises the risk of vendor lock-in. 
These frameworks, while open source, could be optimized for specific commercial cloud offerings.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"110 1","pages":"353-363"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72864395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 30
Developer Interaction Traces Backed by IDE Screen Recordings from Think Aloud Sessions
A. Yamashita, Fábio Petrillo, Foutse Khomh, Yann-Gaël Guéhéneuc
{"title":"Developer Interaction Traces Backed by IDE Screen Recordings from Think Aloud Sessions","authors":"A. Yamashita, Fábio Petrillo, Foutse Khomh, Yann-Gaël Guéhéneuc","doi":"10.1145/3196398.3196457","DOIUrl":"https://doi.org/10.1145/3196398.3196457","url":null,"abstract":"There are two well-known difficulties to test and interpret methodologies for mining developer interaction traces: first, the lack of enough large datasets needed by mining or machine learning approaches to provide reliable results; and second, the lack of \"ground truth\" or empirical evidence that can be used to triangulate the results, or to verify their accuracy and correctness. Moreover, relying solely on interaction traces limits our ability to take into account contextual factors that can affect the applicability of mining techniques in other contexts, as well hinders our ability to fully understand the mechanics behind observed phenomena. The data presented in this paper attempts to alleviate these challenges by providing 600+ hours of developer interaction traces, from which 26+ hours are backed with video recordings of the IDE screen and developer's comments. This data set is relevant to researchers interested in investigating program comprehension, and those who are developing techniques for interaction traces analysis and mining.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"38 5 1","pages":"50-53"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75863294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Detection and Analysis of Behavioral T-Patterns in Debugging Activities
César Soto-Valero, Johann Bourcier, B. Baudry
{"title":"Detection and Analysis of Behavioral T-Patterns in Debugging Activities","authors":"César Soto-Valero, Johann Bourcier, B. Baudry","doi":"10.1145/3196398.3196452","DOIUrl":"https://doi.org/10.1145/3196398.3196452","url":null,"abstract":"A growing body of research in empirical software engineering applies recurrent patterns analysis in order to make sense of the developers' behavior during their interactions with IDEs. However, the exploration of hidden real-time structures of programming behavior remains a challenging task. In this paper, we investigate the presence of temporal behavioral patterns (T-patterns) in debugging activities using the THEME software. Our preliminary exploratory results show that debugging activities are strongly correlated with code editing, file handling, window interactions and other general types of programming activities. The validation of our T-patterns detection approach demonstrates that debugging activities are performed on the basis of repetitive and well-organized behavioral events. Furthermore, we identify a large set of T-patterns that associate debugging activities with build success, which corroborates the positive impact of debugging practices on software development.","PeriodicalId":6639,"journal":{"name":"2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)","volume":"91 1","pages":"110-113"},"PeriodicalIF":0.0,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78553208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3