{"title":"Tracing back the history of commits in low-tech reviewing environments: a case study of the Linux kernel","authors":"Yujuan Jiang, Bram Adams, Foutse Khomh, D. Germán","doi":"10.1145/2652524.2652542","DOIUrl":"https://doi.org/10.1145/2652524.2652542","url":null,"abstract":"<u>Context</u>: During software maintenance, people typically go back to the original reviews of a patch to understand the actual design rationale and potential risks of the code. Whereas modern web-based reviewing environments like Gerrit make this process relatively easy, the low-tech, mailing-list-based reviewing environments of many open source systems make linking a commit back to its reviews and earlier versions far from trivial, since (1) a commit has no physical link with any reviewing email, (2) the discussed patches are not always fully identical to the accepted commits and (3) some discussions last across multiple email threads, each of which potentially contains multiple versions of the same patch.\u0000 <u>Goal</u>: To support maintainers in reconstructing the reviewing history of kernel patches, and to study (for the first time) the characteristics of the recovered reviewing histories.\u0000 <u>Method</u>: This paper performs a comparative empirical study on the Linux kernel mailing lists of 3 email-to-email and email-to-commit linking techniques based on checksums, common patch lines and clone detection.\u0000 <u>Results</u>: Around 25% of the patches had an (until now) hidden reviewing history of more than four weeks, and patches with multiple versions typically are larger and have a higher acceptance rate than patches with just one version.\u0000 <u>Conclusion</u>: The plus-minus-line-based technique is the best approach for linking patch emails to commits, while it needs to be combined with the checksum-based technique for linking different patch versions.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and 
Measurement","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114882145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enriching source code by empirical metadata","authors":"K. Rástočný, M. Bieliková","doi":"10.1145/2652524.2652596","DOIUrl":"https://doi.org/10.1145/2652524.2652596","url":null,"abstract":"Empirical metadata that describe activities of developers with source code are often stored as logs with only basic references to source code. These approaches suffer from several problems. First, logs are attached to whole files, which makes the collected data hard to analyse. The second problem is the dynamism of source code. If we have only logs about developers' activities over source code, we cannot be sure whether the source code still exists or has been changed.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130283249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ISBSG variables most frequently used for software effort estimation: a mapping review","authors":"Fernando González-Ladrón-de-Guevara, M. Fernández-Diego","doi":"10.1145/2652524.2652550","DOIUrl":"https://doi.org/10.1145/2652524.2652550","url":null,"abstract":"Background: The International Software Benchmarking Standards Group (ISBSG) dataset makes it possible to estimate a project's size, effort, duration, and cost.\u0000 Aim: The aim was to analyze the ISBSG variables that have been used by researchers for software effort estimation from 2000, when the first papers were published, until the end of 2013.\u0000 Method: A systematic mapping review was applied to over 167 papers obtained after the filtering process. From these, it was found that 133 papers produce effort estimation and only 107 list the independent variables used in the effort estimation models.\u0000 Results: Seventy-one out of 118 ISBSG variables have been used at least once. There is a group of 20 variables that appear in more than 50% of the papers and include Functional Size (62%), Development Type (58%), Language Type (53%), and Development Platform (52%) following ISBSG recommendations. Sizing and Size attributes altogether represent the most relevant group along with Project attributes that includes 24 technical features of the project and the development platform. All in all, variables that have more missing values are used less frequently.\u0000 Conclusions: This work presents a snapshot of the existing usage of ISBSG variables in software development estimation. 
Moreover, some insights are provided to guide future studies.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123810081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forking and coordination in multi-platform development: a case study","authors":"A. Duc, A. Mockus, Randy L. Hackbarth, J. Palframan","doi":"10.1145/2652524.2652546","DOIUrl":"https://doi.org/10.1145/2652524.2652546","url":null,"abstract":"[Context] With the proliferation of desktop and mobile platforms the development and maintenance of identical or similar applications on multiple platforms is urgently needed. [Goal] We study a software product deployed to more than 25 software/hardware combinations over 10 years to understand multi-platform development practices. [Method] We use semi-structured interviews, project wikis, VCSs and issue tracking systems to understand and quantify these practices. [Results] We find that the projects use MR cloning, MR review meetings, and a cross-platform coordinator role as three primary means of coordination. We find that forking code temporarily relieves the coordination needs and is driven by divergent schedules, market needs, and organizational policy. Based on our qualitative findings we propose quantitative measures of coordination, redundant work, and parallel development. [Conclusions] A model of coordination intensity suggests that it is related to the amount of parallel and redundant work. 
We hope that this work will provide a basis for quantitative understanding of issues faced in multi-platform software development.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128335673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The use of systematic reviews in evidence based software engineering: a systematic mapping study","authors":"Ronnie E. S. Santos, C. Magalhães, F. Silva","doi":"10.1145/2652524.2652553","DOIUrl":"https://doi.org/10.1145/2652524.2652553","url":null,"abstract":"Context. A decade ago, Kitchenham, Dybå and Jørgensen argued that software engineering could benefit from an evidence-based research approach similar to that used in medicine, introducing the basis for Evidence Based Software Engineering (EBSE). Objective. Our main goal is to understand the evolution of the use of systematic reviews as the main research method in EBSE, as proposed by Kitchenham et al., by investigating primary and tertiary studies that explore any aspect, theory, or concept around the use of systematic reviews in software engineering. Method. A systematic mapping study protocol was used to find and select studies about EBSE and systematic reviews in SE, published between 2004 and 2013. Results. We selected 52 unique papers classified as non-empirical studies (12), empirical studies (31), and tertiary studies (9). Conclusion. SLR has become an important component of software engineering research with nearly 200 unique reviews catalogued by the tertiary studies. 
The most important limitations relate to the industrial relevance and application of review results and the poor use of synthesis methods to aggregate evidence.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131972527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FixerCache: unsupervised caching active developers for diverse bug triage","authors":"Song-Yun Wang, Wen Zhang, Qing Wang","doi":"10.1145/2652524.2652536","DOIUrl":"https://doi.org/10.1145/2652524.2652536","url":null,"abstract":"Context: Bug triage aims to recommend appropriate developers for new bugs in order to reduce time and effort in bug resolution. Most previous approaches for bug triage are supervised. Before recommending developers, these approaches need to learn developers' bug-fix preferences via building and training models using text-information of developers' historical bug reports.\u0000 Goal: In this paper, we empirically address three limitations of supervised bug triage approaches and propose FixerCache, an unsupervised approach for bug triage by caching developers based on their activeness in components of products.\u0000 Method: In FixerCache, each component of a product has a dynamic developer cache which contains prioritized developers according to developers' activeness scores. Given a new bug report, FixerCache recommends fixers with high activeness in developer cache to participate in fixing the new bug.\u0000 Results: Results of experiments on four products from Eclipse and Mozilla show that FixerCache outperforms supervised bug triage approaches in both prediction accuracy and diversity. And it can achieve prediction accuracy up to 96.32% and diversity up to 91.67%, with top-10 recommendation list.\u0000 Conclusions: FixerCache recommends fixers for new bugs based on developers' activeness in components of products with high prediction accuracy and diversity. 
Moreover, since FixerCache does not need to learn developers' bug-fix preferences through complex and time consuming processes, it could reduce bug triage time from hours of supervised approaches to seconds.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134175076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On knowledge transfer skill in pair programming","authors":"Franz Zieris, L. Prechelt","doi":"10.1145/2652524.2652529","DOIUrl":"https://doi.org/10.1145/2652524.2652529","url":null,"abstract":"Context: General knowledge transfer is often considered a valuable effect or side-effect of pair programming, but even more important is its role for the success of the pair programming session itself: The partners often need to explain an idea to carry the process forward. Goal: Understand the mechanisms at work when knowledge is transferred during a pair programming session; provide practical advice for constructive behavior. Method: Qualitative data analysis of recordings of actual industrial pair programming sessions. Results: Some pairs are much more efficient in their knowledge transfer than others. These pairs manage to (1) not attempt to explain multiple things at once, (2) not lose sight of a topic, (3) clarify difficult points in stages. Conclusions: Pair programming requires skill beyond software development skill. To be able to identify knowledge needs and then push such knowledge to or pull it from the partner successfully is one aspect of such skill. We characterize a number of its elements.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132849946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reviewing technical approaches for sharing and preservation of experimental data","authors":"C. EfraínR.Fonseca, Óscar Dieste Tubío, Natalia Juristo Juzgado, Estefanía Serral, S. Biffl","doi":"10.1145/2652524.2652600","DOIUrl":"https://doi.org/10.1145/2652524.2652600","url":null,"abstract":"Context: Empirical Software Engineering (ESE) replication researchers need to store and manipulate experimental data for several purposes, in particular analysis and reporting. Current research needs call for sharing and preservation of experimental data as well. In a previous work, we analyzed Replication Data Management (RDM) needs. A novel concept, called Experimental Ecosystem, was proposed to solve current deficiencies in RDM approaches. The empirical ecosystem provides replication researchers with a common framework that transparently integrates local heterogeneous data sources. A typical situation where the Empirical Ecosystem is applicable is when several members of a research group, or several research groups collaborating together, need to share and access each other's experimental results. However, to be able to apply the Empirical Ecosystem concept and deliver all promised benefits, it is necessary to analyze the software architectures and tools that can properly support it.\u0000 Goal: Identify the most appropriate technologies for the implementation of the Empirical Ecosystem concept.\u0000 Method: For the purpose of technology identification, four features are particularly relevant: Volume of data, architecture, data semantics and manipulation facilities. Those features were surveyed in repositories and data sharing and preservation tools used in the sciences by means of a systematic literature review.\u0000 Results: 17 sharing and preservation tools reported in the literature were identified. The fields of Genomics and Proteomics, and secondarily Biology, stand out. 
Given the importance of those disciplines in today's science and economy, it would not be surprising if many other proprietary tools had gone unnoticed. Regarding repositories, there are hundreds available (either publicly or with restricted access) on the Internet. Typically, they aim at benchmarking, or reanalysis and synthesis of existing empirical studies. Most repositories (both in number and importance) belong to the \"hard sciences\" (e.g. biology, physics, etc.), but virtually every research area is represented, including ESE.\u0000 Most tools and repositories use relational databases for data storage, with very few exceptions. When the amount of stored data is very high (e.g. Genomics), relational databases are being substituted by big data management infrastructures such as Apache™ Hadoop®. Relational databases are also used when data are distributed. Global conceptual models guarantee the interoperability among different data sources. When data are heterogeneous, the situation is more complex. Standard conceptual schemas may not be useful, because the semantics of the local data do not necessarily agree with the meaning assigned to the global schema. Likewise, large parts of the conceptual schema may not be applicable to local data sources, and the links among local models may not be easily defined. The current trend is abandoning classical conceptual schemas (e.g. 
entity-relati","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125327449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovering buffer overflow vulnerabilities in the wild: an empirical study","authors":"Ming Fang, M. Hafiz","doi":"10.1145/2652524.2652533","DOIUrl":"https://doi.org/10.1145/2652524.2652533","url":null,"abstract":"Context: Reporters of security vulnerabilities possess rich information about the security engineering process. Goal: We performed an empirical study on reporters of buffer overflow vulnerabilities to understand the methods and tools used during the discovery. Method: We ran the study in the form of an email questionnaire with open ended questions. The participants were reporters featured in the SecurityFocus repository during two six-month periods; we collected 58 responses. Results: We found that in spite of many apparent choices, reporters follow similar approaches. Most reporters typically use fuzzing, but their fuzzing tools are created ad hoc; they use a few debugging tools to analyze the crash introduced by a fuzzer; and static analysis tools are rarely used. We also found a serious problem in the vulnerability reporting process. Most reporters, especially the experienced ones, favor full-disclosure and do not collaborate with the vendors of vulnerable software. They think that the public disclosure, sometimes supported by a detailed exploit, will put pressure on vendors to fix the vulnerabilities. But, in practice, the vulnerabilities not reported to vendors are less likely to be fixed. 
Conclusions: The results are valuable for beginners exploring how to detect and report buffer overflows and for tool vendors and researchers exploring how to automate and fix the process.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114817494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motivated software engineers are engaged and focused, while satisfied ones are happy","authors":"A. C. A. França, H. Sharp, F. Silva","doi":"10.1145/2652524.2652545","DOIUrl":"https://doi.org/10.1145/2652524.2652545","url":null,"abstract":"Context -- Motivation and job satisfaction are not the same thing, and although business organization research recognized this a long time ago, in Software Engineering research, we have not. As a result, thirty years of research on motivation in software engineering has produced knowledge on what makes software engineers generally happier, but not about how to increase their motivation. Goal -- In this article, we aim to identify visible signs of a software engineer who is motivated to work. Method -- We describe a field study in which 62 practitioners in Brazil reported their view of \"motivation\" in the context of their practical work. Data was collected by means of audio-recorded semi-structured interviews, and a thematic analysis was applied to identify the most relevant descriptors of motivation. Results -- Our data reveal that (1) motivated Software Engineers are engaged, focused, and collaborative; and (2) the term \"motivation\" is used as an umbrella term to cover several distinct organizational behaviours that are not necessarily related to the individual's desire to work. Conclusions -- Without a clear picture of the difference between these two concepts, work-based motivation programs may not be designed effectively to address either turnover or performance issues. 
Overall, this work indicates the need for a more effective conceptual system to investigate and encourage both job satisfaction and work motivation in software engineering research and practice.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130214011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}