{"title":"Surveying the Developer Experience of Flaky Tests","authors":"Michael C Hilton","doi":"10.1145/3510457.3513037","DOIUrl":"https://doi.org/10.1145/3510457.3513037","url":null,"abstract":"Test cases that pass and fail without changes to the code under test are known as flaky. The past decade has seen increasing research interest in flaky tests, though little attention has been afforded to the views and experiences of software developers. In this study, we utilized a multi-source approach to obtain insights into how developers define flaky tests, their experiences of the impacts and causes of flaky tests, and the actions they take in response to them. To that end, we designed a literature-guided developer survey that we deployed on social media, receiving 170 total responses. We also searched on StackOverflow and analyzed 38 threads relevant to flaky tests, offering a distinct perspective free of any self-reporting bias. Through a mixture of numerical and thematic analyses, this study reveals a number of findings, including (1) developers strongly agree that flaky tests hinder continuous integration; (2) developers who experience flaky tests more often may be more likely to ignore potentially genuine test failures; and (3) developers rate issues in setup and teardown to be the most common causes of flaky tests.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"190 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116851644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Looking for Lacunae in Bitcoin Core's Fuzzing Efforts","authors":"Alex Groce","doi":"10.1145/3510457.3513072","DOIUrl":"https://doi.org/10.1145/3510457.3513072","url":null,"abstract":"Bitcoin is one of the most prominent distributed software systems in the world. This paper describes an effort to investigate and enhance the effectiveness of the Bitcoin Core fuzzing effort. The effort initially began as a query about how to escape saturation in the fuzzing effort, but developed into a more general exploration. This paper summarizes the outcomes of a two-week focused effort. While the effort found no smoking guns indicating major test/fuzz weaknesses, it produced a large number of additional fuzz corpus entries, increased the set of fuzzers used for Bitcoin Core, and ran mutation analysis of Bitcoin Core fuzz targets, with a comparison to Bitcoin functional tests and other cryptocurrencies’ tests. Our conclusion is that for high quality fuzzing efforts, improvements to the oracle may be the best way to get more out of fuzzing.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129178303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Empirical Study on Quality Issues of eBay's Big Data SQL Analytics Platform","authors":"Feng Zhu, Lijie Xu, Gang Ma, Shuping Ji, Jie Wang, Gang Wang, Hongyi Zhang, K. Wan, Ming-ming Wang, Xingchao Zhang, Yuming Wang, Jingpin Li","doi":"10.1145/3510457.3513034","DOIUrl":"https://doi.org/10.1145/3510457.3513034","url":null,"abstract":"Big data SQL analytics platform has evolved as the key infrastructure for business data analysis. Compared with traditional costly commercial RDBMS, scalable solutions with open-source projects, such as SQL-on-Hadoop, are more popular and attractive to enter-prises. In eBay, we build Carmel, a company-wide interactive SQL analytics platform based on Apache Spark. Carmel has been serving thousands of customers from hundreds of teams globally for more than 3 years. Meanwhile, despite the popularity of open-source based big data SQL analytics platforms, few empirical studies on service quality issues (e.g., job failure) were carried out for them. However, a deep understanding of service quality issues and taking right mitigation are significant to the ease of manual maintenance efforts. To fill this gap, we conduct a comprehensive empirical study on 1,884 real-word service quality issues from Carmel. We summa-rize the common symptoms and identify the root causes with typical cases. Stakeholders including system developers, researchers, and platform maintainers can benefit from our findings and implications. Furthermore, we also present lessons learned from critical cases in our daily practice, as well as insights to motivate automatic tool support and future research directions.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130846583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Anti-Pattern Detection in Microservice Architectures Based on Distributed Tracing","authors":"Tim Hübener, M. Chaudron, Yaping Luo, Pieter Vallen, Jonck van der Kogel, Tom Liefheid","doi":"10.1145/3510457.3513066","DOIUrl":"https://doi.org/10.1145/3510457.3513066","url":null,"abstract":"The successful use of microservice-based applications by large companies has popularized this architectural style. One problem with the microservice architecture is that current techniques for visualising- and detecting anti-pattern are inadequate. This study contributes a method and tool for detecting anti-patterns in microservice architecture based on distributed execution traces. We demonstrate this on an industrial case study.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117236709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bug Tracking Process Smells In Practice","authors":"Erdem Tuna, V. Kovalenko, Eray Tüzün","doi":"10.1145/3510457.3513080","DOIUrl":"https://doi.org/10.1145/3510457.3513080","url":null,"abstract":"Software teams use bug tracking (BT) tools to report and manage bugs. Each record in a bug tracking system (BTS) is a reporting entity consisting of several information fields. The contents of the reports are similar across different tracking tools, though not the same. The variation in the workflow between teams prevents defining an ideal process of running BTS. Nevertheless, there are best practices reported both in white and gray literature. Developer teams may not adopt the best practices in their BT process. This study investigates the non-compliance of developers with best practices, so-called smells, in the BT process. We mine bug reports of four projects in the BTS of JetBrains, a software company, to observe the prevalence of BT smells in an industrial setting. Also, we survey developers to see (1) if they recognize the smells, (2) their perception of the severity of the smells, and (3) the potential benefits of a BT process smell detection tool. We found that (1) smells occur, and their detection requires a solid understanding of the BT practices of the projects, (2) smell severity perception varies across smell types, and (3) developers considered that a smell detection tool would be useful for six out of the 12 smell categories.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127388021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI for Automated Code Updates","authors":"Salwa Alamir, Petr Babkin, N. Navarro, Sameena Shah","doi":"10.1145/3510457.3513073","DOIUrl":"https://doi.org/10.1145/3510457.3513073","url":null,"abstract":"Most modern code bases extensively rely on external libraries to provide robust functionality out of the box. When these libraries are updated they can sometimes introduce breaking changes in the process, which require extensive developer maintenance. To mitigate this we propose to use artificial intelligence to parse the text of release notes to capture code deprecations in structured form. This, in turn, enables us to develop an IDE plugin that can automatically detect deprecated library usages in live code bases and even suggest recommended fixes. We evaluated our system on over 30 internal projects within J.P. Morgan.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"121 19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126097848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Record and Replay of Online Traffic for Microservices with Automatic Mocking Point Identification","authors":"Jiangchao Liu, Jierui Liu, Peng Di, A. Liu, Zexin Zhong","doi":"10.1145/3510457.3513029","DOIUrl":"https://doi.org/10.1145/3510457.3513029","url":null,"abstract":"Using recorded online traffic for the regression testing of web applications has become a common practice in industry. However, this “record and replay” on microservices is challenging because simply recorded online traffic (i.e., values for variables or input/output for function calls) often cannot be successfully replayed because microservices often have various dependencies on the complicated online environment. These dependencies include the states of underlying systems, internal states (e.g., caches), and external states (e.g., interaction with other microservices/middleware). Considering the large size and the complexity of industrial microservices, an automatic, scalable, and precise identification of such dependencies is needed as manual identification is time-consuming. In this paper, we propose an industrial grade solution to identifying all dependencies, and generating mocking points automatically using static program analysis techniques. Our solution has been deployed in a large Internet company (i.e., Ant Group) to handle hundreds of microservices, which consists of hundreds of millions lines of code, with high success rate in replay (99% on average). Moreover, our framework can boost the efficiency of the testing system by refining dependencies that must not affect the behavior of a microservice. Our experimental results show that our approach can filter out 73.1% system state dependency and 71.4% internal state dependency, which have no effect on the behavior of the microservice.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"407 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122778373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Software Impact Analysis Tool based on Change History Learning and its Evaluation","authors":"H. Iwasaki, Tsuyoshi Nakajima, Ryota Tsukamoto, Kazuko Takahashi, Shuichi Tokumoto","doi":"10.1145/3510457.3519017","DOIUrl":"https://doi.org/10.1145/3510457.3519017","url":null,"abstract":"Software change impact analysis plays an important role in controlling software evolution in the maintenance of continuous software development. We developed a tool for change impact analysis, which machine-learns change histories and directly outputs candidates of the components to be modified for a change request. We applied the tool to real project data to evaluate it with two metrics: coverage range ratio and accuracy in the coverage range. The results show that it works well for software projects having many change histories for one source code base.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134020793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatically Identifying Shared Root Causes of Test Breakages in SAP HANA","authors":"Gabin An, Juyeon Yoon, Jeongju Sohn, Jingun Hong, Dongwon Hwang, Shin Yoo","doi":"10.1145/3510457.3513051","DOIUrl":"https://doi.org/10.1145/3510457.3513051","url":null,"abstract":"Continuous Integration (CI) of a largescale software system such as SAP HANA can produce a non-trivial number of test breakages. Each breakage that newly occurs from daily runs needs to be manually inspected, triaged, and eventually assigned to developers for debugging. However, not all new breakages are unique, as some test breakages would share the same root cause; in addition, human errors can produce duplicate bug tickets for the same root cause. An automated identification of breakages with shared root causes will be able to significantly reduce the cost of the (typically manual) post-breakage steps. This paper investigates multiple similarity functions between test breakages to assist and automate the identification of test breakages that are caused by the same root cause. We consider multiple information sources, such as static (i.e., the code itself), historical (i.e., whether the test results have changed in a similar way in the past), as well as dynamic (i.e., whether the coverage of test cases are similar to each other), for the purpose of such automation. We evaluate a total of 27 individual similarity functions, using realworld CI data of SAP HANA from a six-month period. Further, using these individual similarity functions as in-put features, we construct a classification model that can predict whether two test breakages share the same root cause or not. When trained using ground truth labels extracted from the issue tracker of SAP HANA, our model achieves an F1 score of 0.743 when evaluated using a set of unseen test breakages collected over three months. Our results show that a classification model based on test similarity functions can successfully support the bug triage stage of a CI pipeline.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134107195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges in Applying Continuous Experimentation: A Practitioners' Perspective","authors":"K. Anderson, Denise Visser, Jan-Willem Mannen, Yuxiang Jiang, A. Deursen","doi":"10.1145/3510457.3513052","DOIUrl":"https://doi.org/10.1145/3510457.3513052","url":null,"abstract":"Background: Applying Continuous Experimentation on a large scale is not easily achieved. Although the evolution within large tech organisations is well understood, we still lack a good understanding of how to transition a company towards applying more experiments. Objective: This study investigates how practitioners define, value and apply experimentation, the blockers they experience and what to do to solve these. Method: We interviewed and surveyed over one hundred practitioners with regards to experimentation perspectives, from a large financial services and e-commerce organization, based in the Netherlands. Results: Many practitioners have different perspectives on experimentation. The value is well understood. We have learned that the practitioners are blocked by a lack of priority, experience and well functioning tooling. Challenges also arise around dependencies between teams and evaluating experiments with the correct metrics. Conclusions: Organisation leaders need to start asking for experiment results and investing in infrastructure and processes to actually enable teams to execute experiments and show the value of their work in terms of value for customers and business.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124969102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}