2022 IEEE International Conference on Software Maintenance and Evolution (ICSME): Latest Papers

An Empirical Study on the Usage of Automated Machine Learning Tools

Pub Date: 2022-08-28 | DOI: 10.1109/ICSME55016.2022.00014

Forough Majidi, Moses Openja, Foutse Khomh, Heng Li

Abstract: The popularity of automated machine learning (AutoML) tools in different domains has increased over the past few years. Machine learning (ML) practitioners use AutoML tools to automate and optimize tasks such as feature engineering, model training, and hyperparameter optimization. Recent work has performed qualitative studies of practitioners’ experiences with AutoML tools and compared different AutoML tools based on their performance and provided features, but none of the existing work has studied the practices of using AutoML tools in real-world projects at a large scale. Therefore, we conducted an empirical study to understand how ML practitioners use AutoML tools in their projects. To this end, we examined the top 10 most used AutoML tools and their respective usages in a large number of open-source project repositories hosted on GitHub. The results of our study show 1) which AutoML tools are mostly used by ML practitioners and 2) the characteristics of the repositories that use these AutoML tools. We also identified the purposes of using AutoML tools (e.g., model parameter sampling, search space management, model evaluation/error analysis, data/feature transformation, and data labeling) and the stages of the ML pipeline (e.g., feature engineering) where AutoML tools are used. Finally, we report how often AutoML tools are used together in the same source code files. We hope our results can help ML practitioners learn about different AutoML tools and their usages, so that they can pick the right tool for their purposes. Besides, AutoML tool developers can benefit from our findings to gain insight into the usages of their tools and improve their tools to better fit users’ needs.

Citations: 4

Don’t Reinvent the Wheel: Towards Automatic Replacement of Custom Implementations with APIs

Pub Date: 2022-08-16 | DOI: 10.1109/ICSME55016.2022.00046

Rosalia Tufano, Emad Aghajani, G. Bavota

Citations: 0

How to Configure Masked Event Anomaly Detection on Software Logs?

Pub Date: 2022-08-03 | DOI: 10.1109/ICSME55016.2022.00050

Jesse Nyyssölä, M. Mäntylä, M. Varela

Abstract: Software log anomaly detection with masked event prediction has various technical approaches with countless configurations and parameters. Our objective is to provide a baseline of settings for similar studies in the future. The models we use are the N-Gram model, which is a classic approach in the field of natural language processing (NLP), and two deep learning (DL) models: long short-term memory (LSTM) and convolutional neural network (CNN). We used four datasets: Profilence, BlueGene/L (BGL), Hadoop Distributed File System (HDFS), and Hadoop. Other settings are the size of the sliding window, which determines how many surrounding events we use to predict a given event; the mask position (the position within the window that we predict); the usage of only unique sequences; and the portion of data that is used for training. The results show clear indications of settings that can be generalized across datasets. The performance of the DL models does not deteriorate as the window size increases, while the N-Gram model shows worse performance with large window sizes on the BGL and Profilence datasets. Despite the popularity of Next Event Prediction, the results show that in this context it is better not to predict events at the edges of the subsequence, i.e., the first or last event, with the best result coming from predicting the fourth event when the window size is five. Regarding the amount of data used for training, the results show differences across datasets and models. For example, the N-Gram model appears to be more sensitive to a lack of data than the DL models. Overall, for similar experimental setups we suggest the following general baseline: window size 10, mask position second to last, do not filter out non-unique sequences, and use half of the total data for training.
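The roles of window size and mask position can be illustrated with a minimal count-based sketch. This is not the authors' implementation: the function names `train_masked_ngram` and `flag_anomalies` are hypothetical, the model is a plain lookup table of contexts rather than the N-Gram, LSTM, or CNN models from the paper, and treating a mismatch with the single most frequent prediction as an anomaly is an assumption made for illustration.

```python
from collections import Counter, defaultdict


def windows(seq, size):
    """Yield every contiguous window of `size` events from a log sequence."""
    for i in range(len(seq) - size + 1):
        yield tuple(seq[i:i + size])


def split(window, mask_pos):
    """Separate the masked event from the surrounding context events."""
    context = window[:mask_pos] + window[mask_pos + 1:]
    return context, window[mask_pos]


def train_masked_ngram(sequences, size, mask_pos):
    """Count which event appears at `mask_pos` for each surrounding context."""
    model = defaultdict(Counter)
    for seq in sequences:
        for window in windows(seq, size):
            context, target = split(window, mask_pos)
            model[context][target] += 1
    return model


def flag_anomalies(model, seq, size, mask_pos):
    """Return indices of events that differ from the most frequent prediction.

    Contexts never seen in training are skipped (an illustrative
    simplification; a real detector would need a policy for them).
    """
    flagged = []
    for i, window in enumerate(windows(seq, size)):
        context, actual = split(window, mask_pos)
        seen = model.get(context)
        if not seen:
            continue
        predicted = seen.most_common(1)[0][0]
        if predicted != actual:
            flagged.append(i + mask_pos)
    return flagged
```

With `size=5` and `mask_pos=3` (zero-based), the sketch predicts the fourth event of a five-event window, the setting the abstract reports as giving the best result for that window size. Setting `mask_pos=size - 1` would turn it into next event prediction.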

Citations: 1

Together or Apart? Investigating a mediator bot to aggregate bot’s comments on pull requests

Pub Date: 2022-08-02 | DOI: 10.1109/ICSME55016.2022.00054

Eric Ribeiro, Ronan Nascimento, Igor Steinmacher, Laerte Xavier, M. Gerosa, Hugo de Paula, M. Wessel

Citations: 2

An Exploratory Study of Documentation Strategies for Product Features in Popular GitHub Projects

Pub Date: 2022-08-02 | DOI: 10.1109/ICSME55016.2022.00043

Tim Puhlfurss, Lloyd Montgomery, W. Maalej

Abstract: [Background] In large open-source software projects, development knowledge is often fragmented across multiple artefacts and contributors such that individual stakeholders are generally unaware of the full breadth of the product features. However, users want to know what the software is capable of, while contributors need to know where to fix, update, and add features. [Objective] This work aims at understanding how feature knowledge is documented in GitHub projects and how it is linked (if at all) to the source code. [Method] We conducted an in-depth qualitative exploratory content analysis of 25 popular GitHub repositories that provided the documentation artefacts recommended by GitHub’s Community Standards indicator. We first extracted strategies used to document software features in textual artefacts and then strategies used to link the feature documentation with source code. [Results] We observed feature documentation in all studied projects in artefacts such as READMEs, wikis, and website resource files. However, the features were often described in an unstructured way. Additionally, tracing techniques to connect feature documentation and source code were rarely used. [Conclusions] Our results suggest lacking (or low-prioritised) feature documentation in open-source projects, little use of normalised structures, and rare explicit referencing to source code. As a result, product feature traceability is likely to be very limited, and maintainability is likely to suffer over time.

Citations: 0

Adding Context to Source Code Representations for Deep Learning

Pub Date: 2022-07-30 | DOI: 10.1109/ICSME55016.2022.00042

Fuwei Tian, Christoph Treude

Citations: 1

Developers Struggle with Authentication in Blazor WebAssembly

Pub Date: 2022-07-30 | DOI: 10.1109/ICSME55016.2022.00045

Pascal André, Quentin Stiévenart, Mohammad Ghafari

Citations: 1

Perun: Performance Version System

Pub Date: 2022-07-26 | DOI: 10.1109/ICSME55016.2022.00067

Tomáš Fiedor, Jiří Pavela, Adam Rogalewicz, Tomáš Vojnar

Citations: 0

What Made This Test Flake? Pinpointing Classes Responsible for Test Flakiness

Pub Date: 2022-07-20 | DOI: 10.1109/ICSME55016.2022.00039

Sarra Habchi, Guillaume Haben, Jeongju Sohn, Adriano Franci, Mike Papadakis, Maxime Cordy, Yves Le Traon

Abstract: Flaky tests are defined as tests that manifest non-deterministic behaviour by passing and failing intermittently for the same version of the code. These tests cripple continuous integration with false alerts that waste developers’ time and break their trust in regression testing. To mitigate the effects of flakiness, both researchers and industrial experts proposed strategies and tools to detect and isolate flaky tests. However, flaky tests are rarely fixed as developers struggle to localise and understand their causes. Additionally, developers working with large codebases often need to know the sources of non-determinism to preserve code quality, i.e., avoid introducing technical debt linked with non-deterministic behaviour, and to avoid introducing new flaky tests. To aid with these tasks, we propose re-targeting Fault Localisation techniques to the flaky component localisation problem, i.e., pinpointing program classes that cause the non-deterministic behaviour of flaky tests. In particular, we employ Spectrum-Based Fault Localisation (SBFL), a coverage-based fault localisation technique commonly adopted for its simplicity and effectiveness. We also utilise other data sources, such as change history and static code metrics, to further improve the localisation. Our results show that augmenting SBFL with change and code metrics ranks flaky classes in the top-1 and top-5 suggestions in 26% and 47% of the cases, respectively. Overall, we successfully reduced the average number of classes inspected to locate the first flaky class to 19% of the total number of classes covered by flaky tests. Our results also show that localisation methods are effective in major flakiness categories, such as concurrency and asynchronous waits, indicating their general ability to identify flaky components.
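The idea of re-targeting SBFL to flaky classes can be sketched as follows. This is a minimal illustration, not the authors' tool: the abstract does not name a suspiciousness formula, so Ochiai is used here as one common SBFL choice, flaky tests are assumed to play the role that failing tests play in ordinary fault localisation, and the function name `rank_classes` and its input layout are hypothetical. The change-history and code-metric augmentation described in the abstract is not covered.

```python
import math


def rank_classes(coverage, flaky_tests):
    """Rank classes by Ochiai suspiciousness, highest first.

    coverage:    dict mapping each test name to the set of classes it covers.
    flaky_tests: set of test names known to be flaky; every name is assumed
                 to be a key of `coverage`.
    """
    total_flaky = len(flaky_tests)
    all_classes = set().union(*coverage.values())
    scores = {}
    for cls in all_classes:
        # Number of flaky and stable tests that execute this class.
        covered_flaky = sum(
            1 for test, classes in coverage.items()
            if cls in classes and test in flaky_tests
        )
        covered_stable = sum(
            1 for test, classes in coverage.items()
            if cls in classes and test not in flaky_tests
        )
        denominator = math.sqrt(total_flaky * (covered_flaky + covered_stable))
        scores[cls] = covered_flaky / denominator if denominator else 0.0
    # Sort by descending score, then by name so that ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A class covered only by flaky tests receives the highest score, while a class shared equally between flaky and stable tests is ranked lower. A developer would then inspect the classes from the top of the list down.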

Citations: 4

An Empirical Study of Flaky Tests in JavaScript

Pub Date: 2022-07-03 | DOI: 10.1109/ICSME55016.2022.00011

Negar Hashemi, Amjed Tahir, Shawn Rasheed

Abstract: Flaky tests (tests with non-deterministic outcomes) can be problematic for testing efficiency and software reliability. Flaky tests in test suites can also significantly delay software releases. There have been several studies that attempt to quantify the impact of test flakiness in different programming languages (e.g., Java and Python) and application domains (e.g., mobile and GUI-based). In this paper, we conduct an empirical study of the state of flaky tests in JavaScript. We investigate two aspects of flaky tests in JavaScript projects: the main causes of flaky tests in these projects and common fixing strategies. By analysing 452 commits from large, top-scoring JavaScript projects from GitHub, we found that flakiness caused by concurrency-related issues (e.g., async wait, race conditions, or deadlocks) is the most dominant reason for test flakiness. The other top causes of flaky tests are operating system-specific (e.g., features that work on specific OS or OS versions) and network stability (e.g., internet availability or bad socket management). In terms of how flaky tests are dealt with, the majority of those flaky tests (>80%) are fixed to eliminate flaky behaviour, and developers sometimes skip, quarantine, or remove flaky tests.

Citations: 5