{"title":"Software Architecture Challenges for ML Systems","authors":"Grace A. Lewis, I. Ozkaya, Xiwei Xu","doi":"10.26226/morressier.613b5419842293c031b5b647","DOIUrl":"https://doi.org/10.26226/morressier.613b5419842293c031b5b647","url":null,"abstract":"Developing machine learning (ML) systems, just like any other system, requires architecture thinking. However, there are characteristics of ML components that create challenges and unique quality attribute (QA) concerns for software architecture and design activities, such as data-dependent behavior, detecting and responding to drift over time, and timely capture of ground truth to inform retraining. This paper presents four categories of software architecture challenges that need to be addressed to support ML system development, maintenance and evolution: software architecture practices for ML systems, architecture patterns and tactics for ML-important QAs, monitorability as a driving QA, and co-architecting and co-versioning. These challenges were collected from targeted workshops, practitioner interviews, and industry engagements. The goal of our work is to encourage further research in these areas and use the information presented in this paper to guide the development of empirically-validated practices for architecting ML systems.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134170803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding Quantum Software Engineering Challenges: An Empirical Study on Stack Exchange Forums and GitHub Issues","authors":"Mohamed Raed El aoun, Heng Li, Foutse Khomh, Moses Openja","doi":"10.26226/morressier.613b5418842293c031b5b5dd","DOIUrl":"https://doi.org/10.26226/morressier.613b5418842293c031b5b5dd","url":null,"abstract":"With the advance of quantum computing, quantum software becomes critical for exploring the full potential of quantum computing systems. Recently, quantum software engineering (QSE) has become an emerging area attracting more and more attention. However, it is not clear what challenges and opportunities quantum computing presents to the software engineering community. This work aims to understand the QSE-related challenges perceived by developers. We perform an empirical study on Stack Exchange forums where developers post QSE-related questions & answers and GitHub issue reports where developers raise QSE-related issues in practical quantum computing projects. Based on an existing taxonomy of question types on Stack Overflow, we first perform a qualitative analysis of the types of QSE-related questions asked on Stack Exchange forums. We then use automated topic modeling to uncover the topics in QSE-related Stack Exchange posts and GitHub issue reports. Our study highlights some particularly challenging areas of QSE that are different from those of traditional software engineering, such as explaining the theory behind quantum computing code, interpreting quantum program outputs, and bridging the knowledge gap between quantum computing and classical computing, as well as their associated opportunities.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133474384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"You Look so Different: Finding Structural Clones and Subclones in Java Source Code","authors":"W. Amme, Thomas S. Heinze, André Schäfer","doi":"10.1109/ICSME52107.2021.00013","DOIUrl":"https://doi.org/10.1109/ICSME52107.2021.00013","url":null,"abstract":"Code reuse and copying is a widespread practice in software development. Detecting code clones, i.e., identical or similar fragments of code, is thus an important task with many applications, ranging from code search to bug finding and malware detection. In this paper, we propose a new approach to detect code clones in source code. Instead of analyzing the code tokens or syntax, our technique is based upon control flow analysis and dominator trees. In this way, the technique not only detects exact and syntactically similar near-miss code clones but also two new types of clones, which we characterize as structural code clones and subclones. For implementation and evaluation, we have developed the tool StoneDetector, which finds code clones in Java source code. StoneDetector performs competitively with the state of the art as measured on the BigCloneBench benchmark and finds more structural clones and subclones.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122970110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is reputation on Stack Overflow always a good indicator for users' expertise? No!","authors":"Shaowei Wang, D. Germán, T. Chen, Yuan Tian, Ahmed E. Hassan","doi":"10.1109/ICSME52107.2021.00067","DOIUrl":"https://doi.org/10.1109/ICSME52107.2021.00067","url":null,"abstract":"Stack Overflow (SO) users are recognized by reputation points. The reputation points are often a great avenue for users to build their career profile and demonstrate their expertise in some domains. Prior studies used users' reputation as a proxy to estimate their experience and expertise. However, there are various ways for a user to earn reputation points that do not require much expertise, such as asking high-quality questions. Therefore, it is important to understand the meaning of a high-reputation point and if the reputation could be used as a good indicator for users' expertise and experience on Stack Overflow. In this study, we explore how users earn reputation points on Stack Overflow by mining their reputation-related activities (e.g., asking questions, answering questions, and editing posts). We study the reputation-related activities of 93,053 high-reputation users that have at least 1,000 reputation points. We find that 1) 13.8% of the studied users earn the majority of their reputation points through asking questions rather than answering questions. 2) In general, most of the posted answers received no or very few reputation points with users gaining their points from a very small proportion of highly-voted answers. 12% of users' entire reputation comes from one single answer. We suggest future research and Stack Overflow introduce a new metric (i.e., vindex) to evaluate the expertise of a user.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117221520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SmartGift: Learning to Generate Practical Inputs for Testing Smart Contracts","authors":"Teng Zhou, Kui Liu, Li Li, Zhe Liu, Jacques Klein, Tegawendé F. Bissyandé","doi":"10.1109/ICSME52107.2021.00009","DOIUrl":"https://doi.org/10.1109/ICSME52107.2021.00009","url":null,"abstract":"With the boom of Initial Coin Offerings (ICO) in the financial markets, smart contracts have gained rapid popularity among consumers. Smart contract vulnerabilities, however, have made them a prime target for malicious attacks that are leading to huge losses. The research community is thus applying various software engineering technologies to smart contracts to address them. In general, to detect vulnerabilities in smart contracts, mutation and fuzz based testing approaches have been widely studied and indeed achieved promising performance on benchmark datasets. Generating test inputs with mutation approaches essentially relies on the available test cases in a smart contract program. In our preliminary study, however, we observed that 56.4% of 218 identified open-source smart contract project repositories do not provide any test case for validation. Fuzzing test inputs leads to random values and lacks practical usefulness. Our work addresses this problem: we propose an approach, SmartGift, which generates practical inputs for testing smart contracts by learning from the transaction records of real-world smart contracts. Leveraging a collected set of over 60 thousand transaction records, SmartGift is able to generate relevant test inputs for ~77% of smart contract functions, largely outperforming the traditional fuzzing approach (successful for only 60% of functions). We further demonstrate the practicality of the test inputs by using them to replace the test inputs of ContractFuzzer, a state-of-the-art smart contract vulnerability detector: with inputs by SmartGift, ContractFuzzer can now detect 131 of the 154 vulnerabilities in its benchmark.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126622505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A First Look at Accessibility Issues in Popular GitHub Projects","authors":"Tingting Bi, Xin Xia, D. Lo, A. Aleti","doi":"10.1109/ICSME52107.2021.00041","DOIUrl":"https://doi.org/10.1109/ICSME52107.2021.00041","url":null,"abstract":"Accessibility design elements allow people to access software products and services independent of their different abilities. However, accessibility is challenging to handle and whether accessibility is widely considered in software projects is unclear. In this work, we aim to understand if accessibility is a prevalent consideration in practice, what accessibility issues are discussed in GitHub projects, what potential reasons cause accessibility issues, and what solutions (e.g., tools and standards) are applied for addressing accessibility issues. We collect 11,820 accessibility issues and their threads discussed by developers in popular GitHub projects. We manually analyzed and grouped the collected accessibility issues into seven categories. The results of our study uncover that accessibility is widely discussed in general projects, and the potential reasons that cause accessibility issues are that developers are not aware of the importance of accessibility and that they lack knowledge about accessibility concerns, standards, and existing tools. Our results and findings can enhance and improve developers' knowledge and awareness when they conduct accessibility-relevant design or incorporate accessibility elements into their projects.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128091983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Traceability Link Recovery Using Fine-grained Requirements-to-Code Relations","authors":"Tobias Hey, Fei Chen, Sebastian Weigelt, W. Tichy","doi":"10.1109/ICSME52107.2021.00008","DOIUrl":"https://doi.org/10.1109/ICSME52107.2021.00008","url":null,"abstract":"Traceability information is a fundamental prerequisite for many essential software maintenance and evolution tasks, such as change impact and software reusability analyses. However, manually generating traceability information is costly and error-prone. Therefore, researchers have developed automated approaches that utilize textual similarities between artifacts to establish trace links. These approaches tend to achieve low precision at reasonable recall levels, as they are not able to bridge the semantic gap between high-level natural language requirements and code. We propose to overcome this limitation by leveraging fine-grained (method- and sentence-level) similarities between the artifacts for traceability link recovery. Our approach uses word embeddings and a Word Mover's Distance-based similarity to bridge the semantic gap. The fine-grained similarities are aggregated according to the artifacts' structure and participate in a majority vote to retrieve coarse-grained, requirement-to-class, trace links. In a comprehensive empirical evaluation, we show that our approach is able to outperform state-of-the-art unsupervised traceability link recovery approaches. Additionally, we illustrate the benefits of fine-grained structural analyses to word embedding-based trace link generation.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126366669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disambiguating Mentions of API Methods in Stack Overflow via Type Scoping","authors":"K. Luong, Ferdian Thung, D. Lo","doi":"10.1109/ICSME52107.2021.00080","DOIUrl":"https://doi.org/10.1109/ICSME52107.2021.00080","url":null,"abstract":"Stack Overflow is one of the most popular venues for developers to find answers to their API-related questions. However, API mentions in informal text content of Stack Overflow are often ambiguous and thus it could be difficult to find the APIs and learn their usages. Disambiguating these API mentions is not trivial, as an API mention can match with names of APIs from different libraries or even the same one. In this paper, we propose an approach called DATYS to disambiguate API mentions in informal text content of Stack Overflow using type scoping. With type scoping, we consider API methods whose type (i.e., class or interface) appears in more parts (i.e., scopes) of a Stack Overflow thread as more likely to be the API method that the mention refers to. We have evaluated our approach on a dataset of 807 API mentions from 380 threads containing discussions of API methods from four popular third-party Java libraries. Our experiment shows that our approach beats the state-of-the-art by 42.86% in terms of F1-score.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125222769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Impact of Continuous Code Quality Assessment on Defects","authors":"R. Pfeiffer","doi":"10.1109/ICSME52107.2021.00069","DOIUrl":"https://doi.org/10.1109/ICSME52107.2021.00069","url":null,"abstract":"Continuous Code Quality Assessment (CCQA) tools promise that increasing code quality leads to fewer defects, i.e., that software quality from the user view can be increased by increasing quality of the product. Currently, there is limited evidence that applying CCQA tools, such as SonarCloud (SC), during software development actually reduces the number of defects over time. In this paper, we study five open-source projects that adopt SC for CCQA and we compare frequencies of defect reports before and after adoption of SC. For only one project (Apache Ratis), we find a statistically significant decrease of defects after adoption of the tool. After closer investigation, we find that this decrease is likely just a coincidence and not caused by the adoption of SC and adherence to its code quality recommendations. In general, we find no evidence that applying a CCQA tool increases product quality.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122106580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy Efficient Guidelines for iOS Core Location Framework","authors":"A. A. Bangash, Daniil Tiganov, Karim Ali, Abram Hindle","doi":"10.26226/morressier.613b5418842293c031b5b5f0","DOIUrl":"https://doi.org/10.26226/morressier.613b5418842293c031b5b5f0","url":null,"abstract":"Several types of apps require accessing user location, including map navigation, food ordering, and fitness tracking apps. To access user location, app developers use frameworks that the underlying platform provides to them. For the iOS platform, the Core Location framework enables developers to configure various services to obtain user location information. But how does a particular configuration affect the energy consumption of an app? The available Core Location framework documentation is insufficient to help developers reason about the tradeoff between choosing a particular configuration and energy consumption. In this paper, we present a set of guidelines that will help developers make an energy-efficient design choice while configuring the Core Location framework for their app. To achieve that, we have created microbenchmark configurations of the various services that the Core Location framework provides. We have then run several test-scenarios on these configurations to extract their energy profiles. To extract energy-efficient guidelines for developers, we have carefully examined those energy profile results. The guidelines show several configurations that not only reduce energy consumption but also access locations more frequently than other configurations. To evaluate those guidelines, we analyzed three real-world apps and a location service sample app provided by Apple. Our results show that the guidelines help reduce energy: 0.42% for a property search app, 10.59% for a weather app, 26.91% for a location utility app, and 11.37% for Apple's sample app. Additionally, our empirical evaluation shows that choosing an energy-hungry configuration can increase the energy consumption by up to a maximum of 23.97%. Our guidelines are effective on 3 real-world apps, and our methodology may be used to extract energy-efficient guidelines for frameworks other than the Core Location framework.","PeriodicalId":205629,"journal":{"name":"2021 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132608150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}