{"title":"Code Privacy in Detection of Web Vulnerabilities","authors":"Jorge Martins, Ibéria Medeiros, Bernardo Ferreira","doi":"10.1145/3593434.3593483","DOIUrl":"https://doi.org/10.1145/3593434.3593483","url":null,"abstract":"We propose a solution combining source code static analysis with searchable symmetric encryption to detect input validation vulnerabilities of web applications in encrypted PHP code, allowing developers to protect their codebase from malicious third parties while simultaneously discovering vulnerabilities in it. Results show that our solution is capable of identifying vulnerabilities with precision similar to traditional, non-privacy-preserving static code analysers and exhibits a maximum overhead increase of around 16.55%.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"450 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122487528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fusion of deep convolutional and LSTM recurrent neural networks for automated detection of code smells","authors":"Anh Ho, Anh M. T. Bui, P. Nguyen, Amleto Di Salle","doi":"10.1145/3593434.3593476","DOIUrl":"https://doi.org/10.1145/3593434.3593476","url":null,"abstract":"Code smell is the term used to signal certain patterns or structures in software code that may contain a potential design or architecture problem, leading to maintainability or other software quality issues. Detecting code smells early in the software development process helps prevent these problems and improve the overall software quality. Existing research concentrates on the process of collecting and handling datasets, then exploring the potential of utilizing deep learning models to detect smells, while ignoring extensive feature engineering. Though these approaches obtained promising results, the following issues need to be tackled: (i) extracting both structural and semantic features from the software units; (ii) mitigating the effects of imbalanced data distribution on the performance. In this paper, we propose DeepSmells as a novel approach to code smell detection. To learn the complex hierarchical representations of the code fragment, we apply a deep convolutional neural network (CNN). Then, in order to improve the quality of the context encoding and preserve semantic information, a long short-term memory (LSTM) network is placed immediately after the CNN. The final classification is conducted by deep neural networks with a weighted loss function to reduce the impact of skewed data distribution. We performed an empirical study using the existing code smell benchmark datasets to assess the performance of our proposed approach, and compared it with state-of-the-art baselines. The results demonstrate the effectiveness of our proposed method for all kinds of code smells, outperforming the baselines in terms of F1 score and MCC.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127124907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Outside the Sandbox: A Study of Input/Output Methods in Java","authors":"Matúš Sulír, Sergej Chodarev, Milan Nosáľ","doi":"10.1145/3593434.3593501","DOIUrl":"https://doi.org/10.1145/3593434.3593501","url":null,"abstract":"Programming languages often demarcate the internal sandbox, consisting of entities such as objects and variables, from the outside world, e.g., files or network. Although communication with the external world poses fundamental challenges for live programming, reversible debugging, testing, and program analysis in general, studies about this phenomenon are rare. In this paper, we present a preliminary empirical study about the prevalence of input/output (I/O) method usage in Java. We manually categorized 1435 native methods in a Java Standard Edition distribution into non-I/O and I/O-related methods, which were further classified into areas such as desktop or file-related ones. According to the static analysis of a call graph for 798 projects, about 57% of methods potentially call I/O natives. The results of dynamic analysis on 16 benchmarks showed that 21% of the executed methods directly or indirectly called an I/O native. We conclude that neglecting I/O is not a viable option for tool designers and suggest the integration of I/O-related metadata with source code to facilitate their querying.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132294881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Data Analysis to Human Input: Navigating the Complexity of Software Evaluation and Assessment","authors":"Sigrid Eldh","doi":"10.1145/3593434.3596439","DOIUrl":"https://doi.org/10.1145/3593434.3596439","url":null,"abstract":"It is the time of trust and transformation in software. We want explainable AI to assist us in dialogue, write our programs, test our software, and improve how we communicate. It is the time of digitalization, but we must ask ourselves - on what data in what format, when do we collect it, and what is the source? Does “data” make sense? Every action can be automated, should eventually be automated, and as such should be traceable and explainable. The transformation of software, and how we can now train and give feedback quickly, enables us not only to utilize existing technologies but also to embrace new technologies faster. This transformation is much too slow even if things change at lightning speed. Change is the only thing we can be sure will happen. Evaluating and assessing quality of software sounds easy but is only as good as you design it to be. We often simplify the problem so we can move forward, but it is the complications that are the real issue – our context, our combination of tools, languages, hardware, history, and way of working. We simply need the labeling, the meta-data, the context – and this data in a form with “many” perspectives to draw the more “accurate” scientific picture. Having a multifaceted perspective is important when analyzing complex contexts. In software, listening skills and asking the right questions of the right people are often invaluable to complement blunt data. On the other side - much information is probably missing as you are too easily getting “only” what you asked for. So, we cannot judge what we cannot observe – and analyzing this data is another issue altogether. We need to know what is right – because if we cannot trust the source – or double check the outcome, how would we know it is not just “fake” data? What does the outlier really mean? Is it a sign of a new trend, or is it the first time we capture this odd event? Therefore, it is easy to lose perspective in a fast-changing world. Despite drowning in tools, we still miss a lot of them. The threshold of using a tool is high, as we cannot trust them, and we cannot be sure that the data these tools collect does represent what we want to investigate. Therefore, the role of the scientist is more important than ever. Trusting the scientific process, utilizing multiple methods, and combining them is the recipe! Another goal is doing our best to select topics and collaborators – as building better software (quality) for humanity. It starts with you and me. I hope I will in this context be able to touch upon areas like security, testing, automation, AI/ML, ethics and “human in the loop”, analysis, tools, and technical debt, with a focus on evaluations and assessments.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130995749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Are security commit messages informative? Not enough!","authors":"Sofia Reis, Rui Abreu, C. Pasareanu","doi":"10.1145/3593434.3593481","DOIUrl":"https://doi.org/10.1145/3593434.3593481","url":null,"abstract":"The fast distribution and deployment of security patches are important to protect users against cyberattacks. These fixes can be detected automatically by patch management triage systems. However, previous work has shown that automating the task is not easy, in some cases, because of poor documentation or lack of information in security fixes. For many years, standard practices in the security community have steered engineers to provide cryptic commit messages (i.e., patch software vulnerabilities silently) to avoid potential attacks and reputation damage. However, not providing enough documentation on vulnerability fixes can hinder trust between vendors and users. Current efforts in the security community aim to increase the level of transparency during patch and disclosing times to help build trust in the development community and make patch management processes faster. In this paper, we evaluate how informative security commit messages (i.e., messages attached to security fixes) are and how different levels of information can affect different tasks in automated patch triage systems. We observed that security engineers, in general, do not provide enough detail to enable the three automated triage systems at the same time. In addition, results show that security commit messages need to be more informative: 56.7% of the messages analyzed were documented poorly. Best practices to write informative and well-structured security commit messages (such as SECOM) should become a standard practice in the security community.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128410340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized Tokenization Process for Open-Vocabulary Code Completion: An Empirical Study","authors":"Yasir Hussain, Zhiqiu Huang, Yu Zhou, I. A. Khan, Nasrullah Khan, Muhammad Zahid Abbas","doi":"10.1145/3593434.3594236","DOIUrl":"https://doi.org/10.1145/3593434.3594236","url":null,"abstract":"Studies have substantiated the efficacy of deep learning-based models in various source code modeling tasks. These models are usually trained on large datasets that are divided into smaller units, known as tokens, utilizing either an open or closed vocabulary system. The selection of a tokenization method can have a profound impact on the number of tokens generated, which in turn can significantly influence the performance of the model. This study investigates the effect of different tokenization methods on source code modeling and proposes an optimized tokenizer to enhance the tokenization performance. The proposed tokenizer employs a hybrid approach that initializes with a global vocabulary based on the most frequent unigrams and incrementally builds an open-vocabulary system. The proposed tokenizer is evaluated against popular tokenization methods such as Closed, Unigram, WordPiece, and BPE tokenizers, as well as tokenizers provided by large pre-trained models such as PolyCoder and CodeGen. The results indicate that the choice of tokenization method can significantly impact the number of sub-tokens generated, which can ultimately influence the modeling performance of a model. Furthermore, our empirical evaluation demonstrates that the proposed tokenizer outperforms other baselines, achieving improved tokenization performance both in terms of a reduced number of sub-tokens and time cost. In conclusion, this study highlights the significance of the choice of tokenization method in source code modeling and the potential for improvement through optimized tokenization techniques.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131806734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ethical Requirements Stack: A framework for implementing ethical requirements of AI in software engineering practices","authors":"M. Agbese, Rahul Mohanani, A. Khan, P. Abrahamsson","doi":"10.1145/3593434.3593489","DOIUrl":"https://doi.org/10.1145/3593434.3593489","url":null,"abstract":"ACM Reference Format: Mamia Agbese, Rahul Mohanani, Arif Ali Khan, and Pekka Abrahamsson. 2023. Ethical Requirements Stack: A framework for implementing ethical requirements of AI in software engineering practices. In Proceedings of the International Conference on Evaluation and Assessment in Software Engineering (EASE ’23), June 14–16, 2023, Oulu, Finland. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3593434.3593489","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127981168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effective Agile Contracts Framework for Software Innovation Projects","authors":"Adriano Gomes","doi":"10.1145/3593434.3593473","DOIUrl":"https://doi.org/10.1145/3593434.3593473","url":null,"abstract":"This research explores the challenges in agile contract modeling for software innovation projects, particularly for outsourced projects. Literature has presented various methods and frameworks for agile contract management, but there is still a gap in effectively establishing the best contract approach for each project based on specific conditions. This work aims to contribute a framework definition that effectively applies practical approaches for contract deployment suitable for software innovation projects, considering the best contractual practices related to each project's specific context. The study will conduct action research at CESAR, a prominent Brazilian Institute of Science and Technology (ICT) with 1200 employees and a 26-year history, to establish effective agile contract models and their implementation that better support agile management and project success. The study hopes to contribute to understanding the relationship between the type of contract and project outcomes and to provide better agile contract implementation for software innovation projects developed by outsourced companies.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129079132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a User-centred Security Framework for Social Robots in Public Spaces","authors":"S. O. Oruma","doi":"10.1145/3593434.3593446","DOIUrl":"https://doi.org/10.1145/3593434.3593446","url":null,"abstract":"The use of social robots in public spaces is becoming increasingly popular due to their ability to provide personalized services to users. However, the convergence of different technologies and software applications has raised concerns regarding security requirements, standards, and regulations. Specifically, there are significant concerns about the evolving threat landscape for software applications in public settings, where social robots interact without supervision and are in direct contact with threat actors. During the development of social robot software, developers and practitioners need practical tools to continuously assess their products’ security profiles. This paper presents a preventive approach to the dynamically evolving security landscape of Social Robots in Public Spaces (SRPS) using design science research (DSR) methodology to develop a security framework. The study investigates security threats, vulnerabilities, and risks associated with SRPS software development and analyzes existing related frameworks to design a security framework for SRPS software developers. The research aims to provide insights into the security aspects of SRPS software application development processes and contribute to developing effective security frameworks to mitigate evolving risks and ensure secure operation and acceptance in public spaces.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"417 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116705441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gamification of Business Process Modeling Notation education: an experience report","authors":"Giacomo Garaccione, Riccardo Coppola, Luca Ardito, Marco Torchiano","doi":"10.1145/3593434.3593956","DOIUrl":"https://doi.org/10.1145/3593434.3593956","url":null,"abstract":"Business Process Modeling (BPM) is a skill considered fundamental for computer engineers, with Business Process Modeling Notation (BPMN) being one of the most commonly used notations for this discipline. BPMN modeling is present in different curricula in specific Master’s Degree courses related to software engineering, but, in practice, students often underperform on BPMN modeling exercises due to difficulties in learning good modeling practices. In recent years, more and more fields of computer science have employed gamification (the usage of game elements in non-recreational contexts to gain benefits in terms of interest, participation, motivation, and enjoyment) with positive results during both development and teaching processes. Thus, we have developed a platform for BPMN modeling that employs gamification mechanics to facilitate learning good modeling practices with mechanisms such as rewarding good modeling solutions and penalizing less correct ones, with a dedicated feedback mechanism that maps correctly modeled elements to the corresponding concept. A preliminary laboratory experiment has been conducted with students of an Information Systems course to evaluate how students receive the mechanics and if there may be benefits in using a gamified environment for teaching process modeling throughout an entire course.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116712919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}