{"title":"Naturalness and Artifice of Code: Exploiting the Bi-Modality","authors":"Prem Devanbu","doi":"10.1145/3511430.3511915","DOIUrl":"https://doi.org/10.1145/3511430.3511915","url":null,"abstract":"While natural languages are rich in vocabulary and grammatical flexibility, most human utterances are mundane and repetitive. This repetitiveness in natural language has led to great advances in statistical NLP methods. In our lab, we discovered (almost a decade ago) that, despite the considerable power and flexibility of programming languages, large software corpora are actually even more repetitive than NL corpora. We also showed that this “naturalness” of code could be captured in language models, and exploited within software tools. This line of work has prospered, and been turbo-charged by the tremendous capacity and design flexibility of deep learning models. Numerous other creative and interesting applications of naturalness have ensued, from colleagues around the world, and several industrial applications have emerged. Recently, we have been studying the consequences and opportunities arising from the observation that software is bimodal: it’s written not only to be run on machines, but also read by humans; this makes software amenable to both algorithmic analysis, and statistical prediction. Bimodality allows new ways of training machine learning models, new ways of designing analysis algorithms, and new ways to understand the practice of programming. 
In this talk, I will begin with a backgrounder on “Naturalness” studies, and the promise of bimodality.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114901368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automating Software Engineering with Machine Learning","authors":"Aditya Kanade","doi":"10.1145/3511430.3511432","DOIUrl":"https://doi.org/10.1145/3511430.3511432","url":null,"abstract":"Software plays a crucial role in our everyday lives. The scarcity of skilled software engineers has become a bottleneck in delivering better software at scale. Can we automate software engineering to help improve developer productivity and software quality? Can we take advantage of massive codebases to learn about building correct and scalable software? In this talk, I will present some recent advances in automated software engineering using machine learning. Along the way, I will relate the data-driven techniques to traditional, algorithmic program analysis techniques. I will discuss representative deep learning methods to analyze and synthesize source code. Even though we are witnessing exciting new advances in machine learning for software engineering, we shall reflect on what challenges remain and the way forward.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116151386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying Toggles-smells and Investigating Development Effort","authors":"Harika Sugnanam, Md Tajmilur Rahman","doi":"10.1145/3511430.3511461","DOIUrl":"https://doi.org/10.1145/3511430.3511461","url":null,"abstract":"Companies are moving towards rapid release to deliver features as quickly as possible using feature toggles. A feature toggle is a variable that controls the state of a feature, allowing unfinished code into the trunk. However, maintaining feature toggles takes great effort; otherwise, they may lead to technical debt. Toggles may turn into code smells since they can be used in many ways if there is no standard of usage. We call such standard-less use of feature toggles a “Toggle Smell”. We classify different uses of toggle smells, and then we measure how much effort the code files are consuming to develop features and maintain the toggles in each component. Our quantitative analysis of the Chromium open-source project finds that there are 3.1K toggles in 38 components and, of the six different types of toggle usage, we classify three different toggle smells. The other types of usage will be analyzed in future work. Three classification models predict the development effort in files as “High”, “Medium”, and “Low” with a similar accuracy of 95.x%.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129212335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Report on Tutorials and Tech-Briefings co-located with ISEC 2022","authors":"M. Nagappan, Pavan Kumar Chittimalli","doi":"10.1145/3511430.3511464","DOIUrl":"https://doi.org/10.1145/3511430.3511464","url":null,"abstract":"This is a short report on the Tutorials and Tech Briefings session of the 15th Innovations in Software Engineering (ISEC 2022) conference held on 24-26th February 2022 in DA-IICT Gandhinagar, India. The tutorials and tech briefings at ISEC have been popular with the participants because they offer a gentle and friendly introduction to cutting edge topics and research at the frontiers of the discipline of software engineering. This year seven submissions were selected (2 Tech Briefings + 4 Tutorials) for presentation to reflect the current interests and directions of the field of software engineering.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127912251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Handling Memory-Intensive Operations in Symbolic Execution","authors":"Luca Borzacchiello, Emilio Coppa, C. Demetrescu","doi":"10.1145/3511430.3511453","DOIUrl":"https://doi.org/10.1145/3511430.3511453","url":null,"abstract":"Symbolic execution is a popular software testing technique that can help developers identify complex bugs in real-world applications. Unfortunately, symbolic execution may struggle to analyze programs containing memory-intensive operations, such as memcpy and memset, whenever these operations are carried out over memory blocks whose size or address is symbolic, i.e., input-dependent. In this paper, we devise MInt, a memory model for symbolic execution that can support reasoning over such operations. The key new idea behind our proposal is to make the memory model aware of these memory-intensive operations, deferring any symbolic reasoning on their effects to the time when the program actually manipulates the symbolic data affected by these operations. We show that a preliminary implementation of MInt based on the symbolic framework angr can effectively analyze applications taken from the DARPA Cyber Grand Challenge.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"158 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133830814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supporting Readability by Comprehending the Hierarchical Abstraction of a Software Project","authors":"Avijit Bhattacharjee, B. Roy, Kevin A. Schneider","doi":"10.1145/3511430.3511441","DOIUrl":"https://doi.org/10.1145/3511430.3511441","url":null,"abstract":"Exploring the source code of a software system is a prevailing task that is frequently done by contributors to a system. Practitioners often use call graphs to aid in understanding the source code of an inadequately documented software system. Call graphs, when visualized, show caller and callee relationships between functions. A static call graph provides an overall structure of a software system, and dynamic call graphs generated from dynamic execution logs can be used to trace program behaviour for a particular scenario. Unfortunately, a call graph of an entire system can be very complicated and hard to understand. Hierarchically abstracting a call graph can be used to summarize an entire system’s structure and make function calls easier to comprehend. In this work, we mine concepts from source code entities (functions) to generate a concept cluster tree with improved naming of cluster nodes to complement existing studies and facilitate more effective program comprehension for developers. We apply three different information retrieval techniques (TFIDF, LDA, and LSI) on function names and function name variants to label the nodes of a concept cluster tree generated by clustering execution paths. From our experiment comparing automatic labeling with manual labeling by participants for 12 use cases, we found that, among the techniques, TFIDF performs better on average, with 64% matching. LDA and LSI had 37% and 23% matching, respectively. 
In addition, using the words in function name variants performed at least 5% better in participant ratings for all three techniques on average for the use cases.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128115492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Report on the Fifth Workshop on Software Engineering Education (SEED 2022)","authors":"Kuldeep Kumar, Bharti Suri, Bimlesh Wadhwa","doi":"10.1145/3511430.3511467","DOIUrl":"https://doi.org/10.1145/3511430.3511467","url":null,"abstract":"The 5th International Workshop on Software Engineering Education (SEED 2022), co-located with the 15th Innovations in Software Engineering Conference (ISEC 2022), aims to provide a unique forum to bring together researchers, educators, students, and practitioners to report on their experiences and their ongoing efforts in meeting the recent demands of remote teaching and learning in Software Engineering. The theme of SEED 2022 is Software Engineering Education amid a global pandemic - How can software engineering teaching meet the challenge of the sudden shift to online education triggered by the COVID-19 pandemic? Strategies for project-based learning, hybrid learning, blended learning, use of tools in teaching and learning are specifically targeted in this workshop. Further, it aims to provide a unique opportunity to Software Engineering educators and practitioners to come together and build collaborations for Software Engineering education research and practice.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129652016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying and Extracting Hierarchical Information from Business PDF Documents","authors":"Rohit Shere, Pavan Kumar Chittimalli, Ravindra Naik","doi":"10.1145/3511430.3511440","DOIUrl":"https://doi.org/10.1145/3511430.3511440","url":null,"abstract":"Portable Document Format (PDF) is a popular choice for secure communication and persistence of business information and is a universally accepted format among businesses choosing to become digital. PDF provides multiple ways to make information visually appealing and readable, along with device-independent rendering. To achieve this, PDF stores metadata with individual text characters, graphic components and other layout elements. Such atomic, component-wise metadata makes machine processing of information in the PDF format very challenging; the challenge is compounded by the difficulty of stitching together the original semantics from the componentized information. We propose a generic approach for extracting the hierarchy of the document structure while separating the content from header and footer, and extracting metadata associated with checkboxes to annotate the business information contained in PDF for tasks like mining specifications and rules from the document. 
Our prototype is able to process real-life, large PDF documents each running into roughly 400 pages, with nearly 95% of the extraction requiring no human intervention.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126777243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Report on the First Workshop on Knowledge Guided AI-Native Adaptive Enterprise","authors":"V. Kulkarni","doi":"10.1145/3511430.3511465","DOIUrl":"https://doi.org/10.1145/3511430.3511465","url":null,"abstract":"The future enterprise will be a hyperconnected ecosystem that needs to deliver stated goals in a dynamic, uncertain environment where the changes (even in goals) cannot be deduced upfront. Enterprise software is expected to play a principal role in its growth story. This calls for the enterprise and its software to be capable of continuous adaptation in the face of uncertainty. Will innovative integration of proven ideas from modelling and simulation, AI, control theory and automated software engineering lead to a pragmatic solution?","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126977947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Configurable Limitless Paths in Virtual Reality Environments","authors":"Raghav Mittal, Sai Anirudh Karre, Y. P. Gururaj, Y. R. Reddy","doi":"10.1145/3511430.3511452","DOIUrl":"https://doi.org/10.1145/3511430.3511452","url":null,"abstract":"Locomotion in a virtual environment within a limited physical space is a complex activity. There exist established techniques to support limitless natural walking in virtual environments. These include redirected walking, dynamic path generation, and walk-in-place techniques. PragPal is one such limitless path generation technique that supports natural walking in virtual environments. It is a novel software-based non-haptic locomotion technique. In this paper, we detail the enhancements to the existing PragPal path generation technique that address underlying issues in the technique, such as (1) path collision at angular turns, (2) effective usage of the physical play area, and (3) the ability to set path-width during path turns.","PeriodicalId":138760,"journal":{"name":"15th Innovations in Software Engineering Conference","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122380433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}