Tosin Daniel Oyetoyan, Jean-Rémy Falleri, Jens Dietrich, Kamil Jezek
{"title":"Circular dependencies and change-proneness: An empirical study","authors":"Tosin Daniel Oyetoyan, Jean-Rémy Falleri, Jens Dietrich, Kamil Jezek","doi":"10.1109/SANER.2015.7081834","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081834","url":null,"abstract":"Advice that circular dependencies between programming artefacts should be avoided goes back to the earliest work on software design, and is well-established and rarely questioned. However, empirical studies have shown that real-world (Java) programs are riddled with circular dependencies between artefacts on different levels of abstraction and aggregation. It has been suggested that additional heuristics could be used to distinguish between bad and harmless cycles, for instances by relating them to the hierarchical structure of the packages within a program, or to violations of additional design principles. In this study, we try to explore this question further by analysing the relationship between different kinds of circular dependencies between Java classes, and their change frequency. We find that (1) the presence of cycles can have a significant impact on the change proneness of the classes near these cycles and (2) neither subtype knowledge nor the location of the cycle within the package containment tree are suitable criteria to distinguish between critical and harmless cycles.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132008988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Knodel, Matthias Naab, Eric Bouwers, Joost Visser
{"title":"Software risk management in practice: Shed light on your software product","authors":"J. Knodel, Matthias Naab, Eric Bouwers, Joost Visser","doi":"10.1109/SANER.2015.7081884","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081884","url":null,"abstract":"You can't control what you can't measure. And you can't decide if you are wandering around in the dark. Risk management in practice requires shedding light on the internals of the software product in order to make informed decisions. Thus, in practice, risk management has to be based on information about artifacts (documentation, code, and executables) in order to detect (potentially) critical issues. This tutorial presents experiences from industrial cases worldwide on qualitative and quantitative measurement of software products. We present our lessons learned as well as consolidated experiences from practice and provide a classification scheme of applicable measurement techniques. Participants of the tutorial receive an introduction to the techniques in theory and then apply them in practice in interactive exercises. This enables participants to learn how to shed light on the internals of their software and how to make risk management decisions efficiently and effectively.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127227519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding software performance regressions using differential flame graphs","authors":"C. Bezemer, J. Pouwelse, B. Gregg","doi":"10.1109/SANER.2015.7081872","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081872","url":null,"abstract":"Flame graphs are gaining rapidly in popularity in industry to visualize performance profiles collected by stack-trace based profilers. In some cases, for example, during performance regression detection, profiles of different software versions have to be compared. Doing this manually using two or more flame graphs or textual profiles is tedious and error-prone. In this `Early Research Achievements'-track paper, we present our preliminary results on using differential flame graphs instead. Differential flame graphs visualize the differences between two performance profiles. In addition, we discuss which research fields we expect to benefit from using differential flame graphs. We have implemented our approach in an open source prototype called FLAMEGRAPHDIFF, which is available on GitHub. FLAMEGRAPHDIFF makes it easy to generate interactive differential flame graphs from two existing performance profiles. These graphs facilitate easy tracing of elements in the different graphs to ease the understanding of the (d)evolution of the performance of an application.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121591034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Code review: Veni, ViDI, vici","authors":"Y. Tymchuk, Andrea Mocci, Michele Lanza","doi":"10.1109/SANER.2015.7081825","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081825","url":null,"abstract":"Modern software development sees code review as a crucial part of the process, because not only does it facilitate the sharing of knowledge about the system at hand, but it may also lead to the early detection of defects, ultimately improving the quality of the produced software. Although supported by numerous approaches and tools, code review is still in its infancy, and indeed researchers have pointed out a number of shortcomings in the state of the art. We present a critical analysis of the state of the art of code review tools and techniques, extracting a set of desired features that code review tools should possess. We then present our vision and initial implementation of a novel code review approach named Visual Design Inspection (ViDI), illustrated through a set of usage scenarios. ViDI is based on a combination of visualization techniques, design heuristics, and static code analysis techniques.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128337302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling the evolution of development topics using Dynamic Topic Models","authors":"Jiajun Hu, Xiaobing Sun, D. Lo, Bin Li","doi":"10.1109/SANER.2015.7081810","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081810","url":null,"abstract":"As the development of a software project progresses, its complexity grows accordingly, making it difficult to understand and maintain. During software maintenance and evolution, software developers and stakeholders constantly shift their focus between different tasks and topics. They need to investigate into software repositories (e.g., revision control systems) to know what tasks have recently been worked on and how much effort has been devoted to them. For example, if an important new feature request is received, an amount of work that developers perform on ought to be relevant to the addition of the incoming feature. If this does not happen, project managers might wonder what kind of work developers are currently working on. Several topic analysis tools based on Latent Dirichlet Allocation (LDA) have been proposed to analyze information stored in software repositories to model software evolution, thus helping software stakeholders to be aware of the focus of development efforts at various time during software evolution. Previous LDA-based topic analysis tools can capture either changes on the strengths of various development topics over time (i.e., strength evolution) or changes in the content of existing topics over time (i.e., content evolution). Unfortunately, none of the existing techniques can capture both strength and content evolution. In this paper, we use Dynamic Topic Models (DTM) to analyze commit messages within a project's lifetime to capture both strength and content evolution simultaneously. We evaluate our approach by conducting a case study on commit messages of two well-known open source software systems, jEdit and PostgreSQL. The results show that our approach could capture not only how the strengths of various development topics change over time, but also how the content of each topic (i.e., words that form the topic) changes over time. Compared with existing topic analysis approaches, our approach can provide a more complete and valuable view of software evolution to help developers better understand the evolution of their projects.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114008772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mircea Cadariu, Eric Bouwers, Joost Visser, A. Deursen
{"title":"Tracking known security vulnerabilities in proprietary software systems","authors":"Mircea Cadariu, Eric Bouwers, Joost Visser, A. Deursen","doi":"10.1109/SANER.2015.7081868","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081868","url":null,"abstract":"Known security vulnerabilities can be introduced in software systems as a result of being dependent upon third-party components. These documented software weaknesses are “hiding in plain sight” and represent low hanging fruit for attackers. In this paper we present the Vulnerability Alert Service (VAS), a tool-based process to track known vulnerabilities in software systems throughout their life cycle. We studied its usefulness in the context of external software product quality monitoring provided by the Software Improvement Group, a software advisory company based in Amsterdam, the Netherlands. Besides empirically assessing the usefulness of the VAS, we have also leveraged it to gain insight and report on the prevalence of third-party components with known security vulnerabilities in proprietary applications.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123837176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jordi Cabot, Javier Luis Cánovas Izquierdo, Valerio Cosentino, Belen Rolandi
{"title":"Exploring the use of labels to categorize issues in Open-Source Software projects","authors":"Jordi Cabot, Javier Luis Cánovas Izquierdo, Valerio Cosentino, Belen Rolandi","doi":"10.1109/SANER.2015.7081875","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081875","url":null,"abstract":"Reporting bugs, asking for new features and in general giving any kind of feedback is a common way to contribute to an Open-Source Software (OSS) project. This feedback is generally reported in the form of new issues for the project, managed by the so-called issue-trackers. One of the features provided by most issue-trackers is the possibility to define a set of labels/tags to classify the issues and, at least in theory, facilitate their management. Nevertheless, there is little empirical evidence to confirm that taking the time to categorize new issues has indeed a beneficial impact on the project evolution. In this paper we analyze a population of more than three million of GitHub projects and give some insights on how labels are used in them. Our preliminary results reveal that, even if the label mechanism is scarcely used, using labels favors the resolution of issues. Our analysis also suggests that not all projects use labels in the same way (e.g., for some labels are only a way to prioritize the project while others use them to signal their temporal evolution as they move along in the development workflow). Further research is needed to precisely characterize these label “families” and learn more the ideal application scenarios for each of them.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114397503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NIRMAL: Automatic identification of software relevant tweets leveraging language model","authors":"Abhishek Sharma, Yuan Tian, D. Lo","doi":"10.1109/SANER.2015.7081855","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081855","url":null,"abstract":"Twitter is one of the most widely used social media platforms today. It enables users to share and view short 140-character messages called “tweets”. About 284 million active users generate close to 500 million tweets per day. Such rapid generation of user generated content in large magnitudes results in the problem of information overload. Users who are interested in information related to a particular domain have limited means to filter out irrelevant tweets and tend to get lost in the huge amount of data they encounter. A recent study by Singer et al. found that software developers use Twitter to stay aware of industry trends, to learn from others, and to network with other developers. However, Singer et al. also reported that developers often find Twitter streams to contain too much noise which is a barrier to the adoption of Twitter. In this paper, to help developers cope with noise, we propose a novel approach named NIRMAL, which automatically identifies software relevant tweets from a collection or stream of tweets. Our approach is based on language modeling which learns a statistical model based on a training corpus (i.e., set of documents). We make use of a subset of posts from StackOverflow, a programming question and answer site, as a training corpus to learn a language model. A corpus of tweets was then used to test the effectiveness of the trained language model. The tweets were sorted based on the rank the model assigned to each of the individual tweets. The top 200 tweets were then manually analyzed to verify whether they are software related or not, and then an accuracy score was calculated. The results show that decent accuracy scores can be achieved by various variants of NIRMAL, which indicates that NIRMAL can effectively identify software related tweets from a huge corpus of tweets.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125761128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond support and confidence: Exploring interestingness measures for rule-based specification mining","authors":"Tien-Duy B. Le, D. Lo","doi":"10.1109/SANER.2015.7081843","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081843","url":null,"abstract":"Numerous rule-based specification mining approaches have been proposed in the literature. Many of these approaches analyze a set of execution traces to discover interesting usage rules, e.g., whenever lock() is invoked, eventually unlock() is invoked. These techniques often generate and enumerate a set of candidate rules and compute some interestingness scores. Rules whose interestingness scores are above a certain threshold would then be output. In past studies, two measures, namely support and confidence, which are well-known measures, are often used to compute these scores. However, aside from these two, many other interestingness measures have been proposed. It is thus unclear if support and confidence are the best interestingness measures for specification mining. In this work, we perform an empirical study that investigates the utility of 38 interestingness measures in recovering correct specifications of classes from Java libraries. We used a ground truth dataset consisting of 683 rules and recorded execution traces that are produced when we run the DaCapo test suite. We apply 38 different interestingness measures to identify correct rules from a pool of candidate rules. Our study highlights that many measures are on par to support and confidence. Some of the measures are even better than support or confidence and at least one of the measures is statistically significantly better than the two measures. We also find that compositions of several measures with support statistically significantly outperform the composition of support and confidence. Our findings highlight the need to look beyond standard support and confidence to find interesting rules.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117188997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative study on the effectiveness of part-of-speech tagging techniques on bug reports","authors":"Yuan Tian, D. Lo","doi":"10.1109/SANER.2015.7081879","DOIUrl":"https://doi.org/10.1109/SANER.2015.7081879","url":null,"abstract":"Many software artifacts are written in natural language or contain substantial amount of natural language contents. Thus these artifacts could be analyzed using text analysis techniques from the natural language processing (NLP) community, e.g., the part-of-speech (POS) tagging technique that assigns POS tags (e.g., verb, noun, etc.) to words in a sentence. In the literature, several studies have already applied POS tagging technique on software artifacts to recover important words in them, which are then used for automating various tasks, e.g., locating buggy files for a given bug report, etc. There are many POS tagging techniques proposed and they are trained and evaluated on non software engineering corpus (documents). Thus it is unknown whether they can correctly identify the POS of a word in a software artifact and which of them performs the best. To fill this gap, in this work, we investigate the effectiveness of seven POS taggers on bug reports. We randomly sample 100 bug reports from Eclipse and Mozilla project and create a text corpus that contains 21,713 words. We manually assign POS tags to these words and use them to evaluate the studied POS taggers. Our comparative study shows that the state-of-the-art POS taggers achieve an accuracy of 83.6%-90.5% on bug reports and the Stanford POS tagger and the TreeTagger achieve the highest accuracy on the sampled bug reports. Our findings show that researchers could use these POS taggers to analyze software artifacts, if an accuracy of 80-90% is acceptable for their specific needs, and we recommend using the Stanford POS tagger or the TreeTagger.","PeriodicalId":355949,"journal":{"name":"2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129258548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}