2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER): Latest Papers

Circular dependencies and change-proneness: An empirical study

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-02 DOI: 10.1109/SANER.2015.7081834

Tosin Daniel Oyetoyan, Jean-Rémy Falleri, Jens Dietrich, Kamil Jezek

Abstract: Advice that circular dependencies between programming artefacts should be avoided goes back to the earliest work on software design, and is well-established and rarely questioned. However, empirical studies have shown that real-world (Java) programs are riddled with circular dependencies between artefacts on different levels of abstraction and aggregation. It has been suggested that additional heuristics could be used to distinguish between bad and harmless cycles, for instance by relating them to the hierarchical structure of the packages within a program, or to violations of additional design principles. In this study, we explore this question further by analysing the relationship between different kinds of circular dependencies between Java classes and their change frequency. We find that (1) the presence of cycles can have a significant impact on the change proneness of the classes near these cycles and (2) neither subtype knowledge nor the location of the cycle within the package containment tree is a suitable criterion to distinguish between critical and harmless cycles.
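To make the notion of a circular dependency between classes concrete, the following is a minimal sketch that flags classes taking part in a dependency cycle by computing strongly connected components with Tarjan's algorithm. It is not the tooling used in the study: the graph representation, the function names, and the example class names are all assumptions made for illustration, and it does not measure change-proneness.

```python
from typing import Dict, List, Set


def strongly_connected_components(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Return the strongly connected components of a directed graph (Tarjan).

    `graph` maps a class name to the set of classes it depends on.
    The recursive form is used for brevity; very deep graphs would need
    an iterative variant to avoid hitting Python's recursion limit.
    """
    index_of: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    on_stack: Set[str] = set()
    stack: List[str] = []
    components: List[List[str]] = []
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        index_of[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index_of:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index_of[succ])
        # `node` is the root of a component: pop the component off the stack.
        if lowlink[node] == index_of[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(sorted(component))

    for node in sorted(graph):
        if node not in index_of:
            visit(node)
    return components


def classes_in_cycles(graph: Dict[str, Set[str]]) -> Set[str]:
    """Classes that belong to at least one dependency cycle."""
    cyclic: Set[str] = set()
    for component in strongly_connected_components(graph):
        is_self_loop = component[0] in graph.get(component[0], ())
        if len(component) > 1 or is_self_loop:
            cyclic.update(component)
    return cyclic


# Hypothetical class-level dependency graph: A -> B -> C -> A forms a cycle.
example_deps = {"A": {"B"}, "B": {"C"}, "C": {"A"}, "D": {"A"}, "E": set()}
cyclic_classes = classes_in_cycles(example_deps)
```

In this example `D` depends on a cycle without being part of it, which is the kind of class "near" a cycle that the abstract refers to; identifying such neighbours would require an additional reachability step not shown here.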

Citations: 19

Software risk management in practice: Shed light on your software product

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-02 DOI: 10.1109/SANER.2015.7081884

J. Knodel, Matthias Naab, Eric Bouwers, Joost Visser

Citations: 1

Understanding software performance regressions using differential flame graphs

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-02 DOI: 10.1109/SANER.2015.7081872

C. Bezemer, J. Pouwelse, B. Gregg

Citations: 31

Code review: Veni, ViDI, vici

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-01 DOI: 10.1109/SANER.2015.7081825

Y. Tymchuk, Andrea Mocci, Michele Lanza

Citations: 15

Modeling the evolution of development topics using Dynamic Topic Models

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-01 DOI: 10.1109/SANER.2015.7081810

Jiajun Hu, Xiaobing Sun, D. Lo, Bin Li

Abstract: As the development of a software project progresses, its complexity grows accordingly, making it difficult to understand and maintain. During software maintenance and evolution, software developers and stakeholders constantly shift their focus between different tasks and topics. They need to investigate software repositories (e.g., revision control systems) to know what tasks have recently been worked on and how much effort has been devoted to them. For example, if an important new feature request is received, some amount of the work that developers perform ought to be relevant to the addition of the incoming feature. If this does not happen, project managers might wonder what kind of work developers are currently doing. Several topic analysis tools based on Latent Dirichlet Allocation (LDA) have been proposed to analyze information stored in software repositories to model software evolution, thus helping software stakeholders to be aware of the focus of development efforts at various times during software evolution. Previous LDA-based topic analysis tools can capture either changes in the strengths of various development topics over time (i.e., strength evolution) or changes in the content of existing topics over time (i.e., content evolution). Unfortunately, none of the existing techniques can capture both strength and content evolution. In this paper, we use Dynamic Topic Models (DTM) to analyze commit messages within a project's lifetime to capture both strength and content evolution simultaneously. We evaluate our approach by conducting a case study on commit messages of two well-known open source software systems, jEdit and PostgreSQL. The results show that our approach could capture not only how the strengths of various development topics change over time, but also how the content of each topic (i.e., words that form the topic) changes over time. Compared with existing topic analysis approaches, our approach can provide a more complete and valuable view of software evolution to help developers better understand the evolution of their projects.

Citations: 44

Tracking known security vulnerabilities in proprietary software systems

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-01 DOI: 10.1109/SANER.2015.7081868

Mircea Cadariu, Eric Bouwers, Joost Visser, A. Deursen

Citations: 61

Exploring the use of labels to categorize issues in Open-Source Software projects

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-01 DOI: 10.1109/SANER.2015.7081875

Jordi Cabot, Javier Luis Cánovas Izquierdo, Valerio Cosentino, Belen Rolandi

Abstract: Reporting bugs, asking for new features and in general giving any kind of feedback is a common way to contribute to an Open-Source Software (OSS) project. This feedback is generally reported in the form of new issues for the project, managed by the so-called issue-trackers. One of the features provided by most issue-trackers is the possibility to define a set of labels/tags to classify the issues and, at least in theory, facilitate their management. Nevertheless, there is little empirical evidence to confirm that taking the time to categorize new issues indeed has a beneficial impact on the project evolution. In this paper we analyze a population of more than three million GitHub projects and give some insights on how labels are used in them. Our preliminary results reveal that, even if the label mechanism is scarcely used, using labels favors the resolution of issues. Our analysis also suggests that not all projects use labels in the same way (e.g., for some, labels are only a way to prioritize the project, while others use them to signal their temporal evolution as they move along in the development workflow). Further research is needed to precisely characterize these label "families" and learn more about the ideal application scenarios for each of them.

Citations: 62

NIRMAL: Automatic identification of software relevant tweets leveraging language model

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-01 DOI: 10.1109/SANER.2015.7081855

Abhishek Sharma, Yuan Tian, D. Lo

Abstract: Twitter is one of the most widely used social media platforms today. It enables users to share and view short 140-character messages called "tweets". About 284 million active users generate close to 500 million tweets per day. Such rapid generation of user-generated content in large magnitudes results in the problem of information overload. Users who are interested in information related to a particular domain have limited means to filter out irrelevant tweets and tend to get lost in the huge amount of data they encounter. A recent study by Singer et al. found that software developers use Twitter to stay aware of industry trends, to learn from others, and to network with other developers. However, Singer et al. also reported that developers often find Twitter streams to contain too much noise, which is a barrier to the adoption of Twitter. In this paper, to help developers cope with noise, we propose a novel approach named NIRMAL, which automatically identifies software-relevant tweets from a collection or stream of tweets. Our approach is based on language modeling, which learns a statistical model based on a training corpus (i.e., set of documents). We make use of a subset of posts from StackOverflow, a programming question and answer site, as a training corpus to learn a language model. A corpus of tweets was then used to test the effectiveness of the trained language model. The tweets were sorted based on the rank the model assigned to each of the individual tweets. The top 200 tweets were then manually analyzed to verify whether they are software related or not, and then an accuracy score was calculated. The results show that decent accuracy scores can be achieved by various variants of NIRMAL, which indicates that NIRMAL can effectively identify software-related tweets from a huge corpus of tweets.
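The train-then-rank pipeline described in the abstract can be illustrated with a small sketch. The abstract does not specify the model order, smoothing, or scoring function, so everything below is an assumption: a unigram model with add-one smoothing, whitespace tokenization, and ranking by perplexity (lower means closer to the training corpus). The training sentences and tweets are invented examples, not data from the paper.

```python
import math
from collections import Counter
from typing import List, Tuple


def train_unigram(corpus: List[str]) -> Tuple[Counter, int]:
    """Count lowercase whitespace tokens over all training documents."""
    counts = Counter(tok for doc in corpus for tok in doc.lower().split())
    return counts, sum(counts.values())


def perplexity(text: str, counts: Counter, total: int) -> float:
    """Per-token perplexity of `text` under an add-one smoothed unigram model."""
    tokens = text.lower().split()
    if not tokens:
        return float("inf")
    vocab_size = len(counts) + 1  # one extra slot for unseen tokens
    log_prob = sum(
        math.log((counts[tok] + 1) / (total + vocab_size)) for tok in tokens
    )
    return math.exp(-log_prob / len(tokens))


def rank_tweets(tweets: List[str], counts: Counter, total: int) -> List[str]:
    """Sort tweets so that those most similar to the training corpus come first."""
    return sorted(tweets, key=lambda tweet: perplexity(tweet, counts, total))


# Hypothetical stand-in for a sample of programming Q&A posts.
training_posts = [
    "how to fix null pointer exception in java",
    "python list index out of range error",
    "git merge conflict in java project",
]
model_counts, model_total = train_unigram(training_posts)

# Hypothetical tweets to rank.
sample_tweets = [
    "great coffee this morning",
    "java null pointer exception again",
]
ranked = rank_tweets(sample_tweets, model_counts, model_total)
```

Taking the top-k entries of `ranked` and labelling them by hand would correspond to the manual evaluation step mentioned in the abstract.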

Citations: 32

Beyond support and confidence: Exploring interestingness measures for rule-based specification mining

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-01 DOI: 10.1109/SANER.2015.7081843

Tien-Duy B. Le, D. Lo

Abstract: Numerous rule-based specification mining approaches have been proposed in the literature. Many of these approaches analyze a set of execution traces to discover interesting usage rules, e.g., whenever lock() is invoked, eventually unlock() is invoked. These techniques often generate and enumerate a set of candidate rules and compute some interestingness scores. Rules whose interestingness scores are above a certain threshold would then be output. In past studies, two well-known measures, namely support and confidence, are often used to compute these scores. However, aside from these two, many other interestingness measures have been proposed. It is thus unclear if support and confidence are the best interestingness measures for specification mining. In this work, we perform an empirical study that investigates the utility of 38 interestingness measures in recovering correct specifications of classes from Java libraries. We used a ground truth dataset consisting of 683 rules and recorded execution traces that are produced when we run the DaCapo test suite. We apply 38 different interestingness measures to identify correct rules from a pool of candidate rules. Our study highlights that many measures are on par with support and confidence. Some of the measures are even better than support or confidence, and at least one of the measures is statistically significantly better than the two measures. We also find that compositions of several measures with support statistically significantly outperform the composition of support and confidence. Our findings highlight the need to look beyond standard support and confidence to find interesting rules.
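As an illustration of the two baseline measures, here is a minimal sketch that scores a rule of the form "whenever `premise` is invoked, eventually `consequent` is invoked" over a set of execution traces. The abstract does not give the exact definitions used in the study, so the trace-level definitions below are one simple variant chosen as an assumption: support is the fraction of all traces that contain the premise and satisfy the rule, and confidence is the fraction of premise-containing traces that satisfy it. The traces are invented.

```python
from typing import List, Tuple


def satisfies(trace: List[str], premise: str, consequent: str) -> bool:
    """True if `premise` occurs and its last occurrence is followed by `consequent`.

    If the last occurrence is eventually followed by the consequent,
    then every earlier occurrence is as well.
    """
    if premise not in trace:
        return False
    last_premise = len(trace) - 1 - trace[::-1].index(premise)
    return consequent in trace[last_premise + 1:]


def support_and_confidence(
    traces: List[List[str]], premise: str, consequent: str
) -> Tuple[float, float]:
    """Trace-level support and confidence of the rule premise -> eventually consequent."""
    with_premise = sum(1 for trace in traces if premise in trace)
    satisfied = sum(1 for trace in traces if satisfies(trace, premise, consequent))
    support = satisfied / len(traces) if traces else 0.0
    confidence = satisfied / with_premise if with_premise else 0.0
    return support, confidence


# Hypothetical execution traces (sequences of invoked method names).
example_traces = [
    ["lock", "read", "unlock"],
    ["lock", "write"],                     # lock is never released
    ["open", "close"],                     # rule premise does not occur
    ["lock", "unlock", "lock", "unlock"],
]
support, confidence = support_and_confidence(example_traces, "lock", "unlock")
```

A miner in this style would keep the rule only if both values exceed chosen thresholds; the other interestingness measures examined in the study would replace or complement these two scores.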

Citations: 44

A comparative study on the effectiveness of part-of-speech tagging techniques on bug reports

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) Pub Date : 2015-03-01 DOI: 10.1109/SANER.2015.7081879

Yuan Tian, D. Lo

Abstract: Many software artifacts are written in natural language or contain a substantial amount of natural language content. Thus these artifacts could be analyzed using text analysis techniques from the natural language processing (NLP) community, e.g., the part-of-speech (POS) tagging technique that assigns POS tags (e.g., verb, noun, etc.) to words in a sentence. In the literature, several studies have already applied POS tagging techniques on software artifacts to recover important words in them, which are then used for automating various tasks, e.g., locating buggy files for a given bug report. Many POS tagging techniques have been proposed, and they are trained and evaluated on non-software-engineering corpora (documents). Thus it is unknown whether they can correctly identify the POS of a word in a software artifact and which of them performs the best. To fill this gap, in this work, we investigate the effectiveness of seven POS taggers on bug reports. We randomly sample 100 bug reports from the Eclipse and Mozilla projects and create a text corpus that contains 21,713 words. We manually assign POS tags to these words and use them to evaluate the studied POS taggers. Our comparative study shows that the state-of-the-art POS taggers achieve an accuracy of 83.6%-90.5% on bug reports and the Stanford POS tagger and the TreeTagger achieve the highest accuracy on the sampled bug reports. Our findings show that researchers could use these POS taggers to analyze software artifacts, if an accuracy of 80-90% is acceptable for their specific needs, and we recommend using the Stanford POS tagger or the TreeTagger.
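The evaluation described in the abstract compares each tagger's output against manually assigned tags. A minimal sketch of token-level accuracy, the usual way such a comparison is scored, is shown below. The function name, the Penn Treebank-style tags, and the example sentence are assumptions for illustration and do not come from the study's dataset; no real tagger is invoked.

```python
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (word, POS tag)


def tagging_accuracy(gold: List[TaggedToken], predicted: List[TaggedToken]) -> float:
    """Fraction of tokens whose predicted tag equals the manually assigned tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        return 0.0
    correct = sum(
        1
        for (gold_word, gold_tag), (pred_word, pred_tag) in zip(gold, predicted)
        if gold_word == pred_word and gold_tag == pred_tag
    )
    return correct / len(gold)


# Hypothetical bug-report fragment with hand-assigned tags.
gold_tags = [("Crash", "NN"), ("when", "WRB"), ("opening", "VBG"), ("file", "NN")]
# Hypothetical tagger output that mislabels the first word as a verb.
tagger_output = [("Crash", "VB"), ("when", "WRB"), ("opening", "VBG"), ("file", "NN")]

accuracy = tagging_accuracy(gold_tags, tagger_output)
```

Running the same function over each tagger's output on a shared gold corpus gives directly comparable accuracy figures of the kind reported in the abstract.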

Citations: 54