Automatic labeling of software components and their evolution using log-likelihood ratio of word frequencies in source code

2009 6th IEEE International Working Conference on Mining Software Repositories. Pub Date: 2009-05-16. DOI: 10.1109/MSR.2009.5069499

Adrian Kuhn

Citations: 29

Abstract

As more and more open-source software components become available on the internet, we need automatic ways to label and compare them. For example, a developer who searches for reusable software must be able to quickly gain an understanding of retrieved components. This understanding cannot be gained at the level of source code due to the semantic gap between source code and the domain model. In this paper, we present a lexical approach that uses the log-likelihood ratios of word frequencies to automatically provide labels for software components. We present a prototype implementation of our labeling/comparison algorithm and provide examples of its application. In particular, we apply the approach to detect trends in the evolution of a software system.
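The abstract does not give the exact formula, tokenization, or reference corpus, so the following is only a minimal sketch of how labeling by log-likelihood ratio of word frequencies might look. It assumes the two-term G² statistic commonly used for corpus comparison, and the function names `tokenize`, `log_likelihood`, and `label` are hypothetical, not taken from the paper's prototype.

```python
import math
import re
from collections import Counter


def tokenize(source: str) -> list[str]:
    """Split source text into lowercase words, breaking camelCase and snake_case.

    Assumption: the abstract does not describe tokenization; this is one simple choice.
    """
    words = []
    for ident in re.findall(r"[A-Za-z]+", source):
        # "parseXmlFile" -> parse, Xml, File; "XMLParser" -> XML, Parser
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", ident)
        words.extend(p.lower() for p in parts)
    return words


def log_likelihood(a: int, b: int, n1: int, n2: int) -> float:
    """Signed two-term G2 score for one word.

    a, b: occurrences of the word in corpus 1 and corpus 2.
    n1, n2: total word counts of corpus 1 and corpus 2 (both must be > 0).
    The sign is positive when the word is relatively more frequent in corpus 1.
    """
    e1 = n1 * (a + b) / (n1 + n2)  # expected count in corpus 1
    e2 = n2 * (a + b) / (n1 + n2)  # expected count in corpus 2
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    g2 *= 2
    return g2 if a / n1 >= b / n2 else -g2


def label(component: str, reference: str, k: int = 3) -> list[str]:
    """Return the k words most characteristic of `component` relative to `reference`."""
    comp = Counter(tokenize(component))
    ref = Counter(tokenize(reference))
    n1, n2 = sum(comp.values()), sum(ref.values())
    if n1 == 0 or n2 == 0:
        return []
    scores = {w: log_likelihood(comp[w], ref[w], n1, n2) for w in comp}
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    component = "xmlReader readXml xmlNode parseXml render"
    reference = "render draw paint render canvas draw xml"
    print(label(component, reference))
```

Under the same assumptions, passing two successive releases of a system as `component` and `reference` would surface words whose relative frequency rose between versions, which is one way to read the "trends in the evolution" application mentioned in the abstract.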
