Comparing Approaches to Mining Source Code for Call-Usage Patterns

Fourth International Workshop on Mining Software Repositories (MSR'07: ICSE Workshops 2007). Pub Date: 2007-05-20. DOI: 10.1109/MSR.2007.3

Huzefa H. Kagdi, M. Collard, Jonathan I. Maletic

Citations: 23

Abstract

Two approaches for mining function-call usage patterns from source code are compared. The first approach, itemset mining, has recently been applied to this problem. The other approach, sequential-pattern mining, has not previously been applied to it. Here, a call-usage pattern is a composition of function calls that occur in a function definition. Both approaches look for frequently occurring patterns that represent standard usage of functions and help identify possible errors. Itemset mining produces unordered patterns, i.e., sets of function calls, whereas sequential-pattern mining produces partially ordered patterns, i.e., sequences of function calls. The trade-off between the additional ordering context given by sequential-pattern mining and the efficiency of itemset mining is investigated. The two approaches are applied to the Linux kernel v2.6.14, and the results show that mining ordered patterns is worth the additional cost.
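The difference between unordered and ordered call-usage patterns can be illustrated with a small Python sketch. It is not the authors' tooling: the toy function bodies, the call names (`lock`, `unlock`, `read`, `write`), the helper names, and the restriction to patterns of length two are all assumptions made for illustration. Real itemset and sequential-pattern miners handle patterns of arbitrary length and use far more efficient search.

```python
from itertools import combinations

# Hypothetical toy data: each function definition is reduced to the ordered
# list of calls in its body. All names here are illustrative assumptions.
CALLS = {
    "f1": ["lock", "read", "unlock"],
    "f2": ["lock", "write", "unlock"],
    "f3": ["lock", "read", "write", "unlock"],
    "f4": ["unlock", "read", "lock"],  # same calls as f1, suspicious order
}


def frequent_pairs_unordered(calls, min_support):
    """Itemset view: count functions whose bodies contain both calls."""
    counts = {}
    for body in calls.values():
        for pair in combinations(sorted(set(body)), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return {p: c for p, c in counts.items() if c >= min_support}


def frequent_pairs_ordered(calls, min_support):
    """Sequence view: count functions in which a occurs somewhere before b."""
    counts = {}
    for body in calls.values():
        seen = set()  # count each ordered pair once per function
        for i, a in enumerate(body):
            for b in body[i + 1:]:
                if a != b:
                    seen.add((a, b))
        for pair in seen:
            counts[pair] = counts.get(pair, 0) + 1
    return {p: c for p, c in counts.items() if c >= min_support}


def violators(calls, a, b):
    """Functions that contain both calls but never a before b."""
    out = []
    for name, body in calls.items():
        if a in body and b in body:
            ordered = any(x == a and b in body[i + 1:]
                          for i, x in enumerate(body))
            if not ordered:
                out.append(name)
    return out


itemsets = frequent_pairs_unordered(CALLS, min_support=3)
sequences = frequent_pairs_ordered(CALLS, min_support=3)
suspects = violators(CALLS, "lock", "unlock")

print(itemsets)
print(sequences)
print(suspects)
```

In this toy data the unordered view counts `f4` as supporting the `{lock, unlock}` pattern, because it only checks co-occurrence. The ordered view counts only the three functions that call `lock` before `unlock`, so `f4` stands out as a possible misuse. That is the kind of extra ordering context the abstract weighs against the lower cost of itemset mining.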

