Comparing Approaches to Mining Source Code for Call-Usage Patterns

Huzefa H. Kagdi, M. Collard, Jonathan I. Maletic
Venue: Fourth International Workshop on Mining Software Repositories (MSR'07: ICSE Workshops 2007)
Publication date: 2007-05-20
DOI: 10.1109/MSR.2007.3 (https://doi.org/10.1109/MSR.2007.3)
Citations: 23

Abstract

Two approaches for mining function-call usage patterns from source code are compared. The first approach, itemset mining, has recently been applied to this problem. The other approach, sequential-pattern mining, has not previously been applied to it. Here, a call-usage pattern is a composition of function calls that occur in a function definition. Both approaches look for frequently occurring patterns that represent standard usage of functions and identify possible errors. Itemset mining produces unordered patterns, i.e., sets of function calls, whereas sequential-pattern mining produces partially ordered patterns, i.e., sequences of function calls. The trade-off between the additional ordering context given by sequential-pattern mining and the efficiency of itemset mining is investigated. The two approaches are applied to the Linux kernel v2.6.14, and the results show that mining ordered patterns is worth the additional cost.
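
The difference between unordered and ordered patterns can be illustrated with a minimal Python sketch. It is not the authors' implementation: the call lists in `CALLS` are hypothetical, the helper names `itemset_support` and `sequence_support` are invented for illustration, and the brute-force counting of pairs stands in for a real frequent-pattern mining algorithm. Ordered patterns are approximated here as subsequences in which gaps are allowed.

```python
from itertools import combinations

# Hypothetical call lists, one per function definition.
# Illustrative only; this is not data from the paper.
CALLS = {
    "f1": ["spin_lock", "kmalloc", "spin_unlock"],
    "f2": ["spin_lock", "spin_unlock", "kfree"],
    "f3": ["spin_unlock", "spin_lock"],  # same two calls, reversed order
}


def itemset_support(transactions, size=2):
    """Count the function definitions that contain each unordered set of calls."""
    counts = {}
    for calls in transactions.values():
        # Deduplicate and sort so that call order plays no role.
        for items in combinations(sorted(set(calls)), size):
            key = frozenset(items)
            counts[key] = counts.get(key, 0) + 1
    return counts


def sequence_support(transactions, size=2):
    """Count the function definitions that contain each ordered subsequence of calls."""
    counts = {}
    for calls in transactions.values():
        # combinations() keeps positional order, so each tuple is a subsequence.
        # The set ensures a definition is counted once per pattern.
        for seq in set(combinations(calls, size)):
            counts[seq] = counts.get(seq, 0) + 1
    return counts


if __name__ == "__main__":
    pair = frozenset({"spin_lock", "spin_unlock"})
    print("unordered support:", itemset_support(CALLS)[pair])
    print("lock -> unlock:", sequence_support(CALLS)[("spin_lock", "spin_unlock")])
    print("unlock -> lock:", sequence_support(CALLS)[("spin_unlock", "spin_lock")])
```

In this toy data the unordered view treats all three definitions as instances of one pattern, while the ordered view separates the two definitions that lock before unlocking from the one that does the reverse. That separation is the kind of extra ordering context the abstract refers to, and enumerating ordered subsequences is where the additional cost comes from.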