Scalable Automatic Concept Mining from Execution Traces
Soumaya Medini
2011 IEEE 19th International Conference on Program Comprehension, 2011-06-22
DOI: 10.1109/ICPC.2011.44 (https://doi.org/10.1109/ICPC.2011.44)
Citations: 1
Abstract
Concept identification is the task of locating and identifying concepts (e.g., domain concepts) in code regions or, more generally, in artifact chunks. It is fundamental to program comprehension, software maintenance, and evolution. Static, dynamic, and hybrid approaches to concept identification exist in the literature. Static and dynamic techniques each have advantages and limitations, and they can be considered complementary. Indeed, recent work has focused on hybrid techniques to improve both the running time and the accuracy (i.e., precision and recall) of the concept location process. Furthermore, sometimes only a single execution trace is available; however, to the best of our knowledge, only a few works attempt to automatically identify concepts in a single execution trace. We propose an approach built upon a dynamic-programming algorithm to split an execution trace into segments likely representing concepts. The approach improves performance and scalability with respect to currently available techniques. We also plan to use techniques derived from Latent Dirichlet Allocation (LDA) to automatically assign meanings to segments.
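
The abstract does not specify the objective function or the recurrence, so the following is only a minimal sketch of how a dynamic-programming split of a trace into contiguous segments might look. Everything in it is an assumption made for illustration: the names `terms`, `segment_cost`, and `split_trace`, the fixed number of segments `k`, and the toy cost (segment length times the number of distinct identifier terms in the segment), which simply rewards segments whose method names share vocabulary.

```python
import re


def terms(method_name):
    """Split a camelCase or snake_case method name into lowercase terms."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", method_name)
    return frozenset(p.lower() for p in parts)


def segment_cost(trace_terms, start, end):
    """Hypothetical cost of trace[start:end]: length times vocabulary size."""
    vocabulary = set().union(*trace_terms[start:end])
    return (end - start) * len(vocabulary)


def split_trace(trace, k):
    """Return the k contiguous segments of `trace` with minimal total cost."""
    n = len(trace)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(trace)")
    trace_terms = [terms(name) for name in trace]
    inf = float("inf")
    # best[m][j]: minimal cost of splitting the first j events into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # cut[m][j]: start index of the last segment in that optimal split.
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                candidate = best[m - 1][i] + segment_cost(trace_terms, i, j)
                if candidate < best[m][j]:
                    best[m][j] = candidate
                    cut[m][j] = i
    # Walk the cut table backwards to recover the segment boundaries.
    segments, j = [], n
    for m in range(k, 0, -1):
        i = cut[m][j]
        segments.append(trace[i:j])
        j = i
    return segments[::-1]


if __name__ == "__main__":
    # Invented five-call trace, not data from the paper.
    trace = ["openFile", "readFile", "closeFile", "drawCircle", "drawLine"]
    print(split_trace(trace, 2))
```

On this invented trace with `k = 2`, the sketch separates the three file-handling calls from the two drawing calls. A real cohesion measure, and a way to choose the number of segments automatically, would replace the toy cost and the fixed `k`.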