Semantic driven program analysis
Andrian Marcus
20th IEEE International Conference on Software Maintenance, 2004. Proceedings. Published 2004-09-11.
DOI: 10.1109/ICSM.2004.1357837 (https://doi.org/10.1109/ICSM.2004.1357837)
Citations: 47
Abstract
The paper presents an approach for extracting and analyzing the semantic content (i.e., problem and solution domain semantics) of existing software systems to support program understanding and various software maintenance tasks, such as recovery of traceability links between documentation and source code, identification of abstract data types in legacy code, and identification of high-level concept clones in software. The semantic information is derived from the comments, documentation, and identifier names associated with the source code using information retrieval methods. The paper advocates the use of latent semantic indexing as the underlying support for the semantic-driven analysis. The presented results are based on the author's doctoral dissertation (Marcus, 2003).
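To make the idea concrete, the following is a minimal sketch of how latent semantic indexing could link documentation sections to source code units through their comments and identifier names. It is not the author's implementation: the stop-word list, the toy corpus, and all function names (`tokenize`, `gram_matrix`, `top_eigenpairs`, `lsi_vectors`, `recover_links`) are assumptions made for illustration. It also uses raw term counts and a small pure-Python power iteration in place of a library SVD.

```python
import math
import random
import re
from collections import Counter

# Illustrative stop-word list (an assumption; a real setup would use a fuller one).
STOP_WORDS = {"the", "and", "an", "a", "into", "void", "return"}


def tokenize(text):
    """Lowercase terms from prose or code, splitting camelCase and snake_case."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        terms += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
    return [t.lower() for t in terms if t.lower() not in STOP_WORDS]


def gram_matrix(texts):
    """Document-by-document matrix A^T A, where A holds raw term counts."""
    counts = [Counter(tokenize(t)) for t in texts]
    return [[float(sum(ci[t] * cj[t] for t in ci)) for cj in counts] for ci in counts]


def top_eigenpairs(g, k, iterations=500, seed=0):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration with deflation)."""
    rng = random.Random(seed)
    n = len(g)
    g = [row[:] for row in g]  # work on a copy; deflation modifies the matrix
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        for _ in range(iterations):
            w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract (rank below k)
                break
            v = [x / norm for x in w]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        # Rayleigh quotient gives the eigenvalue, i.e. a squared singular value of A.
        lam = sum(v[i] * g[i][j] * v[j] for i in range(n) for j in range(n))
        pairs.append((max(lam, 0.0), v))
        for i in range(n):
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]
    return pairs


def lsi_vectors(texts, k):
    """Coordinates of each text in the rank-k LSI space (rows of V_k * Sigma_k)."""
    pairs = top_eigenpairs(gram_matrix(texts), k)
    return [[math.sqrt(lam) * v[j] for lam, v in pairs] for j in range(len(texts))]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0


def recover_links(doc_sections, code_units, k=2):
    """For each documentation section, the index of the most similar code unit."""
    vectors = lsi_vectors(doc_sections + code_units, k)
    doc_vecs = vectors[:len(doc_sections)]
    code_vecs = vectors[len(doc_sections):]
    return [max(range(len(code_vecs)), key=lambda c: cosine(d, code_vecs[c]))
            for d in doc_vecs]


# Hypothetical toy corpus: two documentation sections and two code units.
DOC_SECTIONS = [
    "Deposit money into an account and update the account balance.",
    "Scan the input stream and emit the next token.",
]
CODE_UNITS = [
    "void depositMoney(Account account) { account_balance += amount; } // update balance",
    "Token nextToken(InputStream input) { return scanStream(input); } // emit token",
]

if __name__ == "__main__":
    print(recover_links(DOC_SECTIONS, CODE_UNITS))
```

The sketch works on the document-by-document matrix rather than the full term-by-document matrix because its eigenvectors are the right singular vectors needed for document coordinates, which keeps the example free of external dependencies. A practical setup would more likely apply term weighting and a library SVD, and would choose the number of dimensions `k` empirically.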