A theory of parameterized pattern matching: algorithms and applications
Author: B. S. Baker
Venue: Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing
Published: 1993-06-01
DOI: https://doi.org/10.1145/167088.167115
This paper develops a theory and algorithms for an application problem arising in software maintenance. The application is to track down duplication in a large software system. We want to find not only exact matches between sections of code, but parametrized matches, where a parametrized match between two sections of code means that one section can be transformed into the other by replacing the parameter names (e.g. identifiers and constants) of one section by the parameter names of the other via a one-to-one function. This paper formalizes this problem in terms of parametrized strings and parametrized pattern matching and defines a new data structure (parametrized suffix tree) suitable for parametrized pattern matching. It gives efficient algorithms for constructing this data structure, efficient algorithms for parametrized pattern matching, and an efficient algorithm for finding all maximal parametrized matches over a threshold length in a parametrized string. The algorithms for constructing parametrized suffix trees and for reporting duplication over a threshold length have been implemented. Tests on C code indicate that these algorithms should perform well in the application.
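The definition of a parametrized match can be made concrete with a small sketch. The abstract does not describe an encoding, so the code below assumes a previous-occurrence encoding: each parameter token is replaced by the distance back to its last occurrence (0 on first use), and all other tokens are kept verbatim. Two token sequences then match exactly when their encodings are equal, which is the same as saying a one-to-one renaming of parameters exists. The names `looks_like_parameter`, `prev_encode`, and `p_match` are hypothetical, and the tokenizer is a naive whitespace split, not a C lexer. This is a linear-scan illustration only, not the parametrized suffix tree.

```python
def looks_like_parameter(token):
    # Assumption for illustration: identifiers and integer constants are
    # parameters. A real C front end would exclude keywords such as `if`.
    return token.isidentifier() or token.isdigit()


def prev_encode(tokens, is_param=looks_like_parameter):
    """Replace each parameter by the distance to its previous occurrence.

    First occurrences become 0. Non-parameter tokens stay as strings, so
    they can never be confused with the integer distances.
    """
    last_seen = {}
    encoded = []
    for i, tok in enumerate(tokens):
        if is_param(tok):
            encoded.append(i - last_seen[tok] if tok in last_seen else 0)
            last_seen[tok] = i
        else:
            encoded.append(tok)
    return encoded


def p_match(a, b, is_param=looks_like_parameter):
    """True if `a` becomes `b` under a one-to-one renaming of parameters."""
    return len(a) == len(b) and prev_encode(a, is_param) == prev_encode(b, is_param)


if __name__ == "__main__":
    s1 = "x = y + x ;".split()
    s2 = "a = b + a ;".split()   # x -> a, y -> b: one-to-one
    s3 = "a = a + a ;".split()   # x and y would both map to a: not one-to-one
    print(prev_encode(s1))
    print(p_match(s1, s2), p_match(s1, s3))
```

Because the encoding depends only on where parameters repeat and not on what they are called, comparing encoded sequences reduces parametrized matching to ordinary equality for whole strings. Encodings of suffixes are not simply suffixes of the encoding, since a distance may point before the start of the suffix; that is one reason a dedicated data structure is needed for the general problem.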