Deriving tolerant grammars from a base-line grammar

International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings. Pub Date: 2003-09-22. DOI: 10.1109/ICSM.2003.1235420

Steven Klusener, R. Lämmel


Citations: 44

Abstract

A grammar-based approach to tool development in reengineering and reverse engineering promises precise structure awareness, but it is problematic in two respects. Firstly, it is a considerable up-front investment to obtain a grammar for a relevant language or cocktail of languages. Existing work on grammar recovery addresses this concern to some extent. Secondly, it is often not feasible to insist on a precise grammar, e.g., when different dialects need to be covered. This calls for tolerant grammars. In this paper, we provide a well-engineered approach to the derivation of tolerant grammars, which is based on previous work on error recovery, fuzzy parsing, and island grammars. The technology of this paper has been used in a complex Cobol restructuring project on several millions of lines of code in different Cobol dialects. Our approach is founded on an approximation relation between a tolerant grammar and a base-line grammar which serves as a point of reference. Thereby, we avoid false positives and false negatives when parsing constructs of interest in a tolerant mode. Our approach accomplishes the effective derivation of a tolerant grammar from the syntactical structure that is relevant for a certain re- or reverse engineering tool. To this end, the productions for the constructs of interest are reused from the base-line grammar together with further productions that are needed for completion.
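The idea of reusing productions for the constructs of interest and completing them with permissive productions can be illustrated with a minimal island-grammar-style sketch. This is not the authors' derivation procedure, and it does not check the approximation relation described in the abstract. It assumes a single hypothetical construct of interest, a simplified Cobol-like `MOVE <source> TO <target>` statement, and treats every other token as "water" to be skipped. All names (`tokenize`, `parse_tolerant`, `islands`, `SAMPLE`) are illustrative.

```python
import re
from typing import List, Tuple

# A "word" in this toy lexer: letters, digits, and hyphens (assumption).
WORD = re.compile(r"[A-Za-z0-9-]+")


def tokenize(text: str) -> List[str]:
    """Split input into words or single non-space characters."""
    return re.findall(r"[A-Za-z0-9-]+|\S", text)


def parse_tolerant(text: str) -> List[Tuple[str, ...]]:
    """Recognize the construct of interest; skip everything else as water."""
    tokens = tokenize(text)
    result: List[Tuple[str, ...]] = []
    i = 0
    while i < len(tokens):
        # Island production (hypothetical): MOVE <word> TO <word>.
        is_move = (
            tokens[i].upper() == "MOVE"
            and i + 3 < len(tokens)
            and WORD.fullmatch(tokens[i + 1]) is not None
            and tokens[i + 2].upper() == "TO"
            and WORD.fullmatch(tokens[i + 3]) is not None
        )
        if is_move:
            result.append(("MOVE", tokens[i + 1], tokens[i + 3]))
            i += 4
        else:
            # Completion production: any single token is accepted as water,
            # so unknown dialect constructs never cause a parse failure.
            result.append(("WATER", tokens[i]))
            i += 1
    return result


def islands(text: str) -> List[Tuple[str, ...]]:
    """Return only the recognized constructs of interest."""
    return [node for node in parse_tolerant(text) if node[0] == "MOVE"]


# Hypothetical input mixing embedded SQL, an unknown dialect statement,
# and two statements of interest.
SAMPLE = (
    "EXEC SQL COMMIT END-EXEC. MOVE A TO B. "
    "FOO-DIALECT-STMT X Y. MOVE 1 TO COUNT-1."
)

if __name__ == "__main__":
    print(islands(SAMPLE))
```

A parser this permissive can misclassify input, for example by treating a malformed `MOVE` as water without any diagnostic. Comparing the tolerant grammar against a precise base-line grammar, as the abstract proposes, is aimed at ruling out such false positives and false negatives.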



