Context-sensitive parsing for programming languages
Boštjan Slivnik
Journal of Computer Languages, volume 73, Article 101172, 2022-12-01
DOI: 10.1016/j.cola.2022.101172
https://www.sciencedirect.com/science/article/pii/S2590118422000697
This paper considers parsing programming languages with context-sensitive rather than context-free grammars, because a stronger formalism may help with increasingly complex programming languages and their syntax, or may be more appropriate in some applications. It describes a new deterministic, non-backtracking algorithm for parsing deterministic context-sensitive languages. The algorithm significantly improves on the one built into the WEAVE and CWEAVE tools for literate programming. It requires that the language be described by a context-sensitive reduction system, a deterministic formalism similar to a context-sensitive grammar but with strict rules about how reductions are applied. At each parsing step, the new algorithm finds the position of the next reduction with a reduction automaton rather than the hardcoded trie built into the original algorithm of WEAVE and CWEAVE. It performs at most half as many operations per input symbol as the original algorithm. Furthermore, the paper shows that parsing a language described by a context-sensitive reduction system need not be limited to typesetting, as in literate programming, but can serve as a general parsing approach.
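
To make the idea of a reduction system concrete, here is a minimal Python sketch. It is not the algorithm from the paper, and it uses neither a trie nor a reduction automaton: it simply rescans the symbol sequence from the left at every step. The rule set, the symbol names (`id`, `fn`, `num`, `exp`), and the "leftmost position, first matching rule" strategy are all assumptions chosen for illustration. The first rule is context-sensitive in the sense that `id` is rewritten to `fn` only when the following symbol is `(`, and that `(` is kept unchanged.

```python
# Toy reduction system (hypothetical rules, for illustration only).
# Each rule is (pattern, replacement); earlier rules take priority.
RULES = [
    (("id", "("), ("fn", "(")),          # context-sensitive: "(" is only context
    (("id",), ("exp",)),
    (("num",), ("exp",)),
    (("exp", "+", "exp"), ("exp",)),
    (("fn", "(", "exp", ")"), ("exp",)),
    (("(", "exp", ")"), ("exp",)),
]


def find_reduction(symbols):
    """Return (position, rule) for the leftmost applicable rule, or None.

    This naive rescan stands in for the position-finding step; a real
    implementation would avoid re-examining symbols it has already seen.
    """
    for pos in range(len(symbols)):
        for pattern, replacement in RULES:
            end = pos + len(pattern)
            if tuple(symbols[pos:end]) == pattern:
                return pos, (pattern, replacement)
    return None


def parse(symbols):
    """Apply reductions deterministically until no rule matches."""
    symbols = list(symbols)
    while True:
        found = find_reduction(symbols)
        if found is None:
            return symbols
        pos, (pattern, replacement) = found
        # Replace the matched span in place; no backtracking is ever needed
        # because exactly one reduction is chosen at each step.
        symbols[pos:pos + len(pattern)] = replacement


if __name__ == "__main__":
    print(parse(["id", "(", "num", "+", "id", ")"]))
```

In this sketch an input is accepted when the whole sequence reduces to the single symbol `exp`. Because every step rescans from the start, the work per input symbol grows with the length of the sequence, which is the kind of cost a precomputed structure for locating the next reduction is meant to avoid.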