Computational Model for Parsing Expression Grammars
Alexander Rubtsov, Nikita Chudinov
arXiv - CS - Formal Languages and Automata Theory, 2024-06-21
https://doi.org/arxiv-2406.14911
Abstract
We present a computational model for Parsing Expression Grammars (PEGs). The
predecessors of PEGs, top-down parsing languages (TDPLs), were introduced by A.
Birman and J. Ullman in the 1960s; B. Ford showed in 2004 that both formalisms
recognize the same class, named Parsing Expression Languages (PELs). A. Birman
and J. Ullman established important properties of TDPLs: they generate every
DCFL and some non-context-free languages such as $a^nb^nc^n$. A linear-time
parsing algorithm was constructed as well. Because this parsing algorithm was
impractical in the 1960s, TDPLs were abandoned; B. Ford later upgraded them to
PEGs and improved the parsing algorithm from a practical point of view. PEGs
are now actively used in compilers (e.g., Python replaced its LL(1) parser
with a PEG one) as well as in text processing. In this paper, we present a
computational model for PEGs and obtain structural properties of PELs: we
prove that PELs are closed under left concatenation with the Boolean closure
of the regular closure of DCFLs. We also present an extension of the PEL class
based on an extension of our computational model. Our model is an upgrade of
deterministic pushdown automata (DPDA) in which, when a symbol is popped, the
head is allowed to return to the position at which that symbol was pushed. We
provide a linear-time simulation algorithm for the 2-way version of this model,
which is similar to S. Cook's famous linear-time simulation algorithm for
2-way DPDA.
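
To make concrete the claim that PEGs recognize non-context-free languages such as $a^nb^nc^n$, here is a minimal sketch of a recognizer in Python. It is not the computational model of this paper. It uses a commonly cited PEG for this language (restricted here to n >= 1), and all combinator names (`lit`, `seq`, `opt`, `plus`, `and_pred`, `not_pred`, `recognizes`) are hypothetical helpers introduced only for illustration. Note how the and-predicate checks a prefix and then resumes from the position where it started, which is loosely reminiscent of the head returning to an earlier position.

```python
from typing import Callable, Optional

# A parser maps (text, position) to the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]

def lit(token: str) -> Parser:
    """Match a literal string at the current position."""
    def parse(s: str, i: int) -> Optional[int]:
        return i + len(token) if s.startswith(token, i) else None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Match all parsers one after another; fail if any of them fails."""
    def parse(s: str, i: int) -> Optional[int]:
        pos: Optional[int] = i
        for p in parsers:
            pos = p(s, pos)
            if pos is None:
                return None
        return pos
    return parse

def opt(p: Parser) -> Parser:
    """PEG optional (e?): greedy, never reconsidered after it succeeds."""
    def parse(s: str, i: int) -> Optional[int]:
        j = p(s, i)
        return i if j is None else j
    return parse

def plus(p: Parser) -> Parser:
    """PEG one-or-more (e+): greedy repetition."""
    def parse(s: str, i: int) -> Optional[int]:
        pos = p(s, i)
        if pos is None:
            return None
        while True:
            nxt = p(s, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return parse

def and_pred(p: Parser) -> Parser:
    """And-predicate (&e): succeed iff e matches, but consume no input."""
    def parse(s: str, i: int) -> Optional[int]:
        return i if p(s, i) is not None else None
    return parse

def not_pred(p: Parser) -> Parser:
    """Not-predicate (!e): succeed iff e fails, and consume no input."""
    def parse(s: str, i: int) -> Optional[int]:
        return i if p(s, i) is None else None
    return parse

def any_char(s: str, i: int) -> Optional[int]:
    """Match any single character (the PEG '.')."""
    return i + 1 if i < len(s) else None

def A(s: str, i: int) -> Optional[int]:
    # A <- 'a' A? 'b'      (matches a^n b^n, n >= 1)
    return seq(lit("a"), opt(A), lit("b"))(s, i)

def B(s: str, i: int) -> Optional[int]:
    # B <- 'b' B? 'c'      (matches b^n c^n, n >= 1)
    return seq(lit("b"), opt(B), lit("c"))(s, i)

# S <- &(A 'c') 'a'+ B !.
S: Parser = seq(and_pred(seq(A, lit("c"))), plus(lit("a")), B, not_pred(any_char))

def recognizes(text: str) -> bool:
    """Return True iff text has the form a^n b^n c^n with n >= 1."""
    return S(text, 0) is not None
```

The predicate `&(A 'c')` verifies that the numbers of `a`s and `b`s agree without consuming input, and `B` then verifies that the numbers of `b`s and `c`s agree. This sketch does no memoization, so it does not by itself demonstrate the linear-time bound mentioned in the abstract.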