arXiv - CS - Formal Languages and Automata Theory: Latest Papers, Page 6

History-Determinism vs Fair Simulation

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-11 DOI: arxiv-2407.08620

Udi Boker, Thomas A. Henzinger, Karoliina Lehtinen, Aditya Prakash

Abstract: An automaton is history-deterministic if its nondeterminism can be resolved on the fly, only using the prefix of the word read so far. This mild form of nondeterminism has attracted particular attention for its applications in synthesis problems. An automaton $A$ is guidable with respect to a class $C$ of automata if it can fairly simulate every automaton in $C$ whose language is contained in that of $A$. In other words, guidable automata are those for which inclusion and simulation coincide, making them particularly interesting for model-checking. We study the connection between these two notions, and specifically the question of when they coincide. For classes of automata on which they do, deciding guidability, an otherwise challenging decision problem, reduces to deciding history-determinism, a problem that is starting to be well-understood for many classes. We provide a selection of sufficient criteria for a class of automata to guarantee the coincidence of the notions, and use them to show that the notions coincide for the most common automata classes, among which are $\omega$-regular automata and many infinite-state automata with safety and reachability acceptance conditions, including vector addition systems with states, one-counter nets, pushdown-, Parikh-, and timed-automata. We also demonstrate that history-determinism and guidability do not always coincide, for example, for the classes of timed automata with a fixed number of clocks.

Citations: 0

Automata-based constraints for language model decoding

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-11 DOI: arxiv-2407.08103

Terry Koo, Frederick Liu, Luheng He

Abstract: LMs are often expected to generate strings in some formal language; for example, structured data, API calls, or code snippets. Although LMs can be tuned to improve their adherence to formal syntax, this does not guarantee conformance, especially with smaller LMs suitable for large-scale deployment. In addition, tuning requires significant resources, making it impractical for uncommon or task-specific formats. To prevent downstream parsing errors we would ideally constrain the LM to only produce valid output, but this is severely complicated by tokenization, which is typically both ambiguous and misaligned with the formal grammar. We solve these issues through the application of automata theory, deriving an efficient closed-form solution for the regular languages, a broad class of formal languages with many practical applications, including API calls or schema-guided JSON and YAML. We also discuss pragmatic extensions for coping with the issue of high branching factor. Finally, we extend our techniques to deterministic context-free languages, which similarly admit an efficient closed-form solution. In spite of its flexibility and representative power, our approach only requires access to per-token decoding logits and lowers into simple calculations that are independent of LM size, making it both efficient and easy to apply to almost any LM architecture.
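To make the tokenization mismatch in this abstract concrete, here is a minimal sketch of masking per-token logits with a character-level DFA. It is a naive per-token scan, not the closed-form construction the paper derives. The target language `[0-9]+(,[0-9]+)*`, the toy vocabulary `VOCAB`, and the helper names `step`, `advance`, `allowed_tokens`, and `mask_logits` are all assumptions made for illustration.

```python
import math

DIGITS = "0123456789"

# Hypothetical character-level DFA for the regular language [0-9]+(,[0-9]+)*
# State 0: a digit is expected (start, or just after a comma).
# State 1: inside a number (the only accepting state).

def step(state, ch):
    """One character-level transition; None stands for the dead state."""
    if ch in DIGITS:
        return 1  # a digit is legal in both states and lands inside a number
    if ch == "," and state == 1:
        return 0  # a comma is legal only directly after a digit
    return None

def advance(state, token):
    """Run the DFA over every character of a multi-character token."""
    for ch in token:
        state = step(state, ch)
        if state is None:
            return None
    return state

def allowed_tokens(state, vocab):
    """Tokens that keep the output a valid prefix of the language."""
    return [tok for tok in vocab if advance(state, tok) is not None]

def mask_logits(state, vocab, logits):
    """Set the logit of every disallowed token to -inf before sampling."""
    return [
        x if advance(state, tok) is not None else -math.inf
        for tok, x in zip(vocab, logits)
    ]

# Toy vocabulary (an assumption): tokens do not align with grammar symbols.
VOCAB = ["1", "12", ",", ",3", ",,", "a"]
print(allowed_tokens(0, VOCAB))
print(allowed_tokens(1, VOCAB))
```

The mask only guarantees that the text generated so far is a prefix of some valid string; a decoder built on this sketch would additionally allow end-of-sequence only in the accepting state 1. Scanning the whole vocabulary at every step is the cost a precomputed token-level automaton would avoid.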

Citations: 0

More on Maximally Permissive Similarity Control of Discrete Event Systems

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-10 DOI: arxiv-2407.08068

Yu Wang, Zhaohui Zhu, Rob van Glabbeek, Jinjin Zhang, Lixing Tan

Citations: 0

Generalized Parikh Matrices For Tracking Subsequence Occurrences

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-05 DOI: arxiv-2407.04462

Szilárd Zsolt Fazekas, Xinhao Huang

Citations: 0

Complex Event Recognition with Symbolic Register Transducers: Extended Technical Report

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-03 DOI: arxiv-2407.02884

Elias Alevizos, Alexander Artikis, Georgios Paliouras

Abstract: We present a system for Complex Event Recognition (CER) based on automata. While multiple such systems have been described in the literature, they typically suffer from a lack of clear and denotational semantics, a limitation which often leads to confusion with respect to their expressive power. In order to address this issue, our system is based on an automaton model which is a combination of symbolic and register automata. We extend previous work on these types of automata, in order to construct a formalism with clear semantics and a corresponding automaton model whose properties can be formally investigated. We call such automata Symbolic Register Transducers (SRT). We show that SRT are closed under various operators, but are not in general closed under complement and they are not determinizable. However, they are closed under these operations when a window operator, quintessential in Complex Event Recognition, is used. We show how SRT can be used in CER in order to detect patterns upon streams of events, using our framework that provides declarative and compositional semantics, and that allows for a systematic treatment of such automata. For SRT to work in pattern detection, we allow them to mark events from the input stream as belonging to a complex event or not, hence the name "transducers". We also present an implementation of SRT which can perform CER. We compare our SRT-based CER engine against other state-of-the-art CER systems and show that it is both more expressive and more efficient.

Citations: 0

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-03 DOI: arxiv-2407.03203

Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

Abstract: Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. Similar methods have shown promising results in code generation. However, most modern LLMs exhibit suboptimal performance due to the scarcity of aligned NL and Formal Language (FL) theorem-proving data. This scarcity results in a paucity of methodologies for training LLMs and techniques to fully utilize their capabilities in composing formal proofs. To address the challenges, this paper proposes **TheoremLlama**, an end-to-end framework to train a general-purpose LLM to become a Lean4 expert. This framework encompasses NL-FL aligned dataset generation methods, training approaches for the LLM formal theorem prover, and techniques for LLM Lean4 proof writing. Using the dataset generation method, we provide *Open Bootstrapped Theorems* (OBT), an NL-FL aligned and bootstrapped dataset. A key innovation in this framework is the NL-FL bootstrapping method, where NL proofs are integrated into Lean4 code for training datasets, leveraging the NL reasoning ability of LLMs for formal reasoning. The **TheoremLlama** framework achieves cumulative accuracies of 36.48% and 33.61% on MiniF2F-Valid and Test datasets respectively, surpassing the GPT-4 baseline of 22.95% and 25.41%. We have also open-sourced our model checkpoints and generated dataset, and will soon make all the code publicly available.

Citations: 0

Monads, Comonads, and Transducers

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-02 DOI: arxiv-2407.02704

Rafał Stefański

Abstract: This paper proposes a definition of recognizable transducers over monads and comonads, which bridges two important ongoing efforts in the current research on regularity. The first effort is the study of regular transductions, which extends the notion of regularity from languages into word-to-word functions. The other important effort is generalizing the notion of regular languages from words to arbitrary monads, introduced in arXiv:1502.04898. In this paper, we present a number of examples of transducer classes that fit the proposed framework. In particular we show that our class generalizes the classes of Mealy machines and rational transductions. We also present examples of recognizable transducers for infinite words and a specific type of trees called terms. The main result of this paper is a theorem, which states the class of recognizable transductions is closed under composition, subject to some coherence axioms between the structure of a monad and the structure of a comonad. Due to its complexity, we formalize the proof of the theorem in Coq Proof Assistant. In the proof, we introduce the concepts of a context and a generalized wreath product for Eilenberg-Moore algebras, which could be valuable tools for studying these algebras.

Citations: 0

On Shuffling and Splitting Automata

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-02 DOI: arxiv-2407.02660

Ignacio Mollo Cunningham

Citations: 0

Some Remarks on First-Order Definable Tree Languages

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-07-01 DOI: arxiv-2407.01169

Achim Blumensath

Citations: 0

Regular Expressions with Backreferences on Multiple Context-Free Languages, and the Closed-Star Condition

arXiv - CS - Formal Languages and Automata Theory Pub Date : 2024-06-27 DOI: arxiv-2406.18918

Taisei Nogami, Tachio Terauchi

Abstract: Backreference is a well-known practical extension of regular expressions and most modern programming languages, such as Java, Python, JavaScript and more, support regular expressions with backreferences (rewb) in their standard libraries for string processing. A difficulty of backreference is non-regularity: unlike some other extensions, backreference strictly enhances the expressive power of regular expressions and thus rewbs can describe non-regular (in fact, even non-context-free) languages. In this paper, we investigate the expressive power of rewbs by comparing rewbs to multiple context-free languages (MCFL) and parallel multiple context-free languages (PMCFL). First, we prove that the language class of rewbs is a proper subclass of unary-PMCFLs. The class of unary-PMCFLs coincides with that of EDT0L languages, and our result strictly improves the known upper bound of rewbs. Additionally, we show that, however, the language class of rewbs is not contained in that of MCFLs even when restricted to rewbs with only one capturing group and no captured references. Therefore, in general, the parallelism seems essential for rewbs. Backed by these results, we define a novel syntactic condition on rewbs that we call closed-star and observe that it provides an upper bound on the number of times a rewb references the same captured string. The closed-star condition allows dispensing with the parallelism: that is, we prove that the language class of closed-star rewbs falls inside the class of unary-MCFLs, which is equivalent to that of EDT0L systems of finite index. Furthermore, as additional evidence for the robustness of the condition, we show that the language class of closed-star rewbs also falls inside the class of nonerasing stack languages (NESL).
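The non-regularity mentioned in this abstract can be seen with a small sketch using Python's standard `re` module. The pattern below is an illustrative choice, not an example taken from the paper: a single capturing group followed by one backreference describes the copy language {ww | w in {a,b}*}, which is a standard example of a language that is not context-free. The names `SQUARE` and `is_square` are hypothetical.

```python
import re

# Group 1 captures some w over {a, b}; \1 then demands the same string again,
# so a full match exists exactly when the input has the form ww.
SQUARE = re.compile(r"([ab]*)\1")

def is_square(s: str) -> bool:
    """Return True when s is some string w over {a, b} repeated twice."""
    return SQUARE.fullmatch(s) is not None

for word in ["abab", "aa", "", "aba", "abba"]:
    print(repr(word), is_square(word))
```

The regex engine finds the split point by backtracking over the possible lengths of the captured group, which is why matching with backreferences can be far more expensive than matching a plain regular expression.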

Citations: 0