{"title":"Automata Extraction from Transformers","authors":"Yihao Zhang, Zeming Wei, Meng Sun","doi":"arxiv-2406.05564","DOIUrl":"https://doi.org/arxiv-2406.05564","url":null,"abstract":"In modern machine learning (ML) systems, Transformer-based architectures have achieved milestone success across a broad spectrum of tasks, yet understanding their operational mechanisms remains an open problem. To improve the transparency of ML systems, automata extraction methods, which interpret stateful ML models as automata typically through formal languages, have proven effective for explaining the mechanism of recurrent neural networks (RNNs). However, few works have applied this paradigm to Transformer models. In particular, understanding their processing of formal languages and identifying their limitations in this area remain unexplored. In this paper, we propose an automata extraction algorithm specifically designed for Transformer models. Treating the Transformer model as a black-box system, we track the model through the transformation process of its internal latent representations during its operation, and then use classical pedagogical approaches like the L* algorithm to interpret them as deterministic finite-state automata (DFA). Overall, our study reveals how the Transformer model comprehends the structure of formal languages, which not only enhances the interpretability of Transformer-based ML systems but also marks a crucial step toward a deeper understanding of how ML systems process formal languages. Code and data are available at https://github.com/Zhang-Yihao/Transfomer2DFA.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141511278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Indexing Finite-State Automata Using Forward-Stable Partitions","authors":"Ruben Becker, Sung-Hwan Kim, Nicola Prezza, Carlo Tosoni","doi":"arxiv-2406.02763","DOIUrl":"https://doi.org/arxiv-2406.02763","url":null,"abstract":"An index on a finite-state automaton is a data structure able to locate specific patterns on the automaton's paths and consequently on the regular language accepted by the automaton itself. Cotumaccio and Prezza [SODA '21] introduced a data structure able to solve pattern matching queries on automata, generalizing the famous FM-index for strings of Ferragina and Manzini [FOCS '00]. The efficiency of their index depends on the width of a particular partial order of the automaton's states: the smaller the width of the partial order, the faster the index. However, computing the partial order of minimal width is NP-hard. This problem was mitigated by Cotumaccio [DCC '22], who relaxed the conditions on the partial order, allowing it to be a partial preorder. This relaxation yields the existence of a unique partial preorder of minimal width that can be computed in polynomial time. In the paper at hand, we present a new class of partial preorders and show that they have the following useful properties: (i) they can be computed in polynomial time, (ii) their width is never larger than the width of Cotumaccio's preorders, and (iii) there exist infinite classes of automata on which the width of Cotumaccio's preorder is linearly larger than the width of our preorder.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141546776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Separability in Büchi VASS and Singly Non-Linear Systems of Inequalities","authors":"Pascal Baumann, Eren Keskin, Roland Meyer, Georg Zetzsche","doi":"arxiv-2406.01008","DOIUrl":"https://doi.org/arxiv-2406.01008","url":null,"abstract":"The omega-regular separability problem for Büchi VASS coverability languages has recently been shown to be decidable, but with an EXPSPACE lower and a non-primitive recursive upper bound; the exact complexity remained open. We close this gap and show that the problem is EXPSPACE-complete. A careful analysis of our complexity bounds additionally yields a PSPACE procedure in the case of fixed dimension >= 1, which matches a pre-established lower bound of PSPACE for one-dimensional Büchi VASS. Our algorithm is a non-deterministic search for a witness whose size, as we show, can be suitably bounded. Part of the procedure is to decide the existence of runs in VASS that satisfy certain non-linear properties. Therefore, a key technical ingredient is to analyze a class of systems of inequalities where one variable may occur in non-linear (polynomial) expressions. These so-called singly non-linear systems (SNLS) take the form A(x).y >= b(x), where A(x) and b(x) are, respectively, a matrix and a vector whose entries are polynomials in x, and y ranges over vectors in the rationals. Our main contribution on SNLS is an exponential upper bound on the size of rational solutions to singly non-linear systems. The proof consists of three steps. First, we give a tailor-made quantifier elimination to characterize all real solutions to x. Second, using the root separation theorem about the distance of real roots of polynomials, we show that if a rational solution exists, then there is one with at most polynomially many bits. Third, we insert the solution for x into the SNLS, making it linear and allowing us to invoke standard solution bounds from convex geometry. Finally, we combine the results about SNLS with several techniques from the area of VASS to devise an EXPSPACE decision procedure for omega-regular separability of Büchi VASS.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141255985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear equations and recursively enumerable sets","authors":"Juha Honkala","doi":"arxiv-2406.00688","DOIUrl":"https://doi.org/arxiv-2406.00688","url":null,"abstract":"We study connections between linear equations over various semigroups and recursively enumerable sets of positive integers. We give variants of the universal Diophantine representation of recursively enumerable sets of positive integers established by Matiyasevich. These variants use linear equations with one unknown instead of polynomial equations with several unknowns. As a corollary we get undecidability results for linear equations over morphism semigroups and over matrix semigroups.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141255904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The generating power of weighted tree automata with initial algebra semantics","authors":"Manfred Droste, Zoltán Fülöp, Andreja Tepavčević, Heiko Vogler","doi":"arxiv-2405.20753","DOIUrl":"https://doi.org/arxiv-2405.20753","url":null,"abstract":"We consider the images of the initial algebra semantics of weighted tree automata over strong bimonoids (hence also over semirings). These images are subsets of the carrier set of the underlying strong bimonoid. We consider locally finite, weakly locally finite, and bi-locally finite strong bimonoids. We show that there exists a strong bimonoid which is weakly locally finite and not locally finite. We also show that if the ranked alphabet contains a binary symbol, then for any finitely generated strong bimonoid, weighted tree automata can generate, via their initial algebra semantics, all elements of the strong bimonoid. As a consequence of these results, for weakly locally finite strong bimonoids which are not locally finite, weighted tree automata can generate infinite images provided that the input ranked alphabet contains at least one binary symbol. This is in sharp contrast to the setting of weighted string automata, where each such image is known to be finite. As a further consequence, for any finitely generated semiring, there exists a weighted tree automaton which generates, via its run semantics, all elements of the semiring.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"70 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141255739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reward Machines for Deep RL in Noisy and Uncertain Environments","authors":"Andrew C. Li, Zizhao Chen, Toryn Q. Klassen, Pashootan Vaezipoor, Rodrigo Toro Icarte, Sheila A. McIlraith","doi":"arxiv-2406.00120","DOIUrl":"https://doi.org/arxiv-2406.00120","url":null,"abstract":"Reward Machines provide an automata-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing complex reward function structure, they enable counterfactual learning updates that have resulted in impressive sample efficiency gains. While Reward Machines have been employed in both tabular and deep RL settings, they have typically relied on a ground-truth interpretation of the domain-specific vocabulary that forms the building blocks of the reward function. Such ground-truth interpretations can be elusive in many real-world settings, due in part to partial observability or noisy sensing. In this paper, we explore the use of Reward Machines for Deep RL in noisy and uncertain environments. We characterize this problem as a POMDP and propose a suite of RL algorithms that leverage task structure under uncertain interpretation of domain-specific vocabulary. Theoretical analysis exposes pitfalls in naive approaches to this problem, while experimental results show that our algorithms successfully leverage task structure to improve performance under noisy interpretations of the vocabulary. Our results provide a general framework for exploiting Reward Machines in partially observable environments.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141255823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The CFG Complexity of Singleton Sets","authors":"Lance Fortnow, William Gasarch","doi":"arxiv-2405.20026","DOIUrl":"https://doi.org/arxiv-2405.20026","url":null,"abstract":"Let G be a context-free grammar (CFG) in Chomsky normal form. We take the number of rules in G to be the size of G. We also assume all CFGs are in Chomsky normal form. We consider the question of, given a string w of length n, what is the smallest CFG such that L(G)={w}? We show the following: 1) For all w, |w|=n, there is a CFG with O(n/log n) rules, such that L(G)={w}. 2) There exists a string w, |w|=n, such that every CFG G with L(G)={w} is of size Omega(n/log n). We give two proofs of this result: one nonconstructive, the other constructive.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"103 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141197585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DFAMiner: Mining minimal separating DFAs from labelled samples","authors":"Daniele Dell'Erba, Yong Li, Sven Schewe","doi":"arxiv-2405.18871","DOIUrl":"https://doi.org/arxiv-2405.18871","url":null,"abstract":"We propose DFAMiner, a passive learning tool for learning minimal separating deterministic finite automata (DFA) from a set of labelled samples. Separating automata are an interesting class of automata that occurs generally in regular model checking and has raised interest in foundational questions of parity game solving. We first propose a simple and linear-time algorithm that incrementally constructs a three-valued DFA (3DFA) from a set of labelled samples given in the usual lexicographical order. This 3DFA has accepting and rejecting states as well as don't-care states, so that it can exactly recognise the labelled examples. We then apply our tool to mining a minimal separating DFA for the labelled samples by minimising the constructed automata via a reduction to solving SAT problems. Empirical evaluation shows that our tool outperforms current state-of-the-art tools significantly on standard benchmarks for learning minimal separating DFAs from samples. Progress in the efficient construction of separating DFAs can also lead to finding the lower bound of parity game solving, where we show that DFAMiner can create optimal separating automata for simple languages with up to 7 colours. Future improvements might offer inroads to better data structures.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141197579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Oblivious Monitoring for Discrete-Time STL via Fully Homomorphic Encryption","authors":"Masaki Waga, Kotaro Matsuoka, Takashi Suwa, Naoki Matsumoto, Ryotaro Banno, Song Bian, Kohei Suenaga","doi":"arxiv-2405.16767","DOIUrl":"https://doi.org/arxiv-2405.16767","url":null,"abstract":"When monitoring a cyber-physical system (CPS) from a remote server, keeping the monitored data secret is crucial, particularly when they contain sensitive information, e.g., biological or location data. Recently, Banno et al. (CAV'22) proposed a protocol for online LTL monitoring that keeps data concealed from the server using Fully Homomorphic Encryption (FHE). We build on this protocol to allow arithmetic operations over encrypted values, e.g., to compute a safety measurement combining distance, velocity, and so forth. Overall, our protocol enables oblivious online monitoring of discrete-time real-valued signals against signal temporal logic (STL) formulas. Our protocol combines two FHE schemes, CKKS and TFHE, leveraging their respective strengths. We employ CKKS to evaluate arithmetic predicates in STL formulas while utilizing TFHE to process them using a DFA derived from the STL formula. We conducted case studies on monitoring blood glucose levels and vehicles' behavior against the Responsibility-Sensitive Safety (RSS) rules. Our results suggest the practical relevance of our protocol.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"164 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141172667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Expressive Capacity of State Space Models: A Formal Language Perspective","authors":"Yash Sarrof, Yana Veitsman, Michael Hahn","doi":"arxiv-2405.17394","DOIUrl":"https://doi.org/arxiv-2405.17394","url":null,"abstract":"Recently, recurrent models based on linear state space models (SSMs) have shown promising performance in language modeling (LM), competitive with transformers. However, there is little understanding of the in-principle abilities of such models, which could provide useful guidance to the search for better LM architectures. We present a comprehensive theoretical study of the capacity of such SSMs as it compares to that of transformers and traditional RNNs. We find that SSMs and transformers have overlapping but distinct strengths. In star-free state tracking, SSMs implement straightforward and exact solutions to problems that transformers struggle to represent exactly. They can also model bounded hierarchical structure with optimal memory even without simulating a stack. On the other hand, we identify a design choice in current SSMs that limits their expressive power. We discuss implications for SSM and LM research, and verify results empirically on a recent SSM, Mamba.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"98 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141172738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}