IDLIQ: An Incremental Deterministic Finite Automaton Learning Algorithm Through Inverse Queries for Regular Grammar Inference.

IF 2.6 4区计算机科学 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Big Data Pub Date : 2024-12-01 Epub Date: 2023-05-18 DOI:10.1089/big.2022.0158

Farah Haneef, Muddassar A Sindhu

{"title":"IDLIQ: An Incremental Deterministic Finite Automaton Learning Algorithm Through Inverse Queries for Regular Grammar Inference.","authors":"Farah Haneef, Muddassar A Sindhu","doi":"10.1089/big.2022.0158","DOIUrl":null,"url":null,"abstract":"We present an efficient incremental learning algorithm for Deterministic Finite Automaton (DFA) with the help of inverse query (IQ) and membership query (MQ). This algorithm is an extension of the Identification of Regular Languages (ID) algorithm from a complete to an incremental learning setup. The learning algorithm learns by making use of a set of labeled examples and by posing queries to a knowledgeable teacher, which is equipped to answer IQs along with MQs and equivalence query. Based on the examples (elements of the live complete set) and responses against IQs from the minimally adequate teacher (MAT), the learning algorithm constructs the hypothesis automaton, consistent with all observed examples. The Incremental DFA Learning algorithm through Inverse Queries (IDLIQ) takes <math><mstyle><mi>O</mi></mstyle><mrow><mo>(</mo><mrow><mo>|</mo><mi>Σ</mi><mo>|</mo><mi>N</mi><mo>+</mo><mo>|</mo><msub><mrow><mi>P</mi></mrow><mrow><mi>c</mi></mrow></msub><mo>|</mo><mo>|</mo><mi>F</mi><mo>|</mo></mrow><mo>)</mo></mrow></math> time complexity in the presence of a MAT and ensures convergence to a minimal representation of the target DFA with finite number of labeled examples. Existing incremental learning algorithms; the Incremental ID, the Incremental Distinguishing Strings have polynomial (cubic) time complexity in the presence of a MAT. Therefore, sometimes, these algorithms even fail to learn large complex software systems. In this research work, we have reduced the complexity (from cubic to square form) of the DFA learning in an incremental setup. Finally, we prove the correctness and termination of the IDLIQ algorithm.","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"446-455"},"PeriodicalIF":2.6000,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big Data","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1089/big.2022.0158","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/5/18 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}

引用次数: 0

Abstract

We present an efficient incremental learning algorithm for Deterministic Finite Automaton (DFA) with the help of inverse query (IQ) and membership query (MQ). This algorithm is an extension of the Identification of Regular Languages (ID) algorithm from a complete to an incremental learning setup. The learning algorithm learns by making use of a set of labeled examples and by posing queries to a knowledgeable teacher, which is equipped to answer IQs along with MQs and equivalence query. Based on the examples (elements of the live complete set) and responses against IQs from the minimally adequate teacher (MAT), the learning algorithm constructs the hypothesis automaton, consistent with all observed examples. The Incremental DFA Learning algorithm through Inverse Queries (IDLIQ) takes $O (| Σ | N + | P_{c} | | F |)$ time complexity in the presence of a MAT and ensures convergence to a minimal representation of the target DFA with finite number of labeled examples. Existing incremental learning algorithms; the Incremental ID, the Incremental Distinguishing Strings have polynomial (cubic) time complexity in the presence of a MAT. Therefore, sometimes, these algorithms even fail to learn large complex software systems. In this research work, we have reduced the complexity (from cubic to square form) of the DFA learning in an incremental setup. Finally, we prove the correctness and termination of the IDLIQ algorithm.

查看原文本刊更多论文

基于逆查询的正则语法推理的增量确定性有限自动机学习算法。

提出了一种基于逆查询（IQ）和隶属查询（MQ）的确定性有限自动机（DFA）的高效增量学习算法。该算法是正则语言识别（ID）算法的扩展，从一个完整的学习设置到一个增量的学习设置。学习算法通过使用一组标记的示例并向知识渊博的教师提出问题来学习，该教师配备了回答iq以及MQs和等价查询的设备。基于示例（实时完整集的元素）和对最低适足教师（MAT）智商的响应，学习算法构建假设自动机，与所有观察到的示例一致。通过逆查询的增量DFA学习算法（IDLIQ）在MAT存在下的时间复杂度为0 (|Σ|N+|Pc||F|)，并确保收敛到具有有限数量标记示例的目标DFA的最小表示。现有的增量学习算法；在存在MAT的情况下，增量ID、增量区分字符串具有多项式（三次）时间复杂度。因此，有时这些算法甚至无法学习大型复杂软件系统。在这项研究工作中，我们在增量设置中降低了DFA学习的复杂性（从立方形式到平方形式）。最后，我们证明了IDLIQ算法的正确性和终止性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Big Data COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS-COMPUTER SCIENCE, THEORY & METHODS

CiteScore

9.10

自引率

2.20%

发文量

期刊介绍： Big Data is the leading peer-reviewed journal covering the challenges and opportunities in collecting, analyzing, and disseminating vast amounts of data. The Journal addresses questions surrounding this powerful and growing field of data science and facilitates the efforts of researchers, business managers, analysts, developers, data scientists, physicists, statisticians, infrastructure developers, academics, and policymakers to improve operations, profitability, and communications within their businesses and institutions. Spanning a broad array of disciplines focusing on novel big data technologies, policies, and innovations, the Journal brings together the community to address current challenges and enforce effective efforts to organize, store, disseminate, protect, manipulate, and, most importantly, find the most effective strategies to make this incredible amount of information work to benefit society, industry, academia, and government. Big Data coverage includes: Big data industry standards, New technologies being developed specifically for big data, Data acquisition, cleaning, distribution, and best practices, Data protection, privacy, and policy, Business interests from research to product, The changing role of business intelligence, Visualization and design principles of big data infrastructures, Physical interfaces and robotics, Social networking advantages for Facebook, Twitter, Amazon, Google, etc, Opportunities around big data and how companies can harness it to their advantage.