On validation of XML streams using finite state machines

Cristiana Chitic, D. Rosu
{"title":"On validation of XML streams using finite state machines","authors":"Cristiana Chitic, D. Rosu","doi":"10.1145/1017074.1017096","DOIUrl":null,"url":null,"abstract":"We study validation of streamed XML documents by means of finite state machines. Previous work has shown that validation is in principle possible by finite state automata, but the construction was prohibitively expensive, giving an exponential-size nondeterministic automaton. Instead, we want to find deterministic automata for validating streamed documents: for them, the complexity of validation is constant per tag. We show that for a reading window of size one and nonrecursive DTDs with one-unambiguous content (i.e. conforming to the current XML standard) there is an algorithm producing a deterministic automaton that validates documents with respect to that DTD. The size of the automaton is at most exponential and we give matching lower bounds. To capture the possible advantages offered by reading windows of size k, we introduce k-unambiguity as a generalization of one-unambiguity, and study the validation against DTDs with k-unambiguous content. We also consider recursive DTDs and give conditions under which they can be validated against by using one-counter automata.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2004-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"46","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1017074.1017096","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 46

Abstract

We study validation of streamed XML documents by means of finite state machines. Previous work has shown that validation is in principle possible by finite state automata, but the construction was prohibitively expensive, giving an exponential-size nondeterministic automaton. Instead, we want to find deterministic automata for validating streamed documents: for them, the complexity of validation is constant per tag. We show that for a reading window of size one and nonrecursive DTDs with one-unambiguous content (i.e. conforming to the current XML standard) there is an algorithm producing a deterministic automaton that validates documents with respect to that DTD. The size of the automaton is at most exponential and we give matching lower bounds. To capture the possible advantages offered by reading windows of size k, we introduce k-unambiguity as a generalization of one-unambiguity, and study the validation against DTDs with k-unambiguous content. We also consider recursive DTDs and give conditions under which they can be validated against by using one-counter automata.
关于使用有限状态机验证XML流
我们研究了用有限状态机对流XML文档进行验证。先前的工作表明,原则上可以通过有限状态自动机进行验证,但构造过于昂贵,给出了指数大小的不确定性自动机。相反,我们希望找到用于验证流文档的确定性自动机:对于它们,每个标记的验证复杂性是恒定的。我们展示了对于大小为1的读取窗口和具有一个明确内容(即符合当前XML标准)的非递归DTD,存在一种生成确定性自动机的算法,该算法根据该DTD验证文档。自动机的大小不超过指数,我们给出了匹配的下界。为了捕捉大小为k的阅读窗口可能提供的优势,我们引入了k-无歧义作为1 -无歧义的推广,并研究了具有k-无歧义内容的dtd的验证。我们还考虑了递归dtd,并给出了使用单计数器自动机验证它们的条件。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信