按需 JSON:解析文档的更好方法?

John Keiser, Daniel Lemire
{"title":"按需 JSON:解析文档的更好方法?","authors":"John Keiser, Daniel Lemire","doi":"10.1002/spe.3313","DOIUrl":null,"url":null,"abstract":"JSON is a popular standard for data interchange on the Internet. Ingesting JSON documents can be a performance bottleneck. A popular parsing strategy consists in converting the input text into a tree-based data structure—sometimes called a Document Object Model or DOM. We designed and implemented a novel JSON parsing interface—called On-Demand—that appears to the programmer like a conventional DOM-based approach. However, the underlying implementation is a pointer iterating through the content, only materializing the results (objects, arrays, strings, numbers) lazily. On recent commodity processors, an implementation of our approach provides superior performance in multiple benchmarks. To ensure reproducibility, our work is freely available as open source software. Several systems use On Demand: for example, Apache Doris, the Node.js JavaScript runtime, Milvus, and Velox.","PeriodicalId":21899,"journal":{"name":"Software: Practice and Experience","volume":"47 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"On-demand JSON: A better way to parse documents?\",\"authors\":\"John Keiser, Daniel Lemire\",\"doi\":\"10.1002/spe.3313\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"JSON is a popular standard for data interchange on the Internet. Ingesting JSON documents can be a performance bottleneck. A popular parsing strategy consists in converting the input text into a tree-based data structure—sometimes called a Document Object Model or DOM. We designed and implemented a novel JSON parsing interface—called On-Demand—that appears to the programmer like a conventional DOM-based approach. However, the underlying implementation is a pointer iterating through the content, only materializing the results (objects, arrays, strings, numbers) lazily. On recent commodity processors, an implementation of our approach provides superior performance in multiple benchmarks. To ensure reproducibility, our work is freely available as open source software. Several systems use On Demand: for example, Apache Doris, the Node.js JavaScript runtime, Milvus, and Velox.\",\"PeriodicalId\":21899,\"journal\":{\"name\":\"Software: Practice and Experience\",\"volume\":\"47 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-01-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Software: Practice and Experience\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1002/spe.3313\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Software: Practice and Experience","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/spe.3313","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

JSON 是互联网上一种流行的数据交换标准。接收 JSON 文档可能会成为性能瓶颈。一种流行的解析策略是将输入文本转换为基于树的数据结构--有时称为文档对象模型或 DOM。我们设计并实现了一种新颖的 JSON 解析界面(称为 On-Demand),它在程序员看来就像传统的基于 DOM 的方法。然而,其底层实现是指针在内容中迭代,只是将结果(对象、数组、字符串、数字)懒散地具体化。在最新的商品处理器上,我们的方法在多个基准测试中都取得了优异的性能。为确保可重现性,我们的工作作为开源软件免费提供。多个系统使用了 On Demand,例如 Apache Doris、Node.js JavaScript 运行时、Milvus 和 Velox。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
On-demand JSON: A better way to parse documents?
JSON is a popular standard for data interchange on the Internet. Ingesting JSON documents can be a performance bottleneck. A popular parsing strategy consists in converting the input text into a tree-based data structure—sometimes called a Document Object Model or DOM. We designed and implemented a novel JSON parsing interface—called On-Demand—that appears to the programmer like a conventional DOM-based approach. However, the underlying implementation is a pointer iterating through the content, only materializing the results (objects, arrays, strings, numbers) lazily. On recent commodity processors, an implementation of our approach provides superior performance in multiple benchmarks. To ensure reproducibility, our work is freely available as open source software. Several systems use On Demand: for example, Apache Doris, the Node.js JavaScript runtime, Milvus, and Velox.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信