Is there a streaming API for JSON?
Warning: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/444380/
Asked by kal
Is DOM the only way to parse JSON?
Answered by StaxMan
Some JSON parsers do offer an incremental ("streaming") interface; for Java, at least the following parsers listed on the json.org page offer one:
- Jackson (pull interface)
- Json-simple (SAX-style push interface)
(in addition to Software Monkey's parser, referred to in another answer)
Actually, it is kind of odd that so many JSON parsers do NOT offer this simple low-level interface. After all, they already need to implement low-level parsing, so why not expose it?
EDIT (June 2011): Gson too has its own streaming API (as of Gson 1.6)
Answered by Lawrence Dol
By DOM, I assume you mean that the parser reads an entire document at once before you can work with it. Note that saying DOM tends to imply XML these days, but IMO that is not really an accurate inference.
So, in answer to your questions: "Yes", there are streaming APIs, and "No", DOM is not the only way. That said, processing a JSON document as a stream is often problematic, because many objects are not simple field/value pairs but contain other objects as values, which you need to parse in order to process, and this tends to end up being recursive. But for simple messages you can do useful things with a stream/event-based parser.
I have written a pull-event parser for JSON (it was one class, about 700 lines). But most of the others I have seen are document-oriented. One of the layers I have built on top of my parser is a document reader, which took about 30 LOC. I have only ever used my parser in practice as a document loader (for the above reason).
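To make the "document reader on top of an event parser" idea concrete, here is a minimal Python sketch. It is not the answerer's code: the event names (`start_map`, `map_key`, `start_array`, `end_map`, `end_array`, `value`) and the `build_document` function are hypothetical, and the events are written by hand instead of coming from a real parser. It only shows that rebuilding the full document from a flat event stream needs little more than a stack.

```python
def build_document(events):
    """Rebuild a nested Python value from a flat stream of (kind, value) events."""
    stack = []   # containers that are still open
    keys = []    # pending key for each open container (unused for lists)
    root = None

    def attach(value):
        # Put a finished scalar or a newly opened container into its parent.
        nonlocal root
        if not stack:
            root = value
        elif isinstance(stack[-1], list):
            stack[-1].append(value)
        else:
            stack[-1][keys[-1]] = value

    for kind, value in events:
        if kind == "start_map":
            container = {}
            attach(container)
            stack.append(container)
            keys.append(None)
        elif kind == "start_array":
            container = []
            attach(container)
            stack.append(container)
            keys.append(None)
        elif kind in ("end_map", "end_array"):
            stack.pop()
            keys.pop()
        elif kind == "map_key":
            keys[-1] = value
        else:  # "value": a string, number, boolean or null
            attach(value)
    return root


# Hand-written events for: {"id": 7, "tags": ["a", "b"]}
sample_events = [
    ("start_map", None),
    ("map_key", "id"), ("value", 7),
    ("map_key", "tags"),
    ("start_array", None), ("value", "a"), ("value", "b"), ("end_array", None),
    ("end_map", None),
]
document = build_document(sample_events)
```

The explicit stack replaces the recursion mentioned above, which is why such a layer can stay small once the low-level events exist.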
I am sure that if you search the net you will find pull- and push-based parsers for JSON.
EDIT: I have posted the parser to my site for download. A working, compilable class and a complete example are included.
EDIT2: You'll also want to look at the JSON website.
Answered by pykler
As stefanB mentioned, http://lloyd.github.com/yajl/ is a C library for stream parsing JSON. That page also lists many wrappers for other languages:
- yajl-ruby - Ruby bindings for YAJL
- yajl-objc - Objective-C bindings for YAJL
- YAJL IO bindings (for the IO language)
- Python bindings come in two flavors, py-yajl or yajl-py
- yajl-js - node.js bindings (mirrored to GitHub)
- lua-yajl - Lua bindings
- ooc-yajl - ooc bindings
- yajl-tcl - Tcl bindings
Some of them may not allow streaming, but many of them certainly do.
Answered by jimhigson
Disclaimer: I'm suggesting my own project.
I maintain a streaming JSON parser in JavaScript which combines some of the features of SAX and DOM:
The idea is to allow streaming parsing without requiring the programmer to listen to lots of different events, as with raw SAX. I like SAX, but it tends to be quite low-level for what most people need. You can listen for any interesting node in the JSON stream by registering JSONPath patterns.
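As a rough illustration of listening for nodes by path instead of handling every low-level event, here is a much-simplified Python sketch. It is not this project's API and does not implement JSONPath: the `listen` and `matches` functions, the event names, and the tuple patterns (where `"*"` stands for any key or array index) are all assumptions made for the example, and only scalar values are reported.

```python
def matches(pattern, path):
    """True if path has the same length as pattern and each step agrees."""
    return len(pattern) == len(path) and all(
        want == "*" or want == got for want, got in zip(pattern, path)
    )


def listen(events, pattern, callback):
    """Call callback(path, value) for each scalar whose path matches pattern."""
    frames = []  # one [container_kind, current_key_or_index] per open container
    for kind, value in events:
        if kind in ("end_map", "end_array"):
            frames.pop()
            continue
        if kind == "map_key":
            frames[-1][1] = value
            continue
        # Every other event starts a new value; inside an array that
        # means we have moved on to the next index.
        if frames and frames[-1][0] == "array":
            frames[-1][1] += 1
        if kind == "start_map":
            frames.append(["map", None])
        elif kind == "start_array":
            frames.append(["array", -1])
        else:  # scalar value
            path = tuple(position for _, position in frames)
            if matches(pattern, path):
                callback(path, value)


# Hand-written events for: {"users": [{"name": "ann", "age": 30}, {"name": "bob", "age": 25}]}
user_events = [
    ("start_map", None), ("map_key", "users"), ("start_array", None),
    ("start_map", None), ("map_key", "name"), ("value", "ann"),
    ("map_key", "age"), ("value", 30), ("end_map", None),
    ("start_map", None), ("map_key", "name"), ("value", "bob"),
    ("map_key", "age"), ("value", 25), ("end_map", None),
    ("end_array", None), ("end_map", None),
]
found = []
listen(user_events, ("users", "*", "name"), lambda path, value: found.append((path, value)))
```

The caller registers one pattern and receives only the matching values as they stream past; everything else is skipped without being kept in memory.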
The code is on GitHub here:
Answered by dscape
If you want to use pure JavaScript and a library that runs both in node.js and in the browser, you can try clarinet:
https://github.com/dscape/clarinet
The parser is event-based, and since it's streaming it makes dealing with huge files possible. The API is very close to sax and the code is forked from sax-js.
Answered by Tom Chapin
Here's a NodeJS NPM library for parsing and handling streams of JSON: https://npmjs.org/package/JSONStream
Answered by haridsv
If you are looking specifically for Python, then ijson claims to support it. However, it is only a parser, and I didn't come across anything for Python that can generate JSON as a stream.
For C++ there is rapidjson, which claims to support both parsing and generation in a streaming manner.
Answered by Agnel Kurian
LitJSON supports a streaming-style API. Quoting from the manual:
"An alternative interface to handling JSON data that might be familiar to some developers is through classes that make it possible to read and write data in a stream-like fashion. These classes are JsonReader and JsonWriter.
"These two types are in fact the foundation of this library, and the JsonMapper type is built on top of them, so in a way, the developer can think of the reader and writer classes as the low-level programming interface for LitJSON."
Answered by Pietro Battiston
For Python, an alternative to ijson (apparently lighter and more efficient) is jsaone (see that link for rough benchmarks, showing that jsaone is approximately 3x faster).
DISCLAIMER: I'm the author of jsaone, and the tests I made are very basic... I'll be happy to be proven wrong!
Answered by stefanB
Answering the question title: YAJL, a JSON parser library in C:
YAJL remembers all state required to support restarting parsing. This allows parsing to occur incrementally as data is read off a disk or network.
So I guess using yajl to parse JSON can be considered processing a stream of data.
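For a feel of what incremental parsing looks like from the caller's side, here is a small Python sketch using only the standard library's `json.JSONDecoder.raw_decode`. It is not how YAJL works internally: YAJL keeps real parser state between calls, whereas this hypothetical `IncrementalJsonReader` class just buffers the unparsed text and retries when more data arrives. It assumes the stream is a sequence of top-level objects or arrays; a bare top-level number split across two chunks would be misread.

```python
import json


class IncrementalJsonReader:
    """Accepts text in arbitrary chunks and hands back each completed value."""

    def __init__(self):
        self._decoder = json.JSONDecoder()
        self._buffer = ""  # text received but not yet parsed into a value

    def feed(self, chunk):
        """Add a chunk and return the list of values it completed."""
        self._buffer += chunk
        values = []
        while True:
            # raw_decode does not skip leading whitespace, so strip it first.
            text = self._buffer.lstrip()
            if not text:
                self._buffer = ""
                break
            try:
                value, end = self._decoder.raw_decode(text)
            except json.JSONDecodeError:
                # Incomplete (or invalid) input: keep it and wait for more data.
                self._buffer = text
                break
            values.append(value)
            self._buffer = text[end:]
        return values


reader = IncrementalJsonReader()
first = reader.feed('{"a": 1')       # object not finished yet
second = reader.feed('}{"b"')        # completes the first object
third = reader.feed(': [2, 3]} ')    # completes the second object
```

Because a failed attempt re-parses the whole buffer, this approach suits streams of small messages; for one huge document a true event-based parser is the better fit.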

