How to parse somewhat wrong JSON with Python?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1931454/

Date: 2020-11-03 23:23:11  Source: igfitidea

How to parse somewhat wrong JSON with Python?

python json google-app-engine

Asked by Serge Tarkovski

I have the following JSON string coming from an external input source:

{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}

This is a wrongly formatted JSON string ("id" and "value" must be in quotes), but I need to parse it anyway. I have tried simplejson and json-py, and it seems they cannot be set up to parse such strings.

I am running Python 2.5 on Google App Engine, so any C-based solutions like python-cjson are not applicable.

The input format could be changed to XML or YAML, in addition to the JSON shown above, but I am using JSON throughout the project, and changing the format in one specific place would not be very good.

For now I've switched to XML and am parsing the data successfully, but I am looking forward to any solution that would allow me to switch back to JSON.

Answered by mykhal

Since YAML (>=1.2) is a superset of JSON, you can do:

>>> import yaml
>>> s = '{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}'
>>> yaml.safe_load(s)  # safe_load: recent PyYAML requires a Loader for yaml.load, and the input is untrusted
{'id': 17893, 'value': '82363549923gnyh49c9djl239pjm01223'}

Answered by null

You can use demjson.

>>> import demjson
>>> demjson.decode('{foo:3}')
{u'foo': 3}

Answered by davidosomething

You could use a string parser to fix it first. A regex could do it, provided that this is as complicated as the JSON will get.
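As a rough sketch of that regex idea (not code from the original answer), the snippet below quotes bare object keys and then hands the result to a normal JSON parser. It assumes a modern Python with the standard `json` module; on the asker's Python 2.5 setup, `simplejson.loads` would presumably take the place of `json.loads`. The names `quote_bare_keys` and `loads_lenient` are made up for illustration, and the pattern only handles identifier-like keys. It does not fix single-quoted strings, trailing commas, or comments.

```python
import json
import re

# Match either a complete double-quoted string (group 1, copied through
# untouched) or a bare identifier that is followed by a colon (group 2),
# which is treated as an unquoted object key.
_TOKEN = re.compile(r'("(?:[^"\\]|\\.)*")|([A-Za-z_][A-Za-z0-9_]*)(?=\s*:)')


def quote_bare_keys(text):
    """Wrap unquoted object keys in double quotes (hypothetical helper)."""
    def fix(match):
        if match.group(1) is not None:
            return match.group(1)  # string literal: leave as-is
        return '"%s"' % match.group(2)  # bare key: add the missing quotes
    return _TOKEN.sub(fix, text)


def loads_lenient(text):
    """Repair bare keys, then parse with the ordinary JSON parser."""
    return json.loads(quote_bare_keys(text))


raw = '{value: "82363549923gnyh49c9djl239pjm01223", id: 17893}'
data = loads_lenient(raw)
print(data["id"])  # 17893
```

Matching whole string literals first is what keeps a value such as "x: y" from being mistaken for a key. If the input ever gets more irregular than unquoted keys, a real tolerant parser is likely the safer choice.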

Answered by PaulMcG

Pyparsing includes a JSON parser example; here is the online source. You could modify the definition of memberDef to allow a non-quoted string for the member name, and then use it to parse your not-quite-JSON source text.

The August 2008 issue of Python Magazine has much more detailed information about this parser. It shows some sample JSON and code that accesses the parsed results as if they were a deserialized object.