How to convert raw javascript object to python dictionary?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/24027589/

Time: 2020-10-28 02:01:43  Source: igfitidea

Tags: javascript, python, json, web-scraping

Asked by kev

When screen-scraping some website, I extract data from <script> tags.
The data I get is not in standard JSON format, so I cannot use json.loads().

# from
js_obj = '{x:1, y:2, z:3}'

# to
py_obj = {'x':1, 'y':2, 'z':3}

Currently, I use regex to transform the raw data to JSON format.
But this becomes painful when I encounter complicated data structures.

Is there a better solution?

Answered by kev

demjson.decode()

import demjson

# from
js_obj = '{x:1, y:2, z:3}'

# to
py_obj = demjson.decode(js_obj)

jsonnet.evaluate_snippet()

import json, _jsonnet

# from
js_obj = '{x:1, y:2, z:3}'

# to
py_obj = json.loads(_jsonnet.evaluate_snippet('snippet', js_obj))

ast.literal_eval()

import ast

# from
js_obj = "{'x':1, 'y':2, 'z':3}"

# to
py_obj = ast.literal_eval(js_obj)
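
Note that the ast.literal_eval() input above already has quoted keys: it parses Python literal syntax, not the unquoted form from the question. The following is only a sketch of trying the two standard-library parsers in order; the helper name parse_loose is hypothetical and not part of the original answer.

```python
import ast
import json


def parse_loose(text):
    """Try json.loads first, then ast.literal_eval (hypothetical helper)."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        pass
    # literal_eval raises ValueError or SyntaxError on input it cannot parse
    return ast.literal_eval(text)


print(parse_loose('{"x": 1, "y": 2}'))  # strict JSON
print(parse_loose("{'x': 1, 'y': 2}"))  # Python-style literal, single quotes

try:
    parse_loose('{x:1, y:2, z:3}')  # unquoted keys: neither parser accepts this
except (ValueError, SyntaxError) as exc:
    print('needs another approach:', type(exc).__name__)
```

For the unquoted-key form, one of the third-party decoders shown above (or a preprocessing step) is still needed.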

Answered by chrisb

This will likely not work everywhere, but as a start, here's a simple regex that should convert the keys into quoted strings so you can pass the result to json.loads. Or is this what you're already doing?

In[68] : import re

In[69] : js_obj = '{x:1, y:2, z:3}'

In[70] : quote_keys_regex = r'([\{\s,])(\w+)(:)'

In[71] : re.sub(quote_keys_regex, r'\1"\2"\3', js_obj)
Out[71]: '{"x":1, "y":2, "z":3}'

In[72] : js_obj_2 = '{x:1, y:2, z:{k:3,j:2}}'

In[73] : re.sub(quote_keys_regex, r'\1"\2"\3', js_obj_2)
Out[73]: '{"x":1, "y":2, "z":{"k":3,"j":2}}'
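
As a rough sketch, the key-quoting regex can be wrapped in a small function that hands the result straight to json.loads. The function name loads_unquoted_keys and the sample inputs are assumptions made for illustration, not part of the original answer.

```python
import json
import re

# A key is a run of word characters that follows '{', ',' or whitespace
# and is followed by ':'.
QUOTE_KEYS_REGEX = re.compile(r'([\{\s,])(\w+)(:)')


def loads_unquoted_keys(js_obj):
    """Quote bare keys, then parse as JSON (hypothetical helper name)."""
    return json.loads(QUOTE_KEYS_REGEX.sub(r'\1"\2"\3', js_obj))


print(loads_unquoted_keys('{x:1, y:2, z:{k:3,j:2}}'))

# Known limitation: the pattern cannot tell keys from text inside string
# values, so a value containing something like ' x:' gets corrupted.
try:
    loads_unquoted_keys('{note:"see x:1"}')
except ValueError as exc:
    print('regex approach failed:', exc)
```

This is why the answer says it will not work everywhere: a regex has no notion of being inside a string literal.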

Answered by thiagola92

Not including objects

json.loads()

  • json.loads() doesn't accept undefined; you have to change it to null
  • json.loads() only accepts double quotes
    • {"foo": 1, "bar": null}

Use this if you are sure that your javascript code only uses double quotes around key names.

import json
import re

json_text = """{"foo": 1, "bar": undefined}"""
# The replacement must be a raw string, otherwise '\1' is the control character \x01
json_text = re.sub(r'("\s*:\s*)undefined(\s*[,}])', r'\1null\2', json_text)

py_obj = json.loads(json_text)

ast.literal_eval()

  • ast.literal_eval() doesn't accept undefined; you have to change it to None
  • ast.literal_eval() doesn't accept null; you have to change it to None
  • ast.literal_eval() doesn't accept true; you have to change it to True
  • ast.literal_eval() doesn't accept false; you have to change it to False
  • ast.literal_eval() accepts single and double quotes
    • {"foo": 1, "bar": None} or {'foo': 1, 'bar': None}

import ast
import re

js_obj = """{'foo': 1, 'bar': undefined}"""
# Replacements are raw strings so that \1 and \2 are group references
js_obj = re.sub(r'([\'\"]\s*:\s*)undefined(\s*[,}])', r'\1None\2', js_obj)
js_obj = re.sub(r'([\'\"]\s*:\s*)null(\s*[,}])', r'\1None\2', js_obj)
js_obj = re.sub(r'([\'\"]\s*:\s*)NaN(\s*[,}])', r'\1None\2', js_obj)
js_obj = re.sub(r'([\'\"]\s*:\s*)true(\s*[,}])', r'\1True\2', js_obj)
js_obj = re.sub(r'([\'\"]\s*:\s*)false(\s*[,}])', r'\1False\2', js_obj)

py_obj = ast.literal_eval(js_obj)
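
The repeated substitutions can also be written as one table-driven pass. This is only a sketch under the same assumptions as the answer (quoted keys, literals appearing directly as object values); the names JS_TO_PY, VALUE_REGEX and js_literals_to_python are hypothetical.

```python
import ast
import re

# JavaScript literal -> Python literal, following the list above.
JS_TO_PY = {'undefined': 'None', 'null': 'None', 'NaN': 'None',
            'true': 'True', 'false': 'False'}

# Only rewrite a literal that sits directly after a quoted key and a colon,
# and directly before ',' or '}'.
VALUE_REGEX = re.compile(
    r'([\'"]\s*:\s*)(' + '|'.join(JS_TO_PY) + r')(\s*[,}])')


def js_literals_to_python(js_obj):
    """Hypothetical helper: swap JS-only literals, then literal_eval."""
    py_text = VALUE_REGEX.sub(
        lambda m: m.group(1) + JS_TO_PY[m.group(2)] + m.group(3), js_obj)
    return ast.literal_eval(py_text)


print(js_literals_to_python(
    "{'foo': 1, 'bar': undefined, 'ok': true, 'n': null}"))
```

Literals inside arrays (for example [true, false]) are not matched by this pattern, which is the same limitation the individual substitutions have.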

Answered by Chris Billington

If you have node available on the system, you can ask it to evaluate the javascript expression for you and print the stringified result. The resulting JSON can then be fed to json.loads:

import json
from subprocess import PIPE, Popen


def evaluate_javascript(s):
    """Evaluate and stringify a javascript expression in node.js, and convert the
    resulting JSON to a Python object"""
    # 'node -' reads the script to run from stdin
    node = Popen(['node', '-'], stdin=PIPE, stdout=PIPE)
    stdout, _ = node.communicate(f'console.log(JSON.stringify({s}))'.encode('utf8'))
    return json.loads(stdout.decode('utf8'))

Answered by clw

Simply:

import json
py_obj = json.loads(js_obj_stringified)

Above is the Python portion of the code. In the javascript portion of the code:

js_obj_stringified = JSON.stringify(data);

JSON.stringify turns a Javascript object into JSON text and stores that JSON text in a string. It is a safe way to pass (via POST/GET) a javascript object to Python for processing.
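
To illustrate the Python side, here is a small sketch of what json.loads returns for stringified text. The sample object is made up for illustration; the string is what JSON.stringify would produce for {name: "kev", tags: ["a", "b"], active: true, score: null}.

```python
import json

# Text as produced on the javascript side by JSON.stringify(data)
# (sample data is an assumption, not from the original answer).
js_obj_stringified = '{"name":"kev","tags":["a","b"],"active":true,"score":null}'

py_obj = json.loads(js_obj_stringified)

print(py_obj['tags'])    # JSON array becomes a Python list
print(py_obj['active'])  # JSON true becomes Python True
print(py_obj['score'])   # JSON null becomes Python None
```

Keep in mind that JSON.stringify omits object properties whose value is undefined, so such keys will simply be missing from the resulting dictionary.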