Original question: http://stackoverflow.com/questions/23049767/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Parsing HTTP Response in Python
Asked by Colton Allen
I want to manipulate the information at this URL. I can successfully open it and read its contents, but what I really want to do is throw out everything I don't need and manipulate what I want to keep.
Is there a way to convert the string into a dict so I can iterate over it? Or do I just have to parse it as is (str type)?
from urllib.request import urlopen
url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = urlopen(url)
print(response.read()) # returns string with info
Answered by Colton Allen
When I printed response.read(), I noticed that b was prepended to the string (e.g. b'{"a":1,..). The b prefix stands for bytes and indicates the type of the object you're handling. Since I knew that a string could be converted to a dict with json.loads('string'), I just had to convert the bytes to a string. I did this by decoding the response as UTF-8 with decode('utf-8'). Once it was a string, my problem was solved and I could easily iterate over the dict.
I don't know if this is the fastest or most 'pythonic' way of writing this, but it works, and there's always time later for optimization and improvement! Full code for my solution:
from urllib.request import urlopen
import json
# Get the dataset
url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = urlopen(url)
# Convert bytes to string type and string type to dict
string = response.read().decode('utf-8')
json_obj = json.loads(string)
print(json_obj['source_name']) # prints the string with 'source_name' key
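Once the body has been turned into a dict, the iteration the question asks about is ordinary dictionary handling. The following is a minimal offline sketch; the bytes literal is a made-up stand-in for response.read() (it is not real API output), and the names raw, data, wanted and filtered are only illustrative.

```python
import json

# Hypothetical payload standing in for response.read(); not real API output.
raw = b'{"source_name": "Example Source", "column_names": ["Date", "Value"]}'

text = raw.decode('utf-8')   # bytes -> str
data = json.loads(text)      # str -> dict

# Iterate over the top-level keys and values.
for key, value in data.items():
    print(key, '->', value)

# Keep only the keys of interest and drop everything else.
wanted = {'source_name'}
filtered = {k: v for k, v in data.items() if k in wanted}
```

The dict comprehension is one way to "throw out" unwanted keys; deleting keys in place with del would also work.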
Answered by Shaurya Mittal
You can also use Python's requests library instead.
import requests
url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = requests.get(url)
data = response.json()  # named data rather than dict to avoid shadowing the built-in
Now you can manipulate data like a Python dictionary.
Answered by jfs
json works with Unicode text in Python 3 (the JSON format itself is defined only in terms of Unicode text), and therefore you need to decode the bytes received in the HTTP response. r.headers.get_content_charset('utf-8') gets you the character encoding:
#!/usr/bin/env python3
import io
import json
from urllib.request import urlopen
with urlopen('https://httpbin.org/get') as r, \
     io.TextIOWrapper(r, encoding=r.headers.get_content_charset('utf-8')) as file:
    result = json.load(file)
print(result['headers']['User-Agent'])
It is not necessary to use io.TextIOWrapper here:
#!/usr/bin/env python3
import json
from urllib.request import urlopen
with urlopen('https://httpbin.org/get') as r:
    result = json.loads(r.read().decode(r.headers.get_content_charset('utf-8')))
print(result['headers']['User-Agent'])
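To see what get_content_charset('utf-8') does without making a network request, here is a small offline sketch. It relies on the fact that the headers object returned by urlopen is an email.message.Message subclass, so a plain Message can stand in for it. The helper name decode_json_body and the sample header and body are assumptions made up for illustration.

```python
import json
from email.message import Message


def decode_json_body(body, headers):
    """Decode a JSON body using the charset from Content-Type, defaulting to UTF-8."""
    charset = headers.get_content_charset('utf-8')
    return json.loads(body.decode(charset))


# Case 1: the (hypothetical) server declares a charset explicitly.
declared = Message()
declared['Content-Type'] = 'application/json; charset=latin-1'
result_declared = decode_json_body('{"city": "Zürich"}'.encode('latin-1'), declared)

# Case 2: no charset parameter, so the 'utf-8' fallback is used.
undeclared = Message()
undeclared['Content-Type'] = 'application/json'
result_default = decode_json_body('{"city": "Zürich"}'.encode('utf-8'), undeclared)
```

The argument passed to get_content_charset is only a fallback: it is returned when the Content-Type header carries no charset parameter.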
Answered by Ajay Gautam
I guess things have changed in Python 3.4. This worked for me:
print("resp:" + json.dumps(resp.json()))
Answered by FlyingV
TL;DR: When you get data from a server, it typically arrives as bytes. The rationale is that these bytes need to be 'decoded' by the recipient, who should know how to use the data. Decode the bytes on arrival so that you get a string rather than a b'...' (bytes) object.
Use case:
import requests
def get_data_from_url(url):
    # Fetch the URL, decode the raw bytes as UTF-8, and return a list of lines
    response = requests.get(url)
    response_data_split_by_line = response.content.decode('utf-8').splitlines()
    return response_data_split_by_line
In this example, I decode the content I received as UTF-8. For my purposes, I then split it by line, so I can loop through each line with a for loop.
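The decode-then-split step can be tried without a server. The sketch below uses a made-up bytes literal in place of response.content (the sample rows are invented), and shows that splitlines() handles both \n and \r\n line endings.

```python
# Hypothetical bytes standing in for response.content; the rows are invented.
raw = b"name,value\nalpha,1\r\nbeta,2\n"

# Decode the bytes as UTF-8, then split into lines (line endings are dropped).
lines = raw.decode('utf-8').splitlines()

# Loop over each line and split it into fields.
rows = []
for line in lines:
    rows.append(line.split(','))
```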

