Python Beautiful Soup how to JSON decode to `dict`?
Source: http://stackoverflow.com/questions/19915010/ (StackOverflow, provided under the CC BY-SA 4.0 license; you are free to use and share it, but you must attribute it to the original authors)
Asked by JPC
I'm new to BeautifulSoup in Python and I'm trying to extract a `dict` from BeautifulSoup.
I've used BeautifulSoup to extract JSON and got a `BeautifulSoup.BeautifulSoup` variable, `soup`.
I'm trying to get values out of `soup`, but when I do `result = soup.findAll("bill")` I get an empty list `[]`. How can I extract from `soup` to get a `dict` result of:
{u'congress': 113,
u'number': 325,
u'title': u'A bill to ensure the complete and timely payment of the obligations of the United States Government until May 19, 2013, and for other purposes.',
u'type': u'hr'}
print type(soup)
print soup
=> result below
BeautifulSoup.BeautifulSoup
{
  "bill": {
    "congress": 113,
    "number": 325,
    "title": "A bill to ensure the complete and timely payment of the obligations of the United States Government until May 19, 2013, and for other purposes.",
    "type": "hr"
  },
  "category": "passage",
  "chamber": "s"
}
UPDATE
Here is how I got `soup`:
from BeautifulSoup import BeautifulSoup
import urllib2
url = urllib2.urlopen("https://www.govtrack.us/data/congress/113/votes/2013/s11/data.json")
content = url.read()
soup = BeautifulSoup(content)
Accepted answer by Rudy Bunel
I'm not very familiar with BeautifulSoup, but if you just need to decode JSON:
import json
newDictionary = json.loads(str(soup))
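The empty list from `soup.findAll("bill")` is most likely because the response is JSON text rather than markup, so there is no `<bill>` tag for an HTML parser to find. A minimal sketch of the decoding step, assuming Python 3 and using the JSON text shown in the question as a stand-in for the raw response body (the `content` variable in the question):

```python
import json

# Stand-in for the raw response body; copied from the output in the question.
content = """
{
  "bill": {
    "congress": 113,
    "number": 325,
    "title": "A bill to ensure the complete and timely payment of the obligations of the United States Government until May 19, 2013, and for other purposes.",
    "type": "hr"
  },
  "category": "passage",
  "chamber": "s"
}
"""

# json.loads turns the JSON text into ordinary Python objects.
data = json.loads(content)

# "bill" is a key of the top-level dict, not a tag, so index into it directly.
bill = data["bill"]
print(bill["congress"], bill["number"], bill["type"])
```

Decoding `content` directly, rather than `str(soup)`, avoids passing the text through an HTML parser first, which is not guaranteed to leave it unchanged.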
Answer by jfs
You could remove `BeautifulSoup`:
import json
import urllib2
url = "https://www.govtrack.us/data/congress/113/votes/2013/s11/data.json"
data = json.load(urllib2.urlopen(url))
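The snippet above targets Python 2, where `urllib2` exists. A rough Python 3 equivalent might look like the sketch below; `fetch_json` and its `timeout` default are hypothetical names and values chosen for illustration, not part of the original answer.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Open `url` and decode the response body as JSON.

    Hypothetical helper for illustration; assumes the server returns
    valid JSON in an encoding that json.load can detect (UTF-8/16/32).
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # json.load reads from the file-like response object.
        return json.load(response)


# Usage with the URL from the question (requires network access):
# data = fetch_json("https://www.govtrack.us/data/congress/113/votes/2013/s11/data.json")
# print(data["bill"])
```

The result is a plain `dict`, so `data["bill"]` gives the nested dictionary the question asks for without involving BeautifulSoup at all.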