Original question: http://stackoverflow.com/questions/33752819/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
pandas.DataFrame.from_dict not preserving order using OrderedDict
Asked by dkapitan
I want to import OData XML data feeds from the Dutch Bureau of Statistics (CBS) into our database. Using lxml and pandas, I thought this would be straightforward. By using OrderedDict I want to preserve the order of the columns for readability, but somehow I can't get it right.
from collections import OrderedDict
from lxml import etree
import requests
import pandas as pd
# CBS URLs
base_url = 'http://opendata.cbs.nl/ODataFeed/odata'
datasets = ['/37296ned', '/82245NED']
feed = requests.get(base_url + datasets[1] + '/TypedDataSet')
root = etree.fromstring(feed.content)
# all record entries start at tag m:properties, parse into data dict
data = []
for record in root.iter('{{{}}}properties'.format(root.nsmap['m'])):
    row = OrderedDict()
    for element in record:
        row[element.tag.split('}')[1]] = element.text
    data.append(row)
df = pd.DataFrame.from_dict(data)
df.columns
Inspecting data, each OrderedDict is in the right order. But looking at df.head(), the columns have been sorted alphabetically, with capitals first?
Help, anyone?
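For readers without access to the CBS feed, the parsing loop above can be tried offline. The following is a minimal sketch, not the asker's code: it swaps lxml for the standard-library xml.etree.ElementTree, and it assumes a small hand-written XML snippet and the namespace URIs shown (all of these are illustrative assumptions, not taken from the actual feed).

```python
from collections import OrderedDict
import xml.etree.ElementTree as ET

# Assumed namespace URIs for the m: and d: prefixes (not read from the real feed).
M_NS = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
D_NS = "http://schemas.microsoft.com/ado/2007/08/dataservices"

# Hypothetical two-record sample standing in for the CBS response.
SAMPLE = """<feed xmlns:m="{m}" xmlns:d="{d}">
  <entry><content><m:properties>
    <d:Perioden>2014JJ00</d:Perioden><d:aantal>12</d:aantal><d:ID>0</d:ID>
  </m:properties></content></entry>
  <entry><content><m:properties>
    <d:Perioden>2015JJ00</d:Perioden><d:aantal>15</d:aantal><d:ID>1</d:ID>
  </m:properties></content></entry>
</feed>""".format(m=M_NS, d=D_NS)

root = ET.fromstring(SAMPLE)

# Same loop as in the question: one OrderedDict per m:properties element.
data = []
for record in root.iter("{{{}}}properties".format(M_NS)):
    row = OrderedDict()
    for element in record:
        # Strip the "{namespace}" prefix from the tag to get the field name.
        row[element.tag.split("}")[1]] = element.text
    data.append(row)

# Each row keeps the element order found in the XML.
print([list(row.keys()) for row in data])
```

Here data is a list with one OrderedDict per record, which is the structure the answers below work with.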
Accepted answer by chris-sc
Something in your example seems to be inconsistent, as data is a list and not a dict, but assuming you really have an OrderedDict:
Try to explicitly specify your column order when you create your DataFrame:
# ... all your data collection
df = pd.DataFrame(data, columns=data.keys())
This should give you a DataFrame with the columns in exactly the order they have in the OrderedDict (via the list generated by data.keys()).
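Because data in the question is a list of OrderedDicts rather than a single dict, data.keys() is not available there. A minimal sketch of the same idea for that case, assuming every row has the same keys (the sample rows are hypothetical stand-ins for the parsed records):

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical sample rows; the key order is deliberately not alphabetical.
data = [
    OrderedDict([("Perioden", "2014JJ00"), ("aantal", "12"), ("ID", "0")]),
    OrderedDict([("Perioden", "2015JJ00"), ("aantal", "15"), ("ID", "1")]),
]

# Take the column order from the first row and pass it to the constructor.
columns = list(data[0].keys())
df = pd.DataFrame(data, columns=columns)

print(list(df.columns))
```

Passing columns explicitly means the result does not depend on how pandas orders the keys by default.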
Answer by Daniel Wu
The above answer doesn't work for me and keeps giving me "ValueError: cannot use columns parameter with orient='columns'".
Later I found that the following works, where dict_data is the list of dicts:
df = pd.DataFrame.from_dict(dict_data)[list(dict_data[0].keys())]
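Taking the keys from the first row assumes every row has the same fields. If some rows carry extra fields, one possible variant is to collect the keys of all rows in first-seen order and pass them to the DataFrame constructor, which accepts columns for a list of dicts. The sample rows below are hypothetical:

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical rows; the second one has an extra "ID" field.
dict_data = [
    OrderedDict([("Perioden", "2014JJ00"), ("aantal", "12")]),
    OrderedDict([("Perioden", "2015JJ00"), ("ID", "1"), ("aantal", "15")]),
]

# OrderedDict.fromkeys keeps first-seen order and drops repeated keys.
columns = list(OrderedDict.fromkeys(key for row in dict_data for key in row))

# Fields missing from a row end up as NaN in that row.
df = pd.DataFrame(dict_data, columns=columns)

print(list(df.columns))
```

With this approach a field that first appears in a later row is placed after the fields already seen, rather than being dropped.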