Python: pandas.DataFrame.from_dict not preserving order using OrderedDict

Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/33752819/

Date: 2020-08-19 13:55:24, source: igfitidea

Tags: python, pandas, python-collections

Asked by dkapitan

I want to import OData XML data feeds from the Dutch Bureau of Statistics (CBS) into our database. Using lxml and pandas, I thought this should be straightforward. By using OrderedDict I want to preserve the order of the columns for readability, but somehow I can't get it right.

from collections import OrderedDict
from lxml import etree
import requests
import pandas as pd


# CBS URLs
base_url = 'http://opendata.cbs.nl/ODataFeed/odata'
datasets = ['/37296ned', '/82245NED']

feed = requests.get(base_url + datasets[1] + '/TypedDataSet')
root = etree.fromstring(feed.content)

# all record entries start at tag m:properties, parse into data dict
data = []
for record in root.iter('{{{}}}properties'.format(root.nsmap['m'])):
    row = OrderedDict()
    for element in record:
        row[element.tag.split('}')[1]] = element.text
    data.append(row)

df = pd.DataFrame.from_dict(data)
df.columns

Inspecting data, the OrderedDict is in the right order. But looking at df.head(), the columns have been sorted alphabetically, with capitals first?

Help, anyone?

Accepted answer by chris-sc

Something in your example seems inconsistent, as data is a list and not a dict, but assuming you really have an OrderedDict:

Try to explicitly specify your column order when you create your DataFrame:

# ... all your data collection
df = pd.DataFrame(data, columns=data.keys())

This should give you a DataFrame with the columns ordered exactly as they are in the OrderedDict (via the list generated by data.keys()).
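
In the question, data is actually a list of OrderedDict rows, so data.keys() is not available there. A minimal sketch of the same idea for that shape, taking the column order from the first row; the sample rows and values below are made up for illustration and are not real CBS data:

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical rows standing in for the parsed CBS records.
data = [
    OrderedDict([("ID", "0"), ("Perioden", "2010JJ00"), ("Bevolking", "100")]),
    OrderedDict([("ID", "1"), ("Perioden", "2011JJ00"), ("Bevolking", "105")]),
]

# Take the column order from the first row and pass it explicitly,
# so the result does not depend on how pandas orders the keys itself.
column_order = list(data[0].keys())
df = pd.DataFrame(data, columns=column_order)

print(df.columns.tolist())
```

This assumes every row has the same keys as the first one; keys that only appear in later rows would be dropped.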

Answer by Daniel Wu

The above answer doesn't work for me and keeps giving me "ValueError: cannot use columns parameter with orient='columns'".

Later I found a solution that worked:

df = pd.DataFrame.from_dict(dict_data)[list(dict_data[0].keys())]
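
The ValueError appears when a columns argument is passed to DataFrame.from_dict while orient is left at its default of 'columns'. A rough sketch of that behaviour and of the reordering approach above, extended to rows whose keys differ; dict_data and its values are hypothetical, and the first-seen key order relies on dicts preserving insertion order (Python 3.7+):

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical list of row dicts; the second row has an extra key.
dict_data = [
    OrderedDict([("ID", "0"), ("Perioden", "2010JJ00")]),
    OrderedDict([("ID", "1"), ("Perioden", "2011JJ00"), ("Bevolking", "105")]),
]

# from_dict with the default orient='columns' rejects a columns argument.
try:
    pd.DataFrame.from_dict(dict_data, columns=["ID", "Perioden"])
    raised = False
except ValueError:
    raised = True

# Collect every key across all rows, keeping first-seen order.
column_order = list(dict.fromkeys(key for row in dict_data for key in row))

# Build the frame first, then reorder the columns by selection.
df = pd.DataFrame.from_dict(dict_data)[column_order]

print(raised, df.columns.tolist())
```

Collecting keys across all rows, rather than only from dict_data[0], keeps columns that are missing from the first record.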