Original question: http://stackoverflow.com/questions/33504424/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Pandas DataFrame from Dictionary with Lists
Asked by Conway
I have an API that returns a single row of data as a Python dictionary. Most of the keys have a single value, but some of the keys have values that are lists (or even lists-of-lists or lists-of-dictionaries).
When I pass the dictionary to pd.DataFrame to convert it to a pandas DataFrame, it raises an "Arrays must be the same length" error. This is because it cannot process the keys which have multiple values (i.e. the keys whose values are lists).
How do I get pandas to treat the lists as 'single values'?
As a hypothetical example:
data = { 'building': 'White House', 'DC?': True, 'occupants': ['Barack', 'Michelle', 'Sasha', 'Malia'] }
I want to turn it into a DataFrame like this:
ix building DC? occupants
0 'White House' True ['Barack', 'Michelle', 'Sasha', 'Malia']
Answered by Andy Hayden
This works if you pass a list (of rows):
In [11]: pd.DataFrame(data)
Out[11]:
DC? building occupants
0 True White House Barack
1 True White House Michelle
2 True White House Sasha
3 True White House Malia
In [12]: pd.DataFrame([data])
Out[12]:
DC? building occupants
0 True White House [Barack, Michelle, Sasha, Malia]
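As a rough sketch of why wrapping the dictionary helps: a bare dict is read column by column, so list values are treated as column data and must agree in length, while a list containing one dict is read as one record per row, so each list stays in a single cell. The 'rooms' key below is a hypothetical addition (not in the question) used only to give two lists of different lengths.

```python
import pandas as pd

# Hypothetical row: 'rooms' is added for illustration only, so that
# two list values have different lengths.
data = {
    "building": "White House",
    "DC?": True,
    "occupants": ["Barack", "Michelle", "Sasha", "Malia"],
    "rooms": ["Oval Office", "East Room"],
}

# Passing the dict directly treats each list as a column, so lists of
# unequal length raise a ValueError.
try:
    pd.DataFrame(data)
    raised = False
except ValueError:
    raised = True

# Wrapping the dict in a list makes it a single record (one row),
# and every list is kept as one cell value.
df = pd.DataFrame([data])
print(raised)
print(df.shape)
print(df.loc[0, "occupants"])
```

The exact wording of the error message varies between pandas versions, which is why the sketch only checks for the exception type.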
Answered by Chinmay Kanchi
This turns out to be trivial in the end:
import pandas

data = { 'building': 'White House', 'DC?': True, 'occupants': ['Barack', 'Michelle', 'Sasha', 'Malia'] }
# Wrap the dict in a list so it is treated as a single row
df = pandas.DataFrame([data])
print(df)
Which results in:
DC? building occupants
0 True White House [Barack, Michelle, Sasha, Malia]
Answered by Tommy Kahn
Would it be acceptable if, instead of having one entry with a list of occupants, you had an individual entry for each occupant? If so, you could just do:
n = len(data['occupants'])
# Repeat every scalar value n times so all columns have the same length
for key, val in data.items():
    if key != 'occupants':
        data[key] = n*[val]
EDIT: Actually, I'm getting this behavior in pandas (i.e. just with pd.DataFrame(data)) even without this pre-processing. What version are you using?
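If one row per occupant is acceptable, a possible alternative to the manual loop is to build the single-row frame first and then expand the list column. This is only a sketch, assuming a pandas version that provides DataFrame.explode (0.25 or later); the variable names `wide` and `long` are made up here.

```python
import pandas as pd

data = {
    "building": "White House",
    "DC?": True,
    "occupants": ["Barack", "Michelle", "Sasha", "Malia"],
}

# One row, with the whole list stored in the 'occupants' cell.
wide = pd.DataFrame([data])

# One row per list element; the scalar columns are repeated.
# reset_index gives the rows a fresh 0..n-1 index.
long = wide.explode("occupants").reset_index(drop=True)
print(long)
```

Unlike the loop above, this leaves the original `data` dictionary unmodified.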
Answered by AbtPst
If you know the keys of the dictionary beforehand, why not first create an empty data frame and then keep adding rows?
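One way this idea might look in practice is sketched below. Note that it is a variation rather than a literal "empty frame plus appends": DataFrame.append was removed in pandas 2.0, so the sketch collects the row dictionaries in a plain list and builds the frame once, using the known keys as columns. The `fetch_row` function is a hypothetical stand-in for the API call and its values are placeholders.

```python
import pandas as pd

def fetch_row(i):
    """Hypothetical stand-in for the API: returns one row as a dict."""
    return {
        "building": f"Building {i}",
        "DC?": True,
        "occupants": [f"person{i}a", f"person{i}b"],
    }

# Keys known beforehand, used to fix the column order.
columns = ["building", "DC?", "occupants"]

# Collect one dict per row, then build the DataFrame in a single call.
rows = [fetch_row(i) for i in range(3)]
df = pd.DataFrame(rows, columns=columns)
print(df)
```

Building the frame once from a list of dicts also avoids copying the whole frame on every added row.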