Original question: http://stackoverflow.com/questions/42275521/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Pandas: create a dictionary with a list of columns as values
Asked by FaCoffee
Given this DataFrame:
import pandas as pd
first=[0,1,2,3,4]
second=[10.2,5.7,7.4,17.1,86.11]
third=['a','b','c','d','e']
fourth=['z','zz','zzz','zzzz','zzzzz']
df=pd.DataFrame({'first':first,'second':second,'third':third,'fourth':fourth})
df=df[['first','second','third','fourth']]
   first  second third fourth
0      0   10.20     a      z
1      1    5.70     b     zz
2      2    7.40     c    zzz
3      3   17.10     d   zzzz
4      4   86.11     e  zzzzz
I can create a dictionary out of df using:
a=df.set_index('first')['second'].to_dict()
so that I can decide which column supplies the keys and which supplies the values. But what if you want the values to be a list of columns, such as second AND third?
If I try this:
b=df.set_index('first')[['second','third']].to_dict()
I get a weird dictionary of dictionaries:
{'second': {0: 10.199999999999999,
1: 5.7000000000000002,
2: 7.4000000000000004,
3: 17.100000000000001,
4: 86.109999999999999},
'third': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e'}}
Instead, I want a dictionary of lists:
{0: [10.199999999999999, 'a'],
 1: [5.7000000000000002, 'b'],
 2: [7.4000000000000004, 'c'],
 3: [17.100000000000001, 'd'],
 4: [86.109999999999999, 'e']}
How should I deal with this?
Accepted answer by blacksite
Someone else can probably chime in with a pure-pandas solution, but in a pinch I think this ought to work for you. You basically build the dictionary on the fly with a comprehension, looking up the key and the values row by row.
d = {df.loc[idx, 'first']: [df.loc[idx, 'second'], df.loc[idx, 'third']] for idx in range(df.shape[0])}
d
Out[5]:
{0: [10.199999999999999, 'a'],
1: [5.7000000000000002, 'b'],
2: [7.4000000000000004, 'c'],
3: [17.100000000000001, 'd'],
4: [86.109999999999999, 'e']}
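The comprehension above looks up each row with df.loc[idx, ...] for idx in range(df.shape[0]), which assumes the DataFrame still has its default 0..n-1 index. A minimal sketch of a variant that does not depend on the index, zipping the three columns directly (the sample data is rebuilt here so the snippet runs on its own):

```python
import pandas as pd

# Rebuild the sample data from the question so the snippet is self-contained.
df = pd.DataFrame({
    'first': [0, 1, 2, 3, 4],
    'second': [10.2, 5.7, 7.4, 17.1, 86.11],
    'third': ['a', 'b', 'c', 'd', 'e'],
    'fourth': ['z', 'zz', 'zzz', 'zzzz', 'zzzzz'],
})

# Walk the three columns in parallel: 'first' supplies the key,
# 'second' and 'third' supply the two-element list stored as the value.
d = {key: [sec, thi]
     for key, sec, thi in zip(df['first'], df['second'], df['third'])}

print(d)
```

Because the columns are iterated positionally, this should give the same mapping even if the DataFrame has been filtered, sorted, or re-indexed beforehand.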
Edit: You could also do this:
df['new'] = list(zip(df['second'], df['third']))
df
Out[25]:
   first  second third fourth         new
0      0   10.20     a      z   (10.2, a)
1      1    5.70     b     zz    (5.7, b)
2      2    7.40     c    zzz    (7.4, c)
3      3   17.10     d   zzzz   (17.1, d)
4      4   86.11     e  zzzzz  (86.11, e)
df = df[['first', 'new']]
df
Out[27]:
   first         new
0      0   (10.2, a)
1      1    (5.7, b)
2      2    (7.4, c)
3      3   (17.1, d)
4      4  (86.11, e)
df.set_index('first').to_dict()
Out[28]:
{'new': {0: (10.199999999999999, 'a'),
1: (5.7000000000000002, 'b'),
2: (7.4000000000000004, 'c'),
3: (17.100000000000001, 'd'),
4: (86.109999999999999, 'e')}}
In this approach, you first create the list (or tuple) you want to keep and then "drop" the other columns. This is basically your original approach, modified.
And if you really want lists instead of tuples, just map the list type onto that 'new' column:
df['new'] = list(map(list, zip(df['second'], df['third'])))
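Note that calling to_dict() on the whole indexed DataFrame nests the result under the 'new' key, as in the output shown earlier. A small sketch, assuming you want the flat {key: list} mapping from the question, is to select the 'new' column as a Series before converting (the variable names pairs and result are only illustrative):

```python
import pandas as pd

# Rebuild the relevant columns from the question so the snippet is self-contained.
df = pd.DataFrame({
    'first': [0, 1, 2, 3, 4],
    'second': [10.2, 5.7, 7.4, 17.1, 86.11],
    'third': ['a', 'b', 'c', 'd', 'e'],
})

# One list per row, built from the two value columns.
pairs = list(map(list, zip(df['second'], df['third'])))
df['new'] = pairs

# Selecting the single 'new' column yields a Series, and Series.to_dict()
# returns a flat {index: value} mapping instead of {'new': {...}}.
result = df.set_index('first')['new'].to_dict()

print(result)
```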
Answer by jezrael
Answer by EdChum
You can zip the values:
In [118]:
b=df.set_index('first')[['second','third']].values.tolist()
dict(zip(df['first'],b))
Out[118]:
{0: [10.2, 'a'], 1: [5.7, 'b'], 2: [7.4, 'c'], 3: [17.1, 'd'], 4: [86.11, 'e']}
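For comparison, here is one possible pure-pandas variant, offered as a sketch rather than as any answerer's method. It relies on DataFrame.to_dict(orient='list'), which maps each column label to a list of that column's values. Transposing first turns the 'first' values into column labels. It assumes the values in 'first' are unique, since duplicate labels cannot be represented as dictionary keys.

```python
import pandas as pd

# Rebuild the sample data from the question so the snippet is self-contained.
df = pd.DataFrame({
    'first': [0, 1, 2, 3, 4],
    'second': [10.2, 5.7, 7.4, 17.1, 86.11],
    'third': ['a', 'b', 'c', 'd', 'e'],
    'fourth': ['z', 'zz', 'zzz', 'zzzz', 'zzzzz'],
})

# After set_index + transpose, each column is labelled by a 'first' value
# and holds that row's 'second' and 'third' entries.
wide = df.set_index('first')[['second', 'third']].T

# orient='list' gives {column label: [values in that column]}.
by_key = wide.to_dict(orient='list')

print(by_key)
```

Transposing a frame with mixed numeric and string columns converts it to object dtype, so this is a convenience for small frames rather than a fast path.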