Pandas: create a dictionary with a list of columns as values

Question

Asked by FaCoffee

Given this DataFrame:

import pandas as pd
first=[0,1,2,3,4]
second=[10.2,5.7,7.4,17.1,86.11]
third=['a','b','c','d','e']
fourth=['z','zz','zzz','zzzz','zzzzz']
df=pd.DataFrame({'first':first,'second':second,'third':third,'fourth':fourth})
df=df[['first','second','third','fourth']]

   first  second third fourth
0      0   10.20     a      z
1      1    5.70     b     zz
2      2    7.40     c    zzz
3      3   17.10     d   zzzz
4      4   86.11     e  zzzzz

I can create a dictionary out of df using:

a=df.set_index('first')['second'].to_dict()

so that I can decide which column supplies the keys and which supplies the values. But what if I want the values to be a list drawn from several columns, such as second and third?

If I try this:

b=df.set_index('first')[['second','third']].to_dict()

I get a weird dictionary of dictionaries:

{'second': {0: 10.199999999999999,
  1: 5.7000000000000002,
  2: 7.4000000000000004,
  3: 17.100000000000001,
  4: 86.109999999999999},
 'third': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e'}}

Instead, I want a dictionary of lists:

{0: [10.199999999999999, 'a'],
 1: [5.7000000000000002, 'b'],
 2: [7.4000000000000004, 'c'],
 3: [17.100000000000001, 'd'],
 4: [86.109999999999999, 'e']}

How can I get this?

Answer 1

Accepted answer by blacksite

Someone else can probably chime in with a pure-pandas solution, but in a pinch I think this ought to work for you. You basically build the dictionary on the fly with a comprehension, looking up the values in each row by its index label. Note that this relies on df having the default 0..n-1 index, as in your example.

d = {df.loc[idx, 'first']: [df.loc[idx, 'second'], df.loc[idx, 'third']] for idx in range(df.shape[0])}

d
Out[5]: 
{0: [10.199999999999999, 'a'],
 1: [5.7000000000000002, 'b'],
 2: [7.4000000000000004, 'c'],
 3: [17.100000000000001, 'd'],
 4: [86.109999999999999, 'e']}
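
For what it's worth, one pure-pandas route seems to be transposing the selected columns and asking to_dict for lists. The following is only a sketch that rebuilds the question's data; the variable names are illustrative, and because the transpose goes through object dtype on mixed columns, it is probably not the fastest option on a large frame.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'first': [0, 1, 2, 3, 4],
    'second': [10.2, 5.7, 7.4, 17.1, 86.11],
    'third': ['a', 'b', 'c', 'd', 'e'],
    'fourth': ['z', 'zz', 'zzz', 'zzzz', 'zzzzz'],
})

# After the transpose, each value of 'first' is a column label,
# so orient='list' produces one [second, third] list per key.
d = df.set_index('first')[['second', 'third']].T.to_dict(orient='list')
print(d)
```

This does not depend on the row labels being 0..n-1, since the keys come from the 'first' column itself.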

Edit: You could also do this:

df['new'] = list(zip(df['second'], df['third']))

df
Out[25]: 
   first  second third fourth         new
0      0   10.20     a      z   (10.2, a)
1      1    5.70     b     zz    (5.7, b)
2      2    7.40     c    zzz    (7.4, c)
3      3   17.10     d   zzzz   (17.1, d)
4      4   86.11     e  zzzzz  (86.11, e)

df = df[['first', 'new']]

df
Out[27]: 
   first         new
0      0   (10.2, a)
1      1    (5.7, b)
2      2    (7.4, c)
3      3   (17.1, d)
4      4  (86.11, e)

df.set_index('first').to_dict()
Out[28]: 
{'new': {0: (10.199999999999999, 'a'),
  1: (5.7000000000000002, 'b'),
  2: (7.4000000000000004, 'c'),
  3: (17.100000000000001, 'd'),
  4: (86.109999999999999, 'e')}}

In this approach, you first create the list (or tuple) you want to keep and then "drop" the other columns. This is basically your original approach, modified.

And if you really wanted lists instead of tuples, just map the list type onto that 'new' column:

df['new'] = list(map(list, zip(df['second'], df['third'])))
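
To end up with a flat dictionary rather than one nested under the 'new' key, it should be enough to select the 'new' column as a Series before calling to_dict. A minimal sketch, assuming the same data as in the question (the fourth column is left out because it plays no role here):

```python
import pandas as pd

df = pd.DataFrame({
    'first': [0, 1, 2, 3, 4],
    'second': [10.2, 5.7, 7.4, 17.1, 86.11],
    'third': ['a', 'b', 'c', 'd', 'e'],
})

# Pair up the two columns row by row, as lists rather than tuples
df['new'] = list(map(list, zip(df['second'], df['third'])))

# Selecting a single column gives a Series, and Series.to_dict()
# maps index label -> value, so there is no outer 'new' key.
d = df.set_index('first')['new'].to_dict()
print(d)
```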

Answer 2

Answer by jezrael

You can get a numpy array of the two columns via values, convert it to a list of lists, zip it with column first, and convert the result to a dict:

a = dict(zip(df['first'], df[['second','third']].values.tolist()))
print(a)
{0: [10.2, 'a'], 1: [5.7, 'b'], 2: [7.4, 'c'], 3: [17.1, 'd'], 4: [86.11, 'e']}
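
In more recent pandas versions, DataFrame.to_numpy() is the documented way to get the underlying array, and it can stand in for .values here. The sketch below assumes the question's data; because second and third have different dtypes, the intermediate array is of object dtype, and tolist() turns each row into a plain Python list.

```python
import pandas as pd

df = pd.DataFrame({
    'first': [0, 1, 2, 3, 4],
    'second': [10.2, 5.7, 7.4, 17.1, 86.11],
    'third': ['a', 'b', 'c', 'd', 'e'],
})

# One inner list per row: [second, third]
rows = df[['second', 'third']].to_numpy().tolist()

# Pair each value of 'first' with its row
a = dict(zip(df['first'], rows))
print(a)
```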

Answer 3

Answer by EdChum

You can zip the values:

In [118]:
b=df.set_index('first')[['second','third']].values.tolist()
dict(zip(df['first'],b))

Out[118]:
{0: [10.2, 'a'], 1: [5.7, 'b'], 2: [7.4, 'c'], 3: [17.1, 'd'], 4: [86.11, 'e']}
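
One thing to watch with any of the zip variants: the keys should come from the values of the first column, not from the row labels. In the question's data the two coincide, which can hide the difference. A small sketch with hypothetical data (the values 10, 20, 30 are made up for illustration) shows what happens when they differ:

```python
import pandas as pd

# Hypothetical data: 'first' no longer matches the default 0..n-1 row labels
df = pd.DataFrame({
    'first': [10, 20, 30],
    'second': [10.2, 5.7, 7.4],
    'third': ['a', 'b', 'c'],
})

b = df[['second', 'third']].values.tolist()

by_row_label = dict(zip(df.index, b))   # keyed by the row labels
by_first = dict(zip(df['first'], b))    # keyed by the 'first' column

print(by_row_label)
print(by_first)
```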
