python pandas dataframe to dictionary
Original question: http://stackoverflow.com/questions/18695605/
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by perigee
I have a two-column dataframe and want to convert it to a Python dictionary, where the first column is the key and the second is the value. Thank you in advance.
Dataframe:
   id  value
0   0   10.2
1   1    5.7
2   2    7.4
Answered by joris
See the docs for to_dict. You can use it like this:
df.set_index('id').to_dict()
If you have only one value column, select it first so that the column name does not become an extra level in the dict (in this case you are actually calling Series.to_dict()):
df.set_index('id')['value'].to_dict()
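As a rough illustration of the difference between the two calls, here is a minimal sketch using the sample data from the question. The frame is rebuilt by hand, so the construction line is an assumption about how the data was created.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration.
df = pd.DataFrame({'id': [0, 1, 2], 'value': [10.2, 5.7, 7.4]})

# DataFrame.to_dict() nests the result under the column name.
nested = df.set_index('id').to_dict()

# Series.to_dict() gives the flat id -> value mapping.
flat = df.set_index('id')['value'].to_dict()

print(nested)  # {'value': {0: 10.2, 1: 5.7, 2: 7.4}}
print(flat)    # {0: 10.2, 1: 5.7, 2: 7.4}
```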
Answered by dalloliogm
The answers by joris in this thread and by punchagan in the duplicated thread are very elegant. However, they will not give correct results if the column used for the keys contains any duplicated value.
For example:
>>> import pandas as pd
>>> ptest = pd.DataFrame([['a',1],['a',2],['b',3]], columns=['id', 'value'])
>>> ptest
  id  value
0  a      1
1  a      2
2  b      3
# note that in both cases the association a->1 is lost:
>>> ptest.set_index('id')['value'].to_dict()
{'a': 2, 'b': 3}
>>> dict(zip(ptest.id, ptest.value))
{'a': 2, 'b': 3}
If you have duplicated entries and do not want to lose them, you can use this ugly but working code:
>>> mydict = {}
>>> for x in range(len(ptest)):
...     currentid = ptest.iloc[x,0]
...     currentvalue = ptest.iloc[x,1]
...     mydict.setdefault(currentid, [])
...     mydict[currentid].append(currentvalue)
...
>>> mydict
{'a': [1, 2], 'b': [3]}
Answered by DSM
If you want a simple way to preserve duplicates, you could use groupby:
>>> ptest = pd.DataFrame([['a',1],['a',2],['b',3]], columns=['id', 'value'])
>>> ptest
  id  value
0  a      1
1  a      2
2  b      3
>>> {k: g["value"].tolist() for k,g in ptest.groupby("id")}
{'a': [1, 2], 'b': [3]}
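The same grouping can also be written without an explicit comprehension. The following is only a sketch of one alternative spelling, not part of the original answer; it should produce the same mapping for this data.

```python
import pandas as pd

ptest = pd.DataFrame([['a', 1], ['a', 2], ['b', 3]], columns=['id', 'value'])

# Collect each group's values into a list, then convert the resulting Series.
grouped = ptest.groupby('id')['value'].apply(list).to_dict()

print(grouped)
```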
Answered by praful gupta
mydict = dict(zip(df.id, df.value))
Answered by user1376377
Another (slightly shorter) solution for not losing duplicate entries:
>>> ptest = pd.DataFrame([['a',1],['a',2],['b',3]], columns=['id','value'])
>>> ptest
  id  value
0  a      1
1  a      2
2  b      3
>>> pdict = dict()
>>> for i in ptest['id'].unique().tolist():
...     ptest_slice = ptest[ptest['id'] == i]
...     pdict[i] = ptest_slice['value'].tolist()
...
>>> pdict
{'b': [3], 'a': [1, 2]}
Answered by Vincent Appiah
In some versions the code below might not work:
mydict = dict(zip(df.id, df.value))
So make it explicit:
id_ = df.id.values
value = df.value.values
mydict = dict(zip(id_, value))
Note: I used id_ because id is the name of a Python built-in function. It is not a reserved word, but shadowing it is best avoided.
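A small check of the naming point, as a sketch: in Python, id is a built-in function rather than a keyword, which can be verified with the standard keyword module. The sample frame and the bracket-style column access are assumptions added for illustration; bracket access is shown because it does not depend on the column name being usable as an attribute.

```python
import builtins
import keyword

import pandas as pd

# Sample data from the question, rebuilt by hand for illustration.
df = pd.DataFrame({'id': [0, 1, 2], 'value': [10.2, 5.7, 7.4]})

# `id` is a built-in function, not a keyword, so it can legally be reassigned.
is_reserved = keyword.iskeyword('id')
is_builtin = hasattr(builtins, 'id')
print(is_reserved, is_builtin)  # False True

# Bracket access works for any column name, including names that collide
# with DataFrame attributes or methods.
mydict = dict(zip(df['id'], df['value']))
print(mydict)
```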
Answered by Dmitry
To keep duplicate keys, you need a list as each dictionary value. This code will do the trick:
from collections import defaultdict
mydict = defaultdict(list)
for k, v in zip(df.id.values, df.value.values):
    mydict[k].append(v)
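For completeness, here is a self-contained sketch of the defaultdict approach. The sample frame is borrowed from the other answers, and the final conversion to a plain dict is an optional extra step that is not in the original answer.

```python
from collections import defaultdict

import pandas as pd

# Sample frame with a duplicated key, as used elsewhere in this thread.
df = pd.DataFrame([['a', 1], ['a', 2], ['b', 3]], columns=['id', 'value'])

mydict = defaultdict(list)
for k, v in zip(df['id'], df['value']):
    mydict[k].append(v)

# Convert back to a plain dict so that looking up a missing key
# raises KeyError instead of silently creating an empty list.
result = dict(mydict)
print(result)
```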
Answered by Dongwan Kim
You can use a dict comprehension:
my_dict = {row[0]: row[1] for row in df.values}
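One thing to watch with df.values is that it builds a single array with a common dtype. The sketch below, using the question's sample data rebuilt by hand (an assumption), compares the comprehension with zipping the two columns, which keeps each column's own dtype.

```python
import pandas as pd

# Sample data from the question: integer ids, float values.
df = pd.DataFrame({'id': [0, 1, 2], 'value': [10.2, 5.7, 7.4]})

# df.values uses one common dtype, so the integer ids become floats here.
via_values = {row[0]: row[1] for row in df.values}

# Zipping the columns keeps the ids as integers.
via_zip = dict(zip(df['id'], df['value']))

print(list(via_values))
print(list(via_zip))
```

If the key type matters, for example when the dict is later serialized, the zip form may be the safer choice.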
Answered by Gil Baggio
Simplest solution:
df.set_index('id').T.to_dict('records')
Example:
df = pd.DataFrame([['a',1],['a',2],['b',3]], columns=['id','value'])
df.set_index('id').T.to_dict('records')
If you have multiple value columns, such as val1, val2, val3, etc., and you want them as lists, use the code below:
df.set_index('id').T.to_dict('list')
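To make the two orientations concrete, here is a minimal sketch on a hypothetical frame with unique ids and two value columns (the data and the names val1 and val2 are made up for illustration). Note that to_dict('records') returns a list of dicts, one per value column after transposing, so with a single value column the mapping is the first element of that list.

```python
import pandas as pd

# Hypothetical frame with unique ids and two value columns.
df = pd.DataFrame({'id': ['a', 'b'], 'val1': [1, 3], 'val2': [2, 4]})

# 'records' returns a list with one dict per row of the transposed frame.
records = df.set_index('id').T.to_dict('records')

# 'list' maps each id to the list of its values across the value columns.
as_lists = df.set_index('id').T.to_dict('list')

# With a single value column, take the first (and only) record.
single = df[['id', 'val1']].set_index('id').T.to_dict('records')[0]

print(records)   # [{'a': 1, 'b': 3}, {'a': 2, 'b': 4}]
print(as_lists)  # {'a': [1, 2], 'b': [3, 4]}
print(single)    # {'a': 1, 'b': 3}
```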
Answered by SummersKing
def get_dict_from_pd(df, key_col, row_col):
    result = dict()
    for i in set(df[key_col].values):
        is_i = df[key_col] == i
        result[i] = list(df[is_i][row_col].values)
    return result
This is my solution, a basic loop.