Python GroupBy results to dictionary of lists
Source: http://stackoverflow.com/questions/29876184/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
GroupBy results to dictionary of lists
Asked by SuperDougDougy
I have an excel sheet that looks like so:
Column1 Column2 Column3
0 23 1
1 5 2
1 2 3
1 19 5
2 56 1
2 22 2
3 2 4
3 14 5
4 59 1
5 44 1
5 1 2
5 87 3
And I'm looking to extract that data, group it by column 1, and add it to a dictionary so it appears like this:
{0: [1],
1: [2,3,5],
2: [1,2],
3: [4,5],
4: [1],
5: [1,2,3]}
This is my code so far
excel = pandas.read_excel(r"e:\test_data.xlsx", sheetname='mySheet', parse_cols='A,C')
myTable = excel.groupby("Column1").groups
print myTable
However, my output looks like this:
{0: [0L], 1: [1L, 2L, 3L], 2: [4L, 5L], 3: [6L, 7L], 4: [8L], 5: [9L, 10L, 11L]}
Thanks!
Accepted answer by Zero
You could groupby on Column1, then take Column3, apply(list) to it, and call to_dict?
In [81]: df.groupby('Column1')['Column3'].apply(list).to_dict()
Out[81]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
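For a self-contained run of the one-liner above, the sketch below rebuilds the question's sample table in memory instead of reading the spreadsheet. The name df and the inline data are assumptions made for illustration; a real workflow would load the sheet with read_excel.

```python
import pandas as pd

# Sample data from the question, built in memory (assumption: the
# spreadsheet would load into an equivalent DataFrame).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column2": [23, 5, 2, 19, 56, 22, 2, 14, 59, 44, 1, 87],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Group on Column1, collect each group's Column3 values into a list,
# then turn the resulting Series into a plain dict.
result = df.groupby("Column1")["Column3"].apply(list).to_dict()
print(result)
```

The intermediate result of apply(list) is a Series indexed by the group keys, which is why to_dict yields one list per key.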
Or use a dict comprehension over the groups:
In [433]: {k: list(v) for k, v in df.groupby('Column1')['Column3']}
Out[433]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
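The comprehension works because iterating over a grouped column yields (group key, Series) pairs. A minimal sketch, assuming a small subset of the question's data and the illustrative names grouped, by_comprehension, and by_apply:

```python
import pandas as pd

# Subset of the question's table (only the two columns that matter here).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2],
    "Column3": [1, 2, 3, 5, 1, 2],
})

grouped = df.groupby("Column1")["Column3"]

# Each iteration step gives the group key and that group's values as a Series.
for key, values in grouped:
    print(key, type(values).__name__, list(values))

# Both approaches build the same mapping from key to list of values.
by_comprehension = {k: list(v) for k, v in grouped}
by_apply = grouped.apply(list).to_dict()
print(by_comprehension == by_apply)
```

The comprehension keeps everything in plain Python once the groups exist, which can be convenient if each group needs extra processing before it goes into the dict.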
Answer by EdChum
According to the docs, GroupBy.groups:
is a dict whose keys are the computed unique groups and corresponding values being the axis labels belonging to each group.
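This explains the output in the question: with a default integer index, the axis labels are simply row positions. The sketch below, which assumes the question's data rebuilt in memory and the illustrative names labels and values, shows the labels that groups returns and how looking them up in Column3 recovers the actual values.

```python
import pandas as pd

# Sample data from the question (assumption: default 0..11 row index).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

groups = df.groupby("Column1").groups

# The dict values are row labels, not the contents of any column.
labels = {k: list(v) for k, v in groups.items()}
print(labels)

# Looking those labels up in Column3 gives the values the asker wanted.
values = {k: df.loc[v, "Column3"].tolist() for k, v in groups.items()}
print(values)
```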
If you want the values themselves, you can groupby 'Column1', select 'Column3', and then call apply, passing list so that it is applied to each group.
You can then convert it to a dict as desired:
In [5]: dict(df.groupby('Column1')['Column3'].apply(list))
Out[5]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
(Note: have a look at this SO question for why the numbers are followed by L; in Python 2 the suffix marks a long integer.)