Python: GroupBy results to dictionary of lists

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/29876184/

Time: 2020-08-19 05:07:10  Source: igfitidea

GroupBy results to dictionary of lists

Tags: python, pandas, xlrd

Asked by SuperDougDougy

I have an Excel sheet that looks like this:

Column1 Column2 Column3
0       23      1
1       5       2
1       2       3
1       19      5
2       56      1
2       22      2
3       2       4
3       14      5
4       59      1
5       44      1
5       1       2
5       87      3

I want to extract that data, group it by Column1, and collect the Column3 values of each group into a dictionary, so that it looks like this:

{0: [1],
1: [2,3,5],
2: [1,2],
3: [4,5],
4: [1],
5: [1,2,3]}

This is my code so far:

import pandas

excel = pandas.read_excel(r"e:\test_data.xlsx", sheetname='mySheet', parse_cols='A,C')
myTable = excel.groupby("Column1").groups
print myTable

However, my output looks like this:

{0: [0L], 1: [1L, 2L, 3L], 2: [4L, 5L], 3: [6L, 7L], 4: [8L], 5: [9L, 10L, 11L]}

Thanks!


Accepted answer by Zero

You could groupby on Column1, then select Column3, apply(list) to each group, and call to_dict:

In [81]: df.groupby('Column1')['Column3'].apply(list).to_dict()
Out[81]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
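
For readers who want to run this without the Excel file, here is a minimal sketch. It assumes the sheet's data has already been loaded into a DataFrame; the frame below is typed in by hand from the question's table (Column2 is left out, matching the 'A,C' column selection), and the name result is only illustrative.

```python
import pandas as pd

# Sample data copied by hand from the question's sheet (Column1 and Column3 only).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Group rows by Column1, collect each group's Column3 values into a list,
# then convert the resulting Series (one list per key) into a plain dict.
result = df.groupby("Column1")["Column3"].apply(list).to_dict()
print(result)
```

Selecting ['Column3'] before apply matters: without it, list would be applied to each group's whole sub-frame rather than to the Column3 values.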

Or use a dict comprehension over the groups:

In [433]: {k: list(v) for k, v in df.groupby('Column1')['Column3']}
Out[433]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
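
The comprehension works because iterating over a grouped column yields (key, Series) pairs, one per group. A small sketch of that behaviour, using a hand-typed copy of the question's data; the names grouped, first_key, first_values and pairs are illustrative only:

```python
import pandas as pd

# Sample data copied by hand from the question's sheet.
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

grouped = df.groupby("Column1")["Column3"]

# Each iteration step gives the group key and a Series holding that group's values.
first_key, first_values = next(iter(grouped))
print(first_key, list(first_values))

# The dict comprehension simply turns every (key, Series) pair into key -> list.
pairs = {key: list(values) for key, values in grouped}
print(pairs)
```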

Answer by EdChum

According to the docs, GroupBy.groups:

is a dict whose keys are the computed unique groups and corresponding values being the axis labels belonging to each group.
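
To make the "axis labels" point concrete, the sketch below uses a small frame with string row labels, so labels and values cannot be confused. The index values "a" to "d" are hypothetical and not part of the question; with the default integer index, as in the question, the labels happen to be the row numbers 0, 1, 2, and so on.

```python
import pandas as pd

# A shortened version of the question's data with hypothetical string row labels.
df = pd.DataFrame(
    {"Column1": [0, 1, 1, 2], "Column3": [1, 2, 3, 1]},
    index=["a", "b", "c", "d"],
)

# .groups maps each group key to the row labels in that group, not to Column3 values.
groups = df.groupby("Column1").groups
labels = {key: list(index) for key, index in groups.items()}
print(labels)
```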

If you want the values themselves, you can groupby 'Column1', select 'Column3', and then call apply, passing list so that it is applied to each group.

You can then convert the result to a dict as desired:

In [5]:

dict(df.groupby('Column1')['Column3'].apply(list))
Out[5]:
{0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
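
As a rough check, the sketch below shows the intermediate Series of lists and confirms that wrapping it in dict() gives the same mapping as calling its to_dict method. The data is typed in by hand from the question, and the variable names are illustrative.

```python
import pandas as pd

# Sample data copied by hand from the question's sheet.
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Intermediate result: a Series indexed by Column1 whose values are lists.
lists_per_key = df.groupby("Column1")["Column3"].apply(list)
print(lists_per_key)

# dict() reads the Series through its keys and item lookup,
# so both conversions should produce the same mapping.
via_dict = dict(lists_per_key)
via_to_dict = lists_per_key.to_dict()
print(via_dict == via_to_dict)
```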

(Note: have a look at this SO question for why the numbers are followed by L; in short, the L suffix marks a Python 2 long integer.)