Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/19635048/

Time: 2020-09-13 21:17:14  Source: igfitidea

Group operations on Pandas column containing lists

python pandas

Asked by urschrei

I have a DataFrame containing a column, props, which contains lists of strings.

Ideally, I'd like to group by this column, but I predictably get an error when I do:

TypeError: unhashable type: 'list'

Is there a sensible way to re-arrange my DataFrame so I can work with these values?

Answered by Matti Lyra

You can convert the lists of strings into tuples of strings. Tuples are immutable, and a tuple whose elements are all hashable (as strings are) is itself hashable. This of course assumes that you don't need to add to or remove from those lists after creation.
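
To illustrate the hashability point, here is a minimal sketch in plain Python. The variable names (`names`, `lookup`, `list_is_hashable`) are made up for this example and do not come from the question.

```python
# Hypothetical sample value standing in for one entry of the `props` column.
names = ["a", "b"]

# A tuple of strings is hashable, so it can serve as a dict key or a group key.
key = tuple(names)
lookup = {key: "ok"}

# A list is mutable and defines no hash, so hashing it raises TypeError.
try:
    hash(names)
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

print(lookup[("a", "b")], list_is_hashable)
```

The same TypeError is what groupby surfaces when the grouping column holds lists.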

Answered by miku

You can use tuples, the immutable counterpart to lists:

>>> import pandas as pd
>>> df = pd.DataFrame([[[1, 2], 'ab'], [[2, 3], 'bc']])
>>> df.groupby(0).groups
...
... TypeError: unhashable type: 'list'

You could apply the conversion on the appropriate column:

>>> df[0] = df[0].apply(tuple)
>>> df.groupby(0).groups
{(1, 2): [0], (2, 3): [1]}
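
If the original lists still need to be modified later, one possible variation is to leave the list column alone and group by a separate tuple key. This is only a sketch: the question says just that `props` holds lists of strings, so the sample data, the `value` column, and the `props_key` name are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data; only the `props` column name comes from the question.
df = pd.DataFrame({
    "props": [["a", "b"], ["a", "b"], ["b", "c"]],
    "value": [1, 2, 4],
})

# Build a hashable key Series without overwriting the original list column.
key = df["props"].apply(tuple).rename("props_key")

# Group by the key Series (aligned with df on its index) and sum per group.
totals = df.groupby(key)["value"].sum()
print(totals)
```

Here `df["props"]` still contains lists afterwards, while rows with equal lists fall into the same group.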