Group operations on a Pandas column containing lists

Question

Asked by urschrei

I have a DataFrame containing a column, props, which contains lists of strings.

Ideally, I'd like to group by this column, but I predictably get an error when I do:

TypeError: unhashable type: 'list'

Is there a sensible way to re-arrange my DataFrame so I can work with these values?

Answer 1

Answered by Matti Lyra

You can convert the lists of strings into tuples of strings. Tuples are hashable because they are immutable. This assumes, of course, that you don't need to add to or remove from those lists after creation.
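As a minimal sketch of that conversion: the column name props comes from the question, while the sample values, the value column, and the props_key helper column are made up for illustration.

```python
import pandas as pd

# Hypothetical data: only the column name "props" comes from the question.
df = pd.DataFrame({
    "props": [["a", "b"], ["a", "b"], ["c", "d"]],
    "value": [1, 2, 3],
})

# Lists are unhashable, so build a hashable tuple version of each list
# in a separate column and keep the original lists untouched.
df["props_key"] = df["props"].apply(tuple)

# Group on the tuple column and aggregate as usual.
totals = df.groupby("props_key")["value"].sum()
print(totals.to_dict())
```

Keeping the tuples in a separate column leaves the original lists available if they still need to be modified later.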

Answer 2

Answered by miku

You can use tuples, the immutable counterpart to lists:

>>> import pandas as pd
>>> df = pd.DataFrame([[[1, 2], 'ab'], [[2, 3], 'bc']])
>>> df.groupby(0).groups
...
... TypeError: unhashable type: 'list'

You could apply the conversion to the appropriate column:

>>> df[0] = df[0].apply(tuple)
>>> df.groupby(0).groups
{(1, 2): [0], (2, 3): [1]}
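One caveat with tuples is that they compare element by element, so ['a', 'b'] and ['b', 'a'] would land in different groups. If element order should not matter (an assumption, since the question does not say), a possible sketch is to sort each list before converting it. The data and the column name n below are hypothetical.

```python
import pandas as pd

# Hypothetical data: two rows hold the same strings in a different order.
df = pd.DataFrame({
    "props": [["a", "b"], ["b", "a"], ["c", "d"]],
    "n": [1, 2, 3],
})

# Sort each list first so that element order does not affect the group key.
key = df["props"].apply(lambda items: tuple(sorted(items)))

# groupby accepts a Series aligned with the DataFrame's index as the key.
sizes = df.groupby(key).size()
print(sizes.to_dict())
```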
