Python: How do I group a Series by its values in pandas?

Question

Asked by Martín Fixman

I currently have a pandas Series with dtype Timestamp, and I want to group it by date (so each group holds many rows with different times).

The seemingly obvious way of doing this would be something like:

grouped = s.groupby(lambda x: x.date())

However, when given a function, pandas' groupby applies it to the Series index rather than to its values. How can I make it group by value instead?

Answer 1

Answered by mirthbottle

You should convert it to a DataFrame, then add a column that holds the date() of each row. You can then call groupby on the DataFrame with the date column.

import pandas
# to_frame sets the column name explicitly, whether or not s already has a name
df = s.to_frame(name="datetime")
df["date"] = df["datetime"].dt.date
df.groupby("date")

The "date" values then become the index of the grouped result. Doing it this way gives the grouped object explicit keys, so you can do things like select a group.
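As a rough illustration of the DataFrame approach described above, here is a minimal sketch. The sample timestamps and the variable names `counts` and `first_day` are made up for the example; they are not from the original answer.

```python
import datetime

import pandas as pd

# Hypothetical sample data: four timestamps spread over two days.
s = pd.Series(pd.to_datetime([
    "2013-01-01 09:00", "2013-01-01 17:30",
    "2013-01-02 08:15", "2013-01-02 12:00",
]))

df = s.to_frame(name="datetime")
df["date"] = df["datetime"].dt.date   # one datetime.date object per row

grouped = df.groupby("date")
counts = grouped.size()               # indexed by the distinct dates

# Select a single group by its key.
first_day = grouped.get_group(datetime.date(2013, 1, 1))
```

Here `counts` has one entry per calendar day, and `first_day` holds only the rows whose timestamp falls on 2013-01-01.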

Answer 2

Answered by luca

grouped = s.groupby(s)

Or:


grouped = s.groupby(lambda x: s[x])
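Both forms above group by the full timestamp value. For the question's case of grouping by date, one possible variation is to pass a derived key Series instead of `s` itself. This is a sketch with invented sample data, not part of the original answer; `by_date` and `by_day` are illustrative names.

```python
import pandas as pd

# Hypothetical sample data: four timestamps spread over two days.
s = pd.Series(pd.to_datetime([
    "2013-01-01 09:00", "2013-01-01 17:30",
    "2013-01-02 08:15", "2013-01-02 12:00",
]))

# The key Series shares s's index, so each value is paired with its own date.
by_date = s.groupby(s.dt.date)
sizes = by_date.size()

# Alternative: keep datetime64 keys by truncating each timestamp to midnight.
by_day = s.groupby(s.dt.normalize())
day_sizes = by_day.size()
```

The first form yields `datetime.date` keys; the second keeps `Timestamp` keys, which may be more convenient if the result is later resampled or merged on a datetime column.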

Answer 3

Answered by Hangyu Liu

Three methods:


DataFrame (here df is a DataFrame): df.groupby(['column']).size()

Series (here sel is a Series): sel.groupby(sel).size()

Series to DataFrame:

sel.to_frame(name='column').groupby(['column']).size()
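To make the result of these calls concrete, here is a small sketch on invented data. The Series `sel` and its contents are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical Series with repeated values.
sel = pd.Series(["a", "b", "a", "c", "a"])

# Group the Series by its own values and count the rows in each group.
series_counts = sel.groupby(sel).size()

# Same counts via a one-column DataFrame.
frame_counts = sel.to_frame(name="column").groupby(["column"]).size()

# value_counts() gives the same numbers, ordered by frequency unless sorted.
vc = sel.value_counts().sort_index()
```

All three results map each distinct value to the number of times it occurs in `sel`.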

Answer 4

Answered by Andy Jones

For anyone else who wants to do this inline without throwing a lambda in (which tends to kill performance):


s.to_frame(0).groupby(0)[0]

Answer 5

Answered by mchl_k

To add another suggestion, I often use the following as it uses simple logic:


pd.Series(index=s.values).groupby(level=0)
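One thing worth noting about this trick: the new Series has no data, only an index, so its values are all NaN. The sketch below uses invented data and an explicit `dtype` (an addition made here to keep the empty Series well-defined) to show how that affects aggregations.

```python
import pandas as pd

# Hypothetical Series with a repeated value.
s = pd.Series([10, 20, 10])

# The values of s become the index; the data itself is all NaN.
placeholder = pd.Series(index=s.values, dtype="float64")
grouped = placeholder.groupby(level=0)

sizes = grouped.size()    # counts rows per key, NaN included
counts = grouped.count()  # counts non-NaN rows only, so every group is 0
```

Under these assumptions, `size()` is the aggregation that reflects how often each value occurs, while `count()` reports zero for every key.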
