pandas: column missing after groupby

Question

Asked by user3439329

I have a pandas DataFrame df. I group it by three columns and count the rows in each group. When I do this I lose some information, specifically the name column. This column maps 1:1 to the desk_id column. Is there any way to include both in my final dataframe?

Here is the dataframe:

   shift_id    shift_start_time      shift_end_time        name                   end_time       desk_id  shift_hour
0  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 10:16:41.040000  15557987           2
1  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 10:16:41.096000  15557987           2
2  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 10:52:17.402000  15557987           2
3  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 11:06:59.083000  15557987           3
4  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 08:27:57.998000  15557987           0

I group it like this:

grouped = df.groupby(['desk_id', 'shift_id', 'shift_hour']).size()
grouped = grouped.reset_index()

And here is the result, missing the name column.

    desk_id  shift_id  shift_hour  0
0  14468690  37729081           0  7
1  14468690  37729081           1  3
2  14468690  37729081           2  6
3  14468690  37729081           3  5
4  14468690  37729082           0  5

Also, is there any way to rename the count column to 'count' instead of '0'?

Answer 1

Accepted answer by CT Zhu

You need to include 'name' in the groupby keys:

In [43]:

import numpy as np

grouped = df.groupby(['desk_id', 'shift_id', 'shift_hour', 'name']).size()
grouped = grouped.reset_index()
# Replace the default column label 0 with 'count'
grouped.columns = np.where(grouped.columns == 0, 'count', grouped.columns)
print(grouped)
    desk_id  shift_id  shift_hour        name  count
0  15557987  37423064           0  Adam Scott      1
1  15557987  37423064           2  Adam Scott      3
2  15557987  37423064           3  Adam Scott      1
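
As a possible shortcut for the renaming step, `Series.reset_index` accepts a `name` argument, so the count column can be labeled in the same call. The sketch below is not part of the original answer; it rebuilds a small sample frame from the five rows shown in the question (the datetime columns are left out for brevity).

```python
import pandas as pd

# Sample data mirroring the five rows shown in the question
# (datetime columns omitted; this frame is an illustrative assumption)
df = pd.DataFrame({
    "shift_id": [37423064] * 5,
    "name": ["Adam Scott"] * 5,
    "desk_id": [15557987] * 5,
    "shift_hour": [2, 2, 2, 3, 0],
})

# size() returns a Series; reset_index(name=...) names its values column
grouped = (
    df.groupby(["desk_id", "shift_id", "shift_hour", "name"])
      .size()
      .reset_index(name="count")
)
print(grouped)
```

This avoids the intermediate column labeled 0 and the `np.where` rename.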

If the name-to-id relationship is many-to-one, say there is also a Pete Scott for the same set of data, the result becomes:

    desk_id  shift_id  shift_hour        name  count
0  15557987  37423064           0  Adam Scott      1
1  15557987  37423064           0  Pete Scott      1
2  15557987  37423064           2  Adam Scott      3
3  15557987  37423064           2  Pete Scott      3
4  15557987  37423064           3  Adam Scott      1
5  15557987  37423064           3  Pete Scott      1
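
If name really is 1:1 with desk_id, as the question states, another option is to keep the original three grouping keys and carry the name along with named aggregation. This is a sketch under that assumption, not the answerer's method; it needs a pandas version that supports named aggregation (0.25 or later), and the sample frame is illustrative.

```python
import pandas as pd

# Illustrative sample: one desk, one name (assumes a strict 1:1 mapping)
df = pd.DataFrame({
    "shift_id": [37423064] * 5,
    "name": ["Adam Scott"] * 5,
    "desk_id": [15557987] * 5,
    "shift_hour": [2, 2, 2, 3, 0],
})

# Group by the original three keys; take the first name seen in each
# group and count the rows in the group
grouped = (
    df.groupby(["desk_id", "shift_id", "shift_hour"], as_index=False)
      .agg(name=("name", "first"), count=("name", "size"))
)
print(grouped)
```

Unlike adding 'name' to the keys, this keeps one row per (desk_id, shift_id, shift_hour) even if the mapping turns out not to be 1:1, in which case only the first name in each group is kept.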
