Python: How to calculate the mean of a column grouped by another column in Pandas

Question

Asked by Rafa

For the following dataframe:

StationID  HoursAhead    BiasTemp  
SS0279           0          10
SS0279           1          20
KEOPS            0          0
KEOPS            1          5
BB               0          5
BB               1          5

I'd like to get something like:

StationID  BiasTemp  
SS0279     15
KEOPS      2.5
BB         5

I know I can script something like this to get the desired result:

import pandas

def transform_DF(old_df, col):
    # Unique station IDs (set ordering is arbitrary)
    list_stations = list(set(old_df['StationID'].values.tolist()))
    # Output columns: every original column except `col`
    header = list(old_df.columns.values)
    header.remove(col)
    header_new = header
    new_df = pandas.DataFrame(columns=header_new)
    for i, station in enumerate(list_stations):
        # describe() returns summary statistics (including the mean) per numeric column
        general_results = old_df[(old_df['StationID'] == station)].describe()
        new_row = []
        for column in header_new:
            if column in ['StationID']:
                new_row.append(station)
                continue
            new_row.append(general_results[column]['mean'])
        new_df.loc[i] = new_row
    return new_df

But I wonder if there is something more straightforward in pandas.

Answer 1

Accepted answer by Zero

You could groupby on StationID and then take mean() on BiasTemp. To output a DataFrame, use as_index=False:

In [4]: df.groupby('StationID', as_index=False)['BiasTemp'].mean()
Out[4]:
  StationID  BiasTemp
0        BB       5.0
1     KEOPS       2.5
2    SS0279      15.0
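
For anyone who wants to reproduce this outside an IPython session, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question; the variable name mean_bias is only an illustrative choice.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "StationID": ["SS0279", "SS0279", "KEOPS", "KEOPS", "BB", "BB"],
    "HoursAhead": [0, 1, 0, 1, 0, 1],
    "BiasTemp": [10, 20, 0, 5, 5, 5],
})

# as_index=False keeps StationID as an ordinary column,
# so the result is a DataFrame rather than a Series.
mean_bias = df.groupby("StationID", as_index=False)["BiasTemp"].mean()
print(mean_bias)
```

By default groupby sorts the group keys, which is why the stations come back in alphabetical order rather than in the order they appear in the input.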

Without as_index=False, it returns a Series instead:

In [5]: df.groupby('StationID')['BiasTemp'].mean()
Out[5]:
StationID
BB            5.0
KEOPS         2.5
SS0279       15.0
Name: BiasTemp, dtype: float64
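
If you already have the Series form, one way to get back to a DataFrame is reset_index(). The sketch below also passes sort=False, which should keep the stations in the order they first appear (the order shown in the question's desired output). The sample data is rebuilt from the question and the variable names are illustrative.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "StationID": ["SS0279", "SS0279", "KEOPS", "KEOPS", "BB", "BB"],
    "HoursAhead": [0, 1, 0, 1, 0, 1],
    "BiasTemp": [10, 20, 0, 5, 5, 5],
})

# Series indexed by StationID; sort=False keeps first-appearance order.
series_result = df.groupby("StationID", sort=False)["BiasTemp"].mean()

# reset_index() moves StationID from the index into a regular column.
frame_result = series_result.reset_index()
print(frame_result)
```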

Read more about groupby in this pydata tutorial.

Answer 2

Answered by EdChum

This is what groupby is for:

In [117]:
df.groupby('StationID')['BiasTemp'].mean()

Out[117]:
StationID
BB         5.0
KEOPS      2.5
SS0279    15.0
Name: BiasTemp, dtype: float64

Here we group by the 'StationID' column, then select the 'BiasTemp' column and call mean() on it.

There is a section in the docs on this functionality.
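
The helper in the question drops one column and averages every remaining one per station. A rough groupby equivalent might look like the sketch below. The function name mean_by_station is hypothetical, and it assumes all remaining non-key columns are numeric (recent pandas versions raise an error when asked for the mean of non-numeric columns).

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "StationID": ["SS0279", "SS0279", "KEOPS", "KEOPS", "BB", "BB"],
    "HoursAhead": [0, 1, 0, 1, 0, 1],
    "BiasTemp": [10, 20, 0, 5, 5, 5],
})

def mean_by_station(frame, drop_col):
    """Hypothetical helper: drop one column, then average the rest per station.

    Assumes every remaining column other than StationID is numeric.
    """
    return (
        frame.drop(columns=[drop_col])
        .groupby("StationID", as_index=False)
        .mean()
    )

result = mean_by_station(df, "HoursAhead")
print(result)
```

Unlike the loop in the question, this does not depend on set ordering, so the rows come out sorted by StationID on every run.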
