Python: aggregating with lambda functions in Pandas and numpy

Question

Asked by user2524994

I have an aggregation statement below:

data = data.groupby(['type', 'status', 'name']).agg({'one' : np.mean, 'two' : lambda value: 100* ((value>32).sum() / reading.mean()), 'test2': lambda value: 100* ((value > 45).sum() / value.mean())})

I continue to get key errors. I have been able to make it work for one lambda function but not two.

Answer 1

Accepted answer by unutbu

You need to specify the column in data whose values are to be aggregated. For example,

data = data.groupby(['type', 'status', 'name'])['value'].agg(...)

instead of

data = data.groupby(['type', 'status', 'name']).agg(...)

If you don't select a column (e.g. 'value'), then the keys in the dict passed to agg are taken to be column names. The KeyErrors are Pandas' way of telling you that it can't find columns named one, two or test2 in the DataFrame data.
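
The following is a minimal sketch of that behavior, not code from the original question. The tiny DataFrame `df` and the names `raised` and `ok` are made up for illustration, and it assumes a reasonably recent pandas in which a dict key that is not a column raises KeyError.

```python
import pandas as pd

# Hypothetical toy data: one grouping column and one value column.
df = pd.DataFrame({
    'type': ['a', 'a', 'b'],
    'value': [10, 50, 40],
})

# Dict keys are looked up as column names, so 'one' (not a column) fails.
try:
    df.groupby('type').agg({'one': 'mean'})
    raised = False
except KeyError:
    raised = True

# A key that names an existing column is aggregated as expected.
ok = df.groupby('type').agg({'value': 'mean'})
print(raised)
print(ok)
```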

Note: Passing a dict to groupby/agg to name the outputs of a single column has been deprecated. Going forward, you should pass a list of tuples instead. Each tuple is expected to be of the form ('new_column_name', callable).
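
As a possible alternative to the list of tuples, newer pandas releases also offer keyword-based "named aggregation". The sketch below assumes a pandas version that supports it with several lambdas on the same column (roughly 1.0 or later). The toy DataFrame `df`, the array `reading`, and the name `result` are made up for illustration; the thresholds 32 and 45 are taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the question's DataFrame.
df = pd.DataFrame({
    'type': ['a', 'a', 'a', 'b', 'b'],
    'value': [10, 50, 60, 40, 20],
})

# Stand-in for the external `reading` array used in the question.
reading = np.array([1.0, 3.0])

# Each keyword is an output column name; each tuple is (input column, function).
result = df.groupby('type').agg(
    one=('value', 'mean'),
    two=('value', lambda v: 100 * ((v > 32).sum() / reading.mean())),
    test2=('value', lambda v: 100 * ((v > 45).sum() / v.mean())),
)
print(result)
```

Here the input column is named explicitly in every tuple, so there is no ambiguity about whether a key refers to an input column or to an output name.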

Here is a runnable example:

import numpy as np
import pandas as pd

N = 100
data = pd.DataFrame({
    'type': np.random.randint(10, size=N),
    'status': np.random.randint(10, size=N),
    'name': np.random.randint(10, size=N),
    'value': np.random.randint(10, size=N),
})

reading = np.random.random(10,)

data = data.groupby(['type', 'status', 'name'])['value'].agg(
    [('one', np.mean),
     ('two', lambda value: 100 * ((value > 32).sum() / reading.mean())),
     ('test2', lambda value: 100 * ((value > 45).sum() / value.mean()))])
print(data)
#                   one  two  test2
# type status name                 
# 0    1      3     3.0    0    0.0
#             7     4.0    0    0.0
#             9     8.0    0    0.0
#      3      1     5.0    0    0.0
#             6     3.0    0    0.0
# ...

If this does not match your situation, then please provide runnable code that does.