pandas: using apply on a SeriesGroupBy object with a condition

Question

Asked by Miquel

I have a DataFrame df1:

 df1.head() = 

           id      ret     eff
    1469  2300 -0.010879  4480.0
    328   2300 -0.000692 -4074.0
    1376  2300 -0.009551  4350.0
    2110  2300 -0.014013  5335.0
    849   2300 -0.286490 -9460.0

I would like to create a new column that contains the normalized values of the column df1['eff'].
In other words, I would like to group df1['eff'] by df1['id'], find the max value (mx = df1['eff'].max()) and the min value (mn = df1['eff'].min()) of each group, and divide each value of df1['eff'] element-wise by mx or mn, depending on whether df1['eff'] > 0 or df1['eff'] < 0.

The code that I have written is the following:

df1['normd'] = df1.groupby('id')['eff'].apply(lambda x: x/x.max() if x > 0 else x/x.min())

However, Python raises the following error:

*** ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Since df1.groupby('id')['eff'] is a SeriesGroupBy object, I decided to use map(). But again Python raises an error:

*** AttributeError: Cannot access callable attribute 'map' of 'SeriesGroupBy' objects, try using the 'apply' method

Many thanks in advance.

Answer 1

Answered by jezrael

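You can use a custom function f, which makes it easy to add print calls for debugging. Inside f, x is a Series holding the values of one group, so the comparison has to be done element-wise with numpy.where. The output is a NumPy array, so you need to convert it back to a Series with the group's index: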

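import numpy as np
import pandas as pd

def f(x):
    # x is the 'eff' Series of one id group; uncomment the prints to inspect it
    #print (x)
    #print (x/x.max())
    #print (x/x.min())
    # np.where returns an array, so wrap it in a Series with the group's index
    return pd.Series(np.where(x>0, x/x.max(), x/x.min()), index=x.index)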


df1['normd'] = df1.groupby('id')['eff'].apply(f)
print (df1)
        id       ret     eff     normd
1469  2300 -0.010879  4480.0  0.839738
328   2300 -0.000692 -4074.0  0.430655
1376  2300 -0.009551  4350.0  0.815370
2110  2300 -0.014013  5335.0  1.000000
849   2300 -0.286490 -9460.0  1.000000
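
As a side note on why the lambda in the question raised the ValueError: inside apply, x is the whole group Series, so x > 0 is a boolean Series and a plain if cannot reduce it to a single True or False. The following is only a minimal sketch of that behaviour; the three sample values are picked for illustration and the names mask, message and normd are not from the original post.

```python
import numpy as np
import pandas as pd

# Illustrative values only, not the full df1 from the question
x = pd.Series([4480.0, -4074.0, 5335.0])

mask = x > 0  # element-wise comparison: a boolean Series, not a single bool

message = ""
try:
    if mask:  # Python needs one truth value here, which a Series cannot give
        pass
except ValueError as err:
    message = str(err)

# np.where chooses between the two candidates element by element
normd = pd.Series(np.where(mask, x / x.max(), x / x.min()), index=x.index)
```

Here message holds the same "truth value of a Series is ambiguous" text as in the question, while normd is computed without any if on the Series.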

The same function f can also be written inline as a lambda:

# The parentheses let the method chain continue on the next line
df1['normd'] = (df1.groupby('id')['eff']
                   .apply(lambda x: pd.Series(np.where(x>0,
                                                       x/x.max(),
                                                       x/x.min()), index=x.index)))
print (df1)
        id       ret     eff     normd
1469  2300 -0.010879  4480.0  0.839738
328   2300 -0.000692 -4074.0  0.430655
1376  2300 -0.009551  4350.0  0.815370
2110  2300 -0.014013  5335.0  1.000000
849   2300 -0.286490 -9460.0  1.000000
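
One caveat, as far as I know: in recent pandas releases (2.x), apply on a groupby adds the group key to the result index by default, so assigning the result straight back to df1['normd'] may fail to align with the original index. A possible alternative, sketched below, is transform, which returns a result indexed like its input. The df1 here is rebuilt by hand from the five rows shown in the question, so it is only a stand-in for the real data.

```python
import numpy as np
import pandas as pd

# df1 rebuilt from the head() output in the question (stand-in for the real frame)
df1 = pd.DataFrame(
    {
        "id": [2300, 2300, 2300, 2300, 2300],
        "ret": [-0.010879, -0.000692, -0.009551, -0.014013, -0.286490],
        "eff": [4480.0, -4074.0, 4350.0, 5335.0, -9460.0],
    },
    index=[1469, 328, 1376, 2110, 849],
)

def f(x):
    # x is the 'eff' Series of one id group:
    # positive values are divided by the group max, the rest by the group min
    return pd.Series(np.where(x > 0, x / x.max(), x / x.min()), index=x.index)

# transform keeps the original row index, so the assignment aligns row by row
df1["normd"] = df1.groupby("id")["eff"].transform(f)
print(df1)
```

With these five rows the normd column should match the values printed above. Passing group_keys=False to groupby and keeping apply is another option, if you prefer to stay closer to the original answer.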
