Python if else function in pandas dataframe

Question

Asked by progster

I'm trying to apply an if condition over a dataframe, but I'm missing something (error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().)

import pandas as pd

raw_data = {'age1': [23,45,21],'age2': [10,20,50]}
df = pd.DataFrame(raw_data, columns = ['age1','age2'])

def my_fun (var1,var2,var3):
    # This `if` raises the ValueError quoted above: the comparison yields a Series
    if (df[var1]-df[var2])>0 :
        df[var3]=df[var1]-df[var2]
    else:
        df[var3]=0
    print(df[var3])

my_fun('age1','age2','diff')

Answer 1

Answered by jezrael

You can use numpy.where:

import numpy as np

def my_fun (var1,var2,var3):
    # np.where picks the difference where it is positive, otherwise 0
    df[var3]= np.where((df[var1]-df[var2])>0, df[var1]-df[var2], 0)
    return df

df1 = my_fun('age1','age2','diff')
print (df1)
   age1  age2  diff
0    23    10    13
1    45    20    25
2    21    50     0

The error is explained in more detail here.
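
As a minimal sketch of why the error appears (using the question's sample data; the variable names are only illustrative): the comparison produces one boolean per row, and `if` cannot reduce that to a single value unless you say how.

```python
import pandas as pd

df = pd.DataFrame({'age1': [23, 45, 21], 'age2': [10, 20, 50]})

# The comparison is evaluated element-wise and yields a boolean Series,
# one value per row, not a single True/False.
mask = (df['age1'] - df['age2']) > 0
print(mask.tolist())

# `if` needs exactly one boolean, so pandas raises instead of guessing.
try:
    if mask:
        pass
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Reducing the Series explicitly gives a single boolean.
any_positive = mask.any()   # is at least one row positive?
all_positive = mask.all()   # are all rows positive?
print(any_positive, all_positive)
```

Neither `any()` nor `all()` is what the question needs, though: the goal is a per-row choice, which is why the answers use element-wise selection.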

A slower solution uses apply, which needs axis=1 to process the data row by row:

def my_fun(x, var1, var2, var3):
    # x is a single row (a Series), so the comparison is a plain boolean
    if (x[var1]-x[var2])>0 :
        x[var3]=x[var1]-x[var2]
    else:
        x[var3]=0
    return x

print (df.apply(lambda x: my_fun(x, 'age1', 'age2','diff'), axis=1))
   age1  age2  diff
0    23    10    13
1    45    20    25
2    21    50     0

It is also possible to use loc, but sometimes data can be overwritten:

def my_fun(x, var1, var2, var3):
    # Boolean mask of the rows where the difference is positive
    mask = (x[var1]-x[var2])>0
    x.loc[mask, var3] = x[var1]-x[var2]
    x.loc[~mask, var3] = 0
    return x

print (my_fun(df, 'age1', 'age2','diff'))
   age1  age2  diff
0    23    10  13.0
1    45    20  25.0
2    21    50   0.0
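
One reading of the "overwritten" caveat is that the loc version modifies the frame passed in. The float values in its output most likely come from the new column being created with missing values for the unmatched rows before they are filled. The sketch below assumes both points; `add_diff` is a hypothetical helper name, not part of the original answer. It works on a copy and creates the column with its default value first.

```python
import pandas as pd

def add_diff(frame, var1, var2, var3):
    # Hypothetical helper: work on a copy so the caller's frame is left untouched.
    out = frame.copy()
    delta = out[var1] - out[var2]
    mask = delta > 0
    # Create the column with the default first, so no missing values appear.
    out[var3] = 0
    # Overwrite only the rows where the difference is positive.
    out.loc[mask, var3] = delta[mask]
    return out

df = pd.DataFrame({'age1': [23, 45, 21], 'age2': [10, 20, 50]})
result = add_diff(df, 'age1', 'age2', 'diff')
print(result)
print(list(df.columns))  # the original frame is unchanged
```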

Answer 2

Answered by piRSquared

You can use pandas.Series.where:

df.assign(age3=(df.age1 - df.age2).where(df.age1 > df.age2, 0))

   age1  age2  age3
0    23    10    13
1    45    20    25
2    21    50     0

You can wrap this in a function:

def my_fun(v1, v2):
    return v1.sub(v2).where(v1 > v2, 0)

df.assign(age3=my_fun(df.age1, df.age2))

   age1  age2  age3
0    23    10    13
1    45    20    25
2    21    50     0
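
For comparison, here is a small sketch of how the two `where` variants line up on the question's data (variable names are illustrative). `Series.where` keeps values where the condition holds and substitutes the second argument elsewhere. `np.where` takes the condition first, followed by the value for True and the value for False.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'age1': [23, 45, 21], 'age2': [10, 20, 50]})
delta = df['age1'] - df['age2']

# Series.where: keep delta where the condition is True, use 0 elsewhere.
via_series_where = delta.where(delta > 0, 0)

# np.where: condition, value if True, value if False.
# It returns a plain NumPy array rather than a Series.
via_np_where = np.where(delta > 0, delta, 0)

print(via_series_where.tolist())
print(via_np_where.tolist())
```

Both give the same values here. The `Series.where` result keeps the original index, which matters when the result is assigned back to a frame with a non-default index.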

Answer 3

Answered by cardamom

There is another way that uses neither np.where nor pd.Series.where. I am not saying it is better, but after trying to adapt this solution to a challenging problem today, I found the syntax of where not so intuitive. In the end, I am not sure whether it would have been possible with where, but I found that the following method lets you look at the subset before you modify it, and for me it led more quickly to a solution. It works for the OP's case here as well, of course.

You deliberately set a value on a slice of a dataframe, the very thing pandas so often warns you not to do.

This answer shows you the correct method to do that.

The following gives you a slice:

df.loc[df['age1'] - df['age2'] > 0]

...which looks like:

   age1  age2
0    23    10
1    45    20

Add an extra column to the original dataframe for the values you want to remain after modifying the slice:

df['diff'] = 0

Now modify the slice:

df.loc[df['age1'] - df['age2'] > 0, 'diff'] = df['age1'] - df['age2']

...and the result:

   age1  age2  diff
0    23    10    13
1    45    20    25
2    21    50     0
