Python Pandas: How to assign values based on multiple conditions of existing columns?

Question

Asked by Eric

I would like to create a new column with a numerical value based on the following conditions:

a. if gender is male & pet1=pet2, points = 5

b. if gender is female & (pet1 is 'cat' or pet1='dog'), points = 5

c. all other combinations, points = 0

    gender    pet1      pet2
0   male      dog       dog
1   male      cat       cat
2   male      dog       cat
3   female    cat       squirrel
4   female    dog       dog
5   female    squirrel  cat
6   squirrel  dog       cat

I would like the end result to be as follows:

    gender    pet1      pet2      points
0   male      dog       dog       5
1   male      cat       cat       5
2   male      dog       cat       0
3   female    cat       squirrel  5
4   female    dog       dog       5
5   female    squirrel  cat       0
6   squirrel  dog       cat       0

How do I accomplish this?

Answer 1

Accepted answer by EdChum

You can do this using np.where. The conditions use the bitwise operators & and | for "and" and "or", with parentheses around each comparison because of operator precedence. Where the combined condition is true, 5 is returned, and 0 otherwise:

In [29]:
df['points'] = np.where(
    ((df['gender'] == 'male') & (df['pet1'] == df['pet2']))
    | ((df['gender'] == 'female') & df['pet1'].isin(['cat', 'dog'])),
    5, 0)
df

Out[29]:
     gender      pet1      pet2  points
0      male       dog       dog       5
1      male       cat       cat       5
2      male       dog       cat       0
3    female       cat  squirrel       5
4    female       dog       dog       5
5    female  squirrel       cat       0
6  squirrel       dog       cat       0
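
To run the accepted answer outside an interactive session, the sample frame has to be built first. The following is a minimal sketch, assuming pandas and NumPy are installed. The frame is reconstructed from the table in the question, and the names male_match and female_cat_dog are introduced here only for readability:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "gender": ["male", "male", "male", "female", "female", "female", "squirrel"],
    "pet1": ["dog", "cat", "dog", "cat", "dog", "squirrel", "dog"],
    "pet2": ["dog", "cat", "cat", "squirrel", "dog", "cat", "cat"],
})

# Each comparison is parenthesised because & binds tighter than ==.
male_match = (df["gender"] == "male") & (df["pet1"] == df["pet2"])
female_cat_dog = (df["gender"] == "female") & df["pet1"].isin(["cat", "dog"])

# 5 where either rule holds, 0 everywhere else.
df["points"] = np.where(male_match | female_cat_dog, 5, 0)
print(df["points"].tolist())  # [5, 5, 0, 5, 5, 0, 0]
```

Naming the two masks separately keeps the np.where call short and makes each rule easy to check on its own.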

Answer 2

Answered by Ruggero Turra

Using apply:

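def f(x):
    # x is a single row of the DataFrame, passed in as a Series
    if x['gender'] == 'male' and x['pet1'] == x['pet2']:
        return 5
    elif x['gender'] == 'female' and (x['pet1'] == 'cat' or x['pet1'] == 'dog'):
        return 5
    else:
        return 0

# data is the DataFrame (called df in the other answers)
data['points'] = data.apply(f, axis=1)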

Answer 3

Answered by leonard

The apply method described by @RuggeroTurra is much slower on 500k rows. I ended up using something like:

df['result'] = ((df.a == 0) & (df.b != 1)).astype(int) * 2 + \
               ((df.a != 0) & (df.b != 1)).astype(int) * 3 + \
               ((df.a == 0) & (df.b == 1)).astype(int) * 4 + \
               ((df.a != 0) & (df.b == 1)).astype(int) * 5

The apply method took 25 seconds, whereas the method above took about 18 ms.
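
The snippet above uses columns a and b from a different problem. Adapted to the columns in this question, the same mask-arithmetic idea might look like the sketch below. It assumes the sample frame from the question, and the mask names are hypothetical. Summing the terms is only safe here because the two rules can never both hold for the same row (gender cannot be both male and female):

```python
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "gender": ["male", "male", "male", "female", "female", "female", "squirrel"],
    "pet1": ["dog", "cat", "dog", "cat", "dog", "squirrel", "dog"],
    "pet2": ["dog", "cat", "cat", "squirrel", "dog", "cat", "cat"],
})

male_match = (df["gender"] == "male") & (df["pet1"] == df["pet2"])
female_cat_dog = (df["gender"] == "female") & df["pet1"].isin(["cat", "dog"])

# A boolean mask cast to int is 1 where the rule holds and 0 elsewhere,
# so multiplying by the score and summing yields the points column.
df["points"] = male_match.astype(int) * 5 + female_cat_dog.astype(int) * 5
print(df["points"].tolist())  # [5, 5, 0, 5, 5, 0, 0]
```

If the rules could overlap, a row matching both would receive 10 rather than 5, so np.where or np.select is the safer choice in that case.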

Answer 4

Answered by Erfan

`numpy.select`

2020 answer

This is a perfect case for np.select, which creates a column based on multiple conditions and stays readable as the number of conditions grows:

conditions = [
    df['gender'].eq('male') & df['pet1'].eq(df['pet2']),
    df['gender'].eq('female') & df['pet1'].isin(['cat', 'dog'])
]

df['points'] = np.select(conditions, [5,5], default=0)

print(df)
     gender      pet1      pet2  points
0      male       dog       dog       5
1      male       cat       cat       5
2      male       dog       cat       0
3    female       cat  squirrel       5
4    female       dog       dog       5
5    female  squirrel       cat       0
6  squirrel       dog       cat       0
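
One detail worth knowing when the conditions overlap: np.select uses the first condition in the list that is true for a given row. The sketch below illustrates this with the question's sample data. The score column and the values 10 and 1 are hypothetical and chosen only to make the ordering visible:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "gender": ["male", "male", "male", "female", "female", "female", "squirrel"],
    "pet1": ["dog", "cat", "dog", "cat", "dog", "squirrel", "dog"],
    "pet2": ["dog", "cat", "cat", "squirrel", "dog", "cat", "cat"],
})

# These two conditions overlap: row 0 (dog, dog) satisfies both.
conditions = [
    df["pet1"].eq(df["pet2"]),        # checked first
    df["pet1"].isin(["cat", "dog"]),  # used only where the first is False
]
choices = [10, 1]  # hypothetical scores, one per condition

df["score"] = np.select(conditions, choices, default=0)
print(df["score"].tolist())  # [10, 10, 1, 1, 10, 0, 1]
```

Listing the most specific rule first therefore gives it priority, much like the order of branches in an if/elif chain.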

Answer 5

Answered by George Pipis

You can also use the apply function. For example:

def myfunc(gender, pet1, pet2):
    if gender=='male' and pet1==pet2:
        myvalue=5
    elif gender=='female' and (pet1=='cat' or pet1=='dog'):
        myvalue=5
    else:
        myvalue=0
    return myvalue

Then call apply with axis=1 so that the function is applied row by row:

df['points'] = df.apply(lambda x: myfunc(x['gender'], x['pet1'], x['pet2']), axis=1)

We get:

     gender      pet1      pet2  points
0      male       dog       dog       5
1      male       cat       cat       5
2      male       dog       cat       0
3    female       cat  squirrel       5
4    female       dog       dog       5
5    female  squirrel       cat       0
6  squirrel       dog       cat       0
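
Since the apply-based and vectorized answers should produce the same column, a quick consistency check with rough timing might look like the sketch below. It assumes the question's sample data repeated 1,000 times (an arbitrary size chosen for illustration), and the timings will vary by machine, so none are shown:

```python
import time

import numpy as np
import pandas as pd


def myfunc(gender, pet1, pet2):
    # Same rules as in the answer above, written with early returns.
    if gender == "male" and pet1 == pet2:
        return 5
    if gender == "female" and pet1 in ("cat", "dog"):
        return 5
    return 0


# Sample data reconstructed from the question's table.
sample = pd.DataFrame({
    "gender": ["male", "male", "male", "female", "female", "female", "squirrel"],
    "pet1": ["dog", "cat", "dog", "cat", "dog", "squirrel", "dog"],
    "pet2": ["dog", "cat", "cat", "squirrel", "dog", "cat", "cat"],
})
big = pd.concat([sample] * 1000, ignore_index=True)  # 7,000 rows

# Row-wise version: one Python function call per row.
start = time.perf_counter()
by_apply = big.apply(lambda x: myfunc(x["gender"], x["pet1"], x["pet2"]), axis=1)
apply_seconds = time.perf_counter() - start

# Vectorized version: whole-column comparisons, then np.select.
start = time.perf_counter()
conditions = [
    big["gender"].eq("male") & big["pet1"].eq(big["pet2"]),
    big["gender"].eq("female") & big["pet1"].isin(["cat", "dog"]),
]
by_select = np.select(conditions, [5, 5], default=0)
select_seconds = time.perf_counter() - start

# Both approaches must agree on every row.
assert (by_apply.to_numpy() == by_select).all()
print(f"apply: {apply_seconds:.4f}s  np.select: {select_seconds:.4f}s")
```

This is only a rough wall-clock comparison. For careful measurements, a tool such as timeit with repeated runs would be more reliable.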
