Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/49161120/

Date: 2020-08-19 19:00:07  Source: igfitidea

Pandas/Python: Set value of one column based on value in another column

Tags: python, pandas, conditional

Asked by NLR

I need to set the value of one column based on the value of another in a Pandas dataframe. This is the logic:

if df['c1'] == 'Value':
    df['c2'] = 10
else:
    df['c2'] = df['c3']

I am unable to get this to do what I want, which is to simply create a column with new values (or change the value of an existing column: either one works for me).

If I try to run the code above or if I write it as a function and use the apply method, I get the following:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
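
The error arises because df['c1'] == 'Value' is evaluated element-wise and yields a boolean Series, which a plain if statement cannot reduce to a single True or False. A minimal sketch, using a made-up three-row dataframe (not from the question), reproduces it:

```python
import pandas as pd

# Hypothetical sample data; the question did not include a dataframe.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# The comparison is element-wise: one boolean per row, not a single bool.
mask = df['c1'] == 'Value'
print(mask.tolist())

# Using the Series where a single bool is expected raises the ValueError.
raised = False
try:
    if mask:
        pass
except ValueError as exc:
    raised = True
    print(exc)
```

The answers below avoid the if statement and use the boolean Series to select rows instead.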

Answered by sacuL

One way to do this is to use indexing with .loc.

Example

In the absence of an example dataframe, I'll make one up here:

import numpy as np
import pandas as pd

df = pd.DataFrame({'c1': list('abcdefg')})
df.loc[5, 'c1'] = 'Value'

>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5  Value
6      g

Assuming you want to create a new column c2 that is equivalent to c1 except where c1 is 'Value', in which case it should be set to 10:

First, create a new column c2 and set it equal to c1, using one of the following two lines (they essentially do the same thing):

df = df.assign(c2 = df['c1'])
# OR:
df['c2'] = df['c1']

Then, use .loc to find all the indices where c1 is equal to 'Value', and assign your desired value in c2 at those indices:

df.loc[df['c1'] == 'Value', 'c2'] = 10

And you end up with this:

>>> df
      c1  c2
0      a   a
1      b   b
2      c   c
3      d   d
4      e   e
5  Value  10
6      g   g

If, as you suggested in your question, you sometimes just want to replace the values in the column you already have rather than create a new column, skip the column creation and do the following:

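# Chained indexing such as df['c1'].loc[mask] = 10 may not modify df
# (it does not under pandas Copy-on-Write), so assign through df.loc directly:
df.loc[df['c1'] == 'Value', 'c1'] = 10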

Giving you:

>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5     10
6      g

Answered by DJK

You can use np.where() to set values based on a specified condition:

#df
   c1  c2  c3
0   4   2   1
1   8   7   9
2   1   5   8
3   3   3   5
4   3   6   8

Now change (or set) the values in column 'c2' based on your condition.

df['c2'] = np.where(df.c1 == 8,'X', df.c3)

   c1 c2  c3
0   4  1   1
1   8  X   9
2   1  8   8
3   3  5   5
4   3  8   8
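
Applied to the logic in the question, the same call might look like the sketch below. The sample dataframe is made up for illustration; both branches are numeric here, so the new column stays numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# 10 where c1 equals 'Value', otherwise the value from c3 in the same row.
df['c2'] = np.where(df['c1'] == 'Value', 10, df['c3'])
print(df)
```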

Answered by AlexanderHughes

Try:

df['c2'] = df['c1'].apply(lambda x: 10 if x == 'Value' else x)
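
The lambda above only sees values from c1, so it cannot fall back to c3 as the question asks. A minimal sketch of a row-wise variant, assuming a made-up dataframe, passes axis=1 so that each full row is available to the function:

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# axis=1 hands each row to the lambda as a Series indexed by column name.
df['c2'] = df.apply(lambda row: 10 if row['c1'] == 'Value' else row['c3'], axis=1)
print(df)
```

Row-wise apply is typically slower than the vectorized .loc or np.where approaches on large dataframes.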

Answered by nimbous

You can use pandas.DataFrame.mask to add virtually as many conditions as you need:

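import numpy as np
import pandas as pd

data = {'a': [1,2,3,4,5], 'b': [6,8,9,10,11]}

d = pd.DataFrame.from_dict(data, orient='columns')
# each entry is (value to match in column 'a', replacement: a scalar or a Series)
c = {'c1': (2, 'Value1'), 'c2': (3, 'Value2'), 'c3': (5, d['b'])}

d['new'] = np.nan
for value in c.values():
    # assign the result back instead of using inplace=True on a column selection
    d['new'] = d['new'].mask(d['a'] == value[0], value[1])

d['new'] = d['new'].fillna('Else')
d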

Output:

    a   b   new
0   1   6   Else
1   2   8   Value1
2   3   9   Value2
3   4   10  Else
4   5   11  11
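
For the single condition in the question, one mask call is enough. The sketch below uses a made-up dataframe: it starts from c3 and overwrites only the rows where c1 equals 'Value'.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# Series.mask replaces values where the condition is True and keeps the rest.
df['c2'] = df['c3'].mask(df['c1'] == 'Value', 10)
print(df)
```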

Answered by Ralf

I suggest doing it in two steps:

# set fixed value to 'c2' where the condition is met
df.loc[df['c1'] == 'Value', 'c2'] = 10

# copy value from 'c3' to 'c2' where the condition is NOT met
df.loc[df['c1'] != 'Value', 'c2'] = df.loc[df['c1'] != 'Value', 'c3']
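
A runnable sketch of the two steps on a made-up dataframe follows. The condition is computed once and reused, and both the row selection and the column are passed to .loc on each side of the second assignment.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

is_value = df['c1'] == 'Value'

# Step 1: fixed value where the condition is met (creates 'c2' if missing,
# leaving NaN in the other rows for now, so the column becomes float).
df.loc[is_value, 'c2'] = 10

# Step 2: copy from 'c3' where the condition is NOT met.
df.loc[~is_value, 'c2'] = df.loc[~is_value, 'c3']
print(df)
```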