Pandas/Python: Set value of one column based on value in another column
Original question: http://stackoverflow.com/questions/49161120/
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors: StackOverflow
Asked by NLR
I need to set the value of one column based on the value of another in a Pandas dataframe. This is the logic:
if df['c1'] == 'Value':
    df['c2'] = 10
else:
    df['c2'] = df['c3']
I am unable to get this to do what I want, which is to simply create a column with new values (or change the value of an existing column: either one works for me).
If I try to run the code above or if I write it as a function and use the apply method, I get the following:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
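The error arises because df['c1'] == 'Value' produces a boolean Series with one entry per row, while an if statement needs a single True or False. A minimal sketch reproducing it, assuming a small made-up dataframe (the question did not include one):

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# The comparison is element-wise, so the result is a boolean Series.
mask = df['c1'] == 'Value'
print(mask.tolist())

error_message = None
try:
    if mask:  # `if` needs one bool, not one per row
        pass
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

The answers below avoid the if statement entirely and use the boolean Series to select rows instead.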
Answered by sacuL
One way to do this would be to use indexing with .loc.
Example
In the absence of an example dataframe, I'll make one up here:
import numpy as np
import pandas as pd
df = pd.DataFrame({'c1': list('abcdefg')})
df.loc[5, 'c1'] = 'Value'
>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5  Value
6      g
Assuming you wanted to create a new column c2, equivalent to c1 except where c1 is Value, in which case you would like to set it to 10:
First, you could create a new column c2 and set it equal to c1, using one of the following two lines (they essentially do the same thing):
df = df.assign(c2 = df['c1'])
# OR:
df['c2'] = df['c1']
Then, use .loc to select all the rows where c1 is equal to 'Value', and assign your desired value to c2 in those rows:
df.loc[df['c1'] == 'Value', 'c2'] = 10
And you end up with this:
>>> df
      c1  c2
0      a   a
1      b   b
2      c   c
3      d   d
4      e   e
5  Value  10
6      g   g
If, as you suggested in your question, you sometimes just want to replace the values in the column you already have rather than create a new column, skip the column creation and do the following:
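# Assign through a single .loc call. The chained form
# df['c1'].loc[df['c1'] == 'Value'] = 10 may write to a temporary copy
# and leave df unchanged, so it is best avoided.
df.loc[df['c1'] == 'Value', 'c1'] = 10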
Giving you:
>>> df
   c1
0   a
1   b
2   c
3   d
4   e
5  10
6   g
Answered by DJK
You can use np.where() to set values based on a specified condition:
# df
   c1  c2  c3
0   4   2   1
1   8   7   9
2   1   5   8
3   3   3   5
4   3   6   8
Now change (or set) the values in column c2 based on your condition:
df['c2'] = np.where(df.c1 == 8,'X', df.c3)
   c1 c2  c3
0   4  1   1
1   8  X   9
2   1  8   8
3   3  5   5
4   3  8   8
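Applied to the logic in the question (10 where c1 equals 'Value', otherwise the value from c3), np.where might look like the following sketch. The sample dataframe is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# Where the condition holds take 10, otherwise take the value from c3.
df['c2'] = np.where(df['c1'] == 'Value', 10, df['c3'])
print(df)
```

Here both branches are integers, so c2 stays numeric. When the branches mix types, as with 'X' and the integers above, NumPy returns a single common type, so the numbers in that example end up stored as strings.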
Answered by AlexanderHughes
Try:
df['c2'] = df['c1'].apply(lambda x: 10 if x == 'Value' else x)
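The line above falls back to c1 itself. To fall back to c3 as in the question, the function needs to see the whole row, which apply supports with axis=1. A sketch under that assumption, with made-up data:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# axis=1 passes each row to the function, so both c1 and c3 are available.
df['c2'] = df.apply(lambda row: 10 if row['c1'] == 'Value' else row['c3'], axis=1)
print(df)
```

Inside the lambda, row['c1'] is a single value rather than a Series, so the comparison yields a plain bool and the if expression works. Row-wise apply calls the function once per row, so it tends to be slower than the vectorized approaches on large dataframes.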
Answered by nimbous
You can use pandas.DataFrame.mask to add virtually as many conditions as you need:
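data = {'a': [1, 2, 3, 4, 5], 'b': [6, 8, 9, 10, 11]}
d = pd.DataFrame.from_dict(data, orient='columns')
# Each entry pairs a value to match in column 'a' with its replacement
c = {'c1': (2, 'Value1'), 'c2': (3, 'Value2'), 'c3': (5, d['b'])}
d['new'] = np.nan
for value in c.values():
    # Reassign the result; inplace=True on a column selection may not update d
    d['new'] = d['new'].mask(d['a'] == value[0], value[1])
d['new'] = d['new'].fillna('Else')
d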
Output:
   a   b     new
0  1   6    Else
1  2   8  Value1
2  3   9  Value2
3  4  10    Else
4  5  11      11
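For the single condition in the question, the same mask method can be called directly on c3, replacing its values with 10 wherever the condition is true. A minimal sketch, assuming made-up sample data:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# Start from c3 and replace entries with 10 where c1 equals 'Value'.
df['c2'] = df['c3'].mask(df['c1'] == 'Value', 10)
print(df)
```

mask returns a new Series, so c3 itself is left unchanged.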
Answered by Ralf
I suggest doing it in two steps:
# set fixed value to 'c2' where the condition is met
df.loc[df['c1'] == 'Value', 'c2'] = 10
# copy value from 'c3' to 'c2' where the condition is NOT met
df.loc[df['c1'] != 'Value', 'c2'] = df.loc[df['c1'] != 'Value', 'c3']
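The two steps can be run end to end as in the sketch below. The sample dataframe and the is_value helper name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1.5, 2.5, 3.5]})

# Compute the boolean row selector once and reuse it for both steps.
is_value = df['c1'] == 'Value'

# Step 1: fixed value where the condition is met (creates c2 if missing).
df.loc[is_value, 'c2'] = 10

# Step 2: copy from c3 where the condition is NOT met (~ negates the selector).
df.loc[~is_value, 'c2'] = df.loc[~is_value, 'c3']
print(df)
```

After step 1 the rows that did not match hold NaN in c2, which step 2 then fills from c3.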