Update row values where certain condition is met in pandas
Source: http://stackoverflow.com/questions/36909977/
This page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original address, and attribute the content to the original authors on StackOverflow.
Asked by Stanko
Say I have the following dataframe:
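   stream        feat  another_feat
a       1  some_value    some_value
b       2  some_value    some_value
c       2  some_value    some_value
d       3  some_value    some_value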
What is the most efficient way to update the values of the columns feat and another_feat where the stream is number 2?
Is this it?
for index, row in df1.iterrows():
    if df1.loc[index, 'stream'] == 2:
        pass  # do something
UPDATE: What should I do if I have more than 100 columns? I don't want to explicitly name the columns that I want to update. I want to divide the value of each column by 2 (except for the stream column).
So, to be clear about my goal:
Divide all values by 2 in every row that has stream 2, without changing the stream column.
Answered by jezrael
I think you can use loc if you need to update two columns to the same value:
df1.loc[df1['stream'] == 2, ['feat', 'another_feat']] = 'aaaa'
print(df1)
   stream        feat  another_feat
a       1  some_value    some_value
b       2        aaaa          aaaa
c       2        aaaa          aaaa
d       3  some_value    some_value
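For readers who want to run this end to end, here is a minimal self-contained sketch of the same .loc assignment. The sample frame is rebuilt from the printed output above, which is an assumption about the asker's data, and the variable name mask is only illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the printed output above (an assumption).
df1 = pd.DataFrame(
    {
        "stream": [1, 2, 2, 3],
        "feat": ["some_value"] * 4,
        "another_feat": ["some_value"] * 4,
    },
    index=list("abcd"),
)

# Boolean mask: True for the rows where stream equals 2.
mask = df1["stream"] == 2

# Assign one scalar to both columns, only in the masked rows.
df1.loc[mask, ["feat", "another_feat"]] = "aaaa"

print(df1)
```

Building the mask once and reusing it keeps the condition in one place if several assignments share it.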
If you need to update the columns separately, one option is:
df1.loc[df1['stream'] == 2, 'feat'] = 10
print(df1)
   stream        feat  another_feat
a       1  some_value    some_value
b       2          10    some_value
c       2          10    some_value
d       3  some_value    some_value
Another common option is to use numpy.where:
import numpy as np
# 10 where stream == 2, 20 in every other row
df1['feat'] = np.where(df1['stream'] == 2, 10, 20)
print(df1)
   stream  feat  another_feat
a       1    20    some_value
b       2    10    some_value
c       2    10    some_value
d       3    20    some_value
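Note that the numpy.where call above writes 20 into every row where the condition is False. If the other rows should keep their current values, one variation (a sketch, not part of the original answer) is to pass the column itself as the third argument. The sample data below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
df1 = pd.DataFrame(
    {"stream": [1, 2, 2, 3], "feat": [4, 4, 2, 1]},
    index=list("abcd"),
)

# Where stream == 2 take 10, otherwise keep the existing value of feat.
df1["feat"] = np.where(df1["stream"] == 2, 10, df1["feat"])

print(df1["feat"].tolist())  # [4, 10, 10, 1]
```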
EDIT: If you need to divide all columns except stream by 2 in the rows where the condition is True, use:
print(df1)
   stream  feat  another_feat
a       1     4             5
b       2     4             5
c       2     2             9
d       3     1             7
# select every column except stream
cols = [col for col in df1.columns if col != 'stream']
print(cols)
['feat', 'another_feat']
# the right-hand side is aligned on index and columns, so only the selected cells change
df1.loc[df1['stream'] == 2, cols] = df1 / 2
print(df1)
   stream  feat  another_feat
a       1   4.0           5.0
b       2   2.0           2.5
c       2   1.0           4.5
d       3   1.0           7.0
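The same idea can be written so that only the affected rows and columns are ever divided. The sketch below reuses the numbers from the printed frame above. Casting to float first is my own addition (an assumption about what is wanted), so that halved values such as 2.5 fit the column dtype.

```python
import pandas as pd

# Sample data copied from the printed frame above.
df1 = pd.DataFrame(
    {"stream": [1, 2, 2, 3], "feat": [4, 4, 2, 1], "another_feat": [5, 5, 9, 7]},
    index=list("abcd"),
)

# Every column label except 'stream'.
cols = [col for col in df1.columns if col != "stream"]

# Cast the value columns to float so fractional results are representable.
df1 = df1.astype({col: float for col in cols})

# Halve only the rows where stream == 2, restricted to the value columns.
mask = df1["stream"] == 2
df1.loc[mask, cols] = df1.loc[mask, cols] / 2

print(df1)
```

Because the column list is computed from df1.columns, this works unchanged with 100 or more columns.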
Answered by Thanos
You can do the same with .ix (deprecated in pandas 0.20 and removed in 1.0, where .loc is the replacement), like this:
In [1]: df = pd.DataFrame(np.random.randn(5,4), columns=list('abcd'))
In [2]: df
Out[2]:
          a         b         c         d
0 -0.323772  0.839542  0.173414 -1.341793
1 -1.001287  0.676910  0.465536  0.229544
2  0.963484 -0.905302 -0.435821  1.934512
3  0.266113 -0.034305 -0.110272 -0.720599
4 -0.522134 -0.913792  1.862832  0.314315
In [3]: df.ix[df.a>0, ['b','c']] = 0  # on pandas >= 1.0 use df.loc here, since .ix no longer exists
In [4]: df
Out[4]:
          a         b         c         d
0 -0.323772  0.839542  0.173414 -1.341793
1 -1.001287  0.676910  0.465536  0.229544
2  0.963484  0.000000  0.000000  1.934512
3  0.266113  0.000000  0.000000 -0.720599
4 -0.522134 -0.913792  1.862832  0.314315
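On current pandas versions the session above can be reproduced with .loc, which accepts the same boolean mask and column list. This is a sketch rather than the answerer's own code; the seeded generator and the name positive_a are my choices, and the random values will differ from those printed above.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is reproducible; the seed is arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 4)), columns=list("abcd"))

# Remember which rows match before modifying anything.
positive_a = df["a"] > 0

# Label-based equivalent of df.ix[df.a > 0, ['b', 'c']] = 0
df.loc[positive_a, ["b", "c"]] = 0

print(df)
```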
EDIT
Given the extra information in the question, the following returns all columns except a, for the rows where the condition is met, with their values halved:
>>> condition = df.a > 0
>>> df[condition][[i for i in df.columns.values if i not in ['a']]].apply(lambda x: x/2)
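The expression above builds and returns a new frame, so df itself is left unchanged. If the halved values should end up in df, one way (a sketch with made-up data; the names cols and halved are illustrative) is to assign the result back through .loc:

```python
import pandas as pd

# Hypothetical small frame for illustration.
df = pd.DataFrame(
    {"a": [-1.0, 2.0, 3.0], "b": [4.0, 6.0, 8.0], "c": [1.0, 3.0, 5.0]}
)

condition = df["a"] > 0
cols = [c for c in df.columns if c != "a"]

# Selecting and dividing builds a new frame; df is not modified here.
halved = df.loc[condition, cols] / 2

# Write the halved values back into the matching rows and columns.
df.loc[condition, cols] = halved

print(df)
```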
I hope this helps!