Python: Changing certain values in multiple columns of a pandas DataFrame at once

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/19867734/

Time: 2020-08-19 14:50:32  Source: igfitidea

Changing certain values in multiple columns of a pandas DataFrame at once

python pandas

Asked by dbliss

Suppose I have the following DataFrame:

In [1]: df
Out[1]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

This works as expected:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1   NaN      4    bad
2     2      5   good

But this doesn't:

In [2]: df[['apple', 'banana']][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

Why? How can I achieve the conversion of both the 'apple' and 'banana' values without having to write out two lines, as in

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df['banana'][df.cherry == 'bad'] = np.nan

Accepted answer by Andy Hayden

You should use loc and do this without chaining:

In [11]: df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan

In [12]: df
Out[12]: 
   apple  banana cherry
0      0       3   good
1    NaN     NaN    bad
2      2       5   good
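
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same single loc assignment. The frame is rebuilt from the question's data; storing the numeric columns as floats is my assumption (so that NaN fits the column dtype without an upcast), and the name bad_rows is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question. The numeric columns are
# stored as floats (an assumption) so that NaN fits their dtype.
df = pd.DataFrame({
    "apple": [0.0, 1.0, 2.0],
    "banana": [3.0, 4.0, 5.0],
    "cherry": ["good", "bad", "good"],
})

# Boolean row mask: True where cherry is 'bad'.
bad_rows = df["cherry"] == "bad"

# Row mask and column list go into one .loc call, so the assignment
# is applied to df itself rather than to an intermediate object.
df.loc[bad_rows, ["apple", "banana"]] = np.nan

print(df)
```

Only row 1 of apple and banana should end up as NaN; the cherry column is left alone.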

See the docs on returning a view versus a copy: if you chain, the assignment is made to a copy (and thrown away), but if you do it in a single loc call, pandas realises you want to assign to the original.
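
To make the copy behaviour concrete, here is a rough sketch that splits the chained form into its two steps and compares it with a single loc call. The names subset, bad_rows, nans_after_chained and nans_after_loc are mine, the float columns are an assumption so that NaN fits the dtype, and the warnings filter is only there because some pandas versions warn about this pattern.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "apple": [0.0, 1.0, 2.0],    # floats (an assumption) so NaN fits the dtype
    "banana": [3.0, 4.0, 5.0],
    "cherry": ["good", "bad", "good"],
})
bad_rows = df["cherry"] == "bad"

# Chained form, step 1: selecting a list of columns builds a new
# DataFrame that is separate from df.
subset = df[["apple", "banana"]]

# Chained form, step 2: the write goes into that new object. Depending on
# the pandas version this may emit a warning, which is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    subset[bad_rows] = np.nan

# Count NaNs in the original frame after the chained-style write.
nans_after_chained = int(df[["apple", "banana"]].isna().sum().sum())

# Single .loc call: rows and columns are both resolved on df itself.
df.loc[bad_rows, ["apple", "banana"]] = np.nan
nans_after_loc = int(df[["apple", "banana"]].isna().sum().sum())

print(nans_after_chained, nans_after_loc)
```

The first count should be 0 (the write landed on subset, not df) and the second should be 2.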

Answer by Roman Pekar

It's because df[['apple', 'banana']][df.cherry == 'bad'] = np.nan is assigning to a copy of the DataFrame. Try this:

# .ix was removed in pandas 1.0; .loc is the label-based equivalent here
df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan