Python Pandas DataFrame: replace all values in a column based on a condition
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/31511997/
Pandas DataFrame: replace all values in a column, based on condition
Asked by ichimok
I have a simple DataFrame like the following:
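The DataFrame can be rebuilt from the accepted answer's output, with 1996 for Baltimore Ravens as stated in the question. A minimal sketch, assuming a default integer index; the values are copied from this page, not from any other source:

```python
import pandas as pd

# Reconstructed from the Out[41] table further down; the 1996 value for
# Baltimore Ravens is taken from the question text.
df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Chicago Bears", "Green Bay Packers",
             "Miami Dolphins", "Baltimore Ravens", "San Franciso 49ers"],
    "First Season": [1960, 1920, 1921, 1966, 1996, 1950],
    "Total Games": [894, 1357, 1339, 792, 326, 1003],
})
print(df)
```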
I want to select all values from the 'First Season' column and replace those that are over 1990 with 1. In this example, only Baltimore Ravens would have its 1996 replaced by 1 (keeping the rest of the data intact).
I have used the following:
df.loc[(df['First Season'] > 1990)] = 1
But it replaces all the values in that row with 1, not just the value in the 'First Season' column.
How can I replace just the values in that column?
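The behaviour described above follows from passing only a row indexer to .loc: every column of the matching rows is assigned. A small sketch of the effect, using a hypothetical numeric-only frame (nums) to keep the example simple:

```python
import pandas as pd

# Hypothetical two-row, numeric-only frame for illustration.
nums = pd.DataFrame({
    "First Season": [1960, 1996],
    "Total Games": [894, 326],
})

# Only a row mask is given, so the assignment covers ALL columns
# of the rows where the mask is True.
nums.loc[nums["First Season"] > 1990] = 1
print(nums)
```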
Accepted answer by EdChum
You need to select that column:
In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df
Out[41]:
                 Team  First Season  Total Games
0      Dallas Cowboys          1960          894
1       Chicago Bears          1920         1357
2   Green Bay Packers          1921         1339
3      Miami Dolphins          1966          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers          1950         1003
So the syntax here is:
df.loc[<mask>, <optional column(s)>]
Here the mask generates the row labels to index. You can check the docs and also 10 minutes to pandas, which shows the semantics.
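The two slots of df.loc can be sketched as follows. This is an illustration on a hypothetical two-row frame; the names mask and both are introduced here and are not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Baltimore Ravens"],
    "First Season": [1960, 1996],
    "Total Games": [894, 326],
})

# Row selector: a boolean Series aligned with df's index.
mask = df["First Season"] > 1990

# A single column label: only 'First Season' changes in the selected rows.
df.loc[mask, "First Season"] = 1

# A list of labels targets several columns at once (done on a copy here).
both = df.copy()
both.loc[mask, ["First Season", "Total Games"]] = 0
```

Naming the mask first keeps the row condition and the column target visibly separate.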
EDIT
If you want to generate a boolean indicator, you can use the boolean condition to produce a boolean Series and cast its dtype to int; this converts True and False to 1 and 0 respectively:
In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df
Out[43]:
                 Team  First Season  Total Games
0      Dallas Cowboys             0          894
1       Chicago Bears             0         1357
2   Green Bay Packers             0         1339
3      Miami Dolphins             0          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers             0         1003
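The indicator above overwrites the original years. If those should be kept, the same expression can be assigned to a separate column instead. A minimal sketch; the column name 'After 1990' is a hypothetical choice, not something from the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Baltimore Ravens"],
    "First Season": [1960, 1996],
})

# Comparison yields a boolean Series; astype(int) maps True/False to 1/0.
# Writing to a new column leaves 'First Season' untouched.
df["After 1990"] = (df["First Season"] > 1990).astype(int)
```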
Answer by Amir F
A bit late to the party, but still, I prefer using numpy's where:
import numpy as np
df['First Season'] = np.where(df['First Season'] > 1990, 1, df['First Season'])
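For a self-contained version of this approach, the sketch below runs np.where on a hypothetical three-row frame and, for comparison, shows pandas' own Series.mask, which expresses the same replacement without NumPy. The variable names are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Baltimore Ravens", "Miami Dolphins"],
    "First Season": [1960, 1996, 1966],
})
cond = df["First Season"] > 1990

# np.where(cond, a, b): take a where cond is True, otherwise b.
# The result is a NumPy array, not a Series.
via_numpy = np.where(cond, 1, df["First Season"])

# Series.mask(cond, other): replace values where cond is True.
# The result is a Series that keeps the original index.
via_mask = df["First Season"].mask(cond, 1)

df["First Season"] = via_numpy
```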
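Answer by Amal Varghese
# A column name containing a space cannot be accessed as an attribute (df.First Season), so use brackets
df['First Season'].loc[df['First Season'] > 1990] = 1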
Answer by Odz
df['First Season'].loc[(df['First Season'] > 1990)] = 1
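Strange that nobody has this answer: the only part missing from your code is the ['First Season'] right after df; also remove the extra parentheses inside the brackets. Note that this is chained assignment: pandas may warn about it, and with Copy-on-Write enabled (the default from pandas 3.0) it leaves df unchanged, whereas df.loc[mask, 'First Season'] = 1 does the assignment in a single step.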
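A single .loc call that carries both the row mask and the column label avoids the chained form entirely. A minimal sketch wrapping it in a function; replace_over is a hypothetical helper name introduced here, and working on a copy is a design choice of this sketch, not part of any answer above:

```python
import pandas as pd

def replace_over(frame, column, threshold, value):
    """Return a copy of frame with entries of column above threshold set to value."""
    out = frame.copy()
    # One indexing step: rows from the mask, one column by label.
    out.loc[out[column] > threshold, column] = value
    return out

df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Baltimore Ravens"],
    "First Season": [1960, 1996],
})
result = replace_over(df, "First Season", 1990, 1)
```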