Python Pandas DataFrame: replace all values in a column, based on condition

Notice: this page is a Chinese/English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/31511997/

Tags: python, pandas, dataframe

Asked by ichimok

I have a simple DataFrame like the following:

[Image: Pandas DataFrame]

I want to select all values from the 'First Season' column and replace those greater than 1990 with 1. In this example, only the Baltimore Ravens row would have 1996 replaced by 1 (keeping the rest of the data intact).

I have used the following:

df.loc[(df['First Season'] > 1990)] = 1

But it replaces all the values in that row with 1, not just the value in the 'First Season' column.

How can I replace just the values from that column?

Accepted answer by EdChum

You need to select that column:

In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df

Out[41]:
                 Team  First Season  Total Games
0      Dallas Cowboys          1960          894
1       Chicago Bears          1920         1357
2   Green Bay Packers          1921         1339
3      Miami Dolphins          1966          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers          1950         1003

So the syntax here is:

df.loc[<mask>, <optional column(s)>]    (here the mask generates the labels to index)

You can check the docs and also the "10 minutes to pandas" guide, which shows the semantics.
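
For readers who want to run this end to end, here is a minimal self-contained sketch. The DataFrame is reconstructed from the output shown above, with 1996 as the original Baltimore Ravens value mentioned in the question. The variable name mask is only an illustrative choice.

```python
import pandas as pd

# Data reconstructed from the output shown in the answer; 1996 is the
# original Baltimore Ravens value mentioned in the question.
df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Chicago Bears", "Green Bay Packers",
             "Miami Dolphins", "Baltimore Ravens", "San Franciso 49ers"],
    "First Season": [1960, 1920, 1921, 1966, 1996, 1950],
    "Total Games": [894, 1357, 1339, 792, 326, 1003],
})

# Boolean mask: True for rows whose first season is after 1990.
mask = df["First Season"] > 1990

# Row selector plus column selector: only 'First Season' is modified,
# and only in the rows where the mask is True.
df.loc[mask, "First Season"] = 1

print(df)
```

Passing the column label as the second argument to .loc is what restricts the assignment to that column. Without it, the assignment applies to every column of the selected rows.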

EDIT

If you want to generate a boolean indicator, you can use the boolean condition to generate a boolean Series and cast its dtype to int; this converts True and False to 1 and 0 respectively:

In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df

Out[43]:
                 Team  First Season  Total Games
0      Dallas Cowboys             0          894
1       Chicago Bears             0         1357
2   Green Bay Packers             0         1339
3      Miami Dolphins             0          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers             0         1003
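
Note that the assignment above overwrites the original years. If the years should be kept, one possible variation is to write the indicator to a separate column. The sketch below uses a three-row subset of the data, and the column name 'After 1990' is a hypothetical choice that does not come from the original answer.

```python
import pandas as pd

# Small subset of the example data, enough to show both outcomes.
df = pd.DataFrame({
    "Team": ["Chicago Bears", "Miami Dolphins", "Baltimore Ravens"],
    "First Season": [1920, 1966, 1996],
})

# 'After 1990' is a hypothetical column name chosen for this sketch.
# The comparison yields a boolean Series; astype(int) maps
# True -> 1 and False -> 0.
df["After 1990"] = (df["First Season"] > 1990).astype(int)

print(df)
```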

Answer by Amir F

A bit late to the party, but I prefer using numpy's where function:

import numpy as np
df['First Season'] = np.where(df['First Season'] > 1990, 1, df['First Season'])
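
As a small runnable illustration of the call above, the sketch below applies np.where to a three-value Series and compares it with pandas' own Series.mask, which is an alternative not mentioned in the answer. The names seasons, via_numpy and via_mask are illustrative only.

```python
import numpy as np
import pandas as pd

seasons = pd.Series([1960, 1920, 1996], name="First Season")

# np.where(condition, value_if_true, value_if_false) returns a NumPy
# array, which can be assigned straight back to a DataFrame column.
via_numpy = np.where(seasons > 1990, 1, seasons)

# Series.mask replaces values where the condition is True and returns
# a Series, so the index and name are preserved.
via_mask = seasons.mask(seasons > 1990, 1)

print(via_numpy)
print(via_mask)
```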

Answer by Amal Varghese

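# The column name contains a space, so attribute access (df.First Season) is a syntax error; use brackets instead.
df['First Season'].loc[df['First Season'] > 1990] = 1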

Answer by Odz

df['First Season'].loc[(df['First Season'] > 1990)] = 1

Strange that nobody has given this answer: the only part missing from your code is the ['First Season'] right after df, and you can also remove the inner parentheses.
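
One caveat on the chained form, offered as a hedged note rather than a correction: df['First Season'].loc[...] = 1 indexes in two steps, and whether the assignment reaches the original DataFrame depends on the pandas version and its copy-on-write settings. The pandas documentation describes copy-on-write as the default from pandas 3.0, and chained assignment does not update the original under it. The sketch below keeps the chained line as a comment and runs the single-step .loc form on a two-row subset of the data.

```python
import pandas as pd

df = pd.DataFrame({
    "Team": ["Miami Dolphins", "Baltimore Ravens"],
    "First Season": [1966, 1996],
})

mask = df["First Season"] > 1990

# Chained form (two indexing steps); version-dependent, so it is left
# here as a comment rather than executed:
#     df["First Season"].loc[mask] = 1

# Single-step form: one .loc call selects the rows and the column.
df.loc[mask, "First Season"] = 1

result = df["First Season"].tolist()
print(result)
```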
