pandas: creating a new column based on a condition, using values from another column in Python

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/37683997/

Time: 2020-09-14 01:20:43  Source: igfitidea

Creating a new column based on condition with values from another column in python

python, if-statement, pandas, dataframe

Asked by Muhammad

I have a DataFrame and would like to create a new column based on a condition: if the condition is met, the value in the new column should come from another column; otherwise it should be zero. The original DataFrame is:

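import pandas as pd

# Raw string, so the backslashes in the Windows path are not read as escape sequences
df2 = pd.read_csv(r'C:\Users\ABC.csv')
df2['Date'] = pd.to_datetime(df2['Date'])
df2['Hour'] = df2.Date.dt.hour
df2['Occupied'] = ''  # empty placeholder column, to be filled in later

Date                 Value  Hour    Occupied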
2016-02-02 21:00:00  0.6    21  
2016-02-02 22:00:00  0.4    22  
2016-02-02 23:00:00  0.4    23  
2016-02-03 00:00:00  0.3    0   
2016-02-03 01:00:00  0.2    1   
2016-02-03 02:00:00  0.2    2   
2016-02-03 03:00:00  0.1    3   
2016-02-03 04:00:00  0.2    4   
2016-02-03 05:00:00  0.1    5   
2016-02-03 06:00:00  0.4    6
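
For anyone who wants to reproduce this without the CSV file, here is a minimal sketch that builds the same ten rows in memory. It assumes the file contains only the Date and Value columns shown in the table; the real file may well hold more.

```python
import pandas as pd

# Rows copied from the table above; the actual CSV contents are an assumption.
df2 = pd.DataFrame({
    'Date': [
        '2016-02-02 21:00:00', '2016-02-02 22:00:00', '2016-02-02 23:00:00',
        '2016-02-03 00:00:00', '2016-02-03 01:00:00', '2016-02-03 02:00:00',
        '2016-02-03 03:00:00', '2016-02-03 04:00:00', '2016-02-03 05:00:00',
        '2016-02-03 06:00:00',
    ],
    'Value': [0.6, 0.4, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2, 0.1, 0.4],
})

# Parse the strings into timestamps, then pull out the hour of day.
df2['Date'] = pd.to_datetime(df2['Date'])
df2['Hour'] = df2['Date'].dt.hour

print(df2)
```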

I would like the Occupied column to hold the same value as df2.Value where df2.Hour is greater than or equal to 9, and zero otherwise. I have tried the following code, but it does not work as I would like (it produces the same values as df2.Value and appears to ignore the else branch):

for i in df2['Hour']:
    if i >= 9:
        df2['Occupied'] = df2.Value
    else:
        df2['Occupied'] = 0

Any idea what is wrong with this?
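
One likely explanation, sketched on a small made-up frame: each pass through the loop assigns the entire Occupied column rather than a single row, so only the last value of Hour decides the final result. The three-row frame below is hypothetical and is not the data from the question.

```python
import pandas as pd

# Hypothetical data: the middle row has Hour < 9 and should end up as 0.
df = pd.DataFrame({'Hour': [21, 3, 10], 'Value': [0.6, 0.1, 0.4]})

for i in df['Hour']:
    if i >= 9:
        df['Occupied'] = df['Value']  # overwrites every row of the column
    else:
        df['Occupied'] = 0            # also overwrites every row of the column

# The last Hour is 10 (>= 9), so the final pass copies Value into all rows,
# including the row where Hour is 3.
print(df)
```

If the last row of the real data has an hour of 9 or later, this would match the behaviour described in the question.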


Answered by EdChum

Use where with your boolean condition; this sets the values for all rows at once rather than iterating row-wise:

In [120]:
df2['Occupied'] = df2['Value'].where(df2['Hour'] >= 9, 0)
df2

Out[120]:
                 Date  Value  Hour  Occupied
0 2016-02-02 21:00:00    0.6    21       0.6
1 2016-02-02 22:00:00    0.4    22       0.4
2 2016-02-02 23:00:00    0.4    23       0.4
3 2016-02-03 00:00:00    0.3     0       0.0
4 2016-02-03 01:00:00    0.2     1       0.0
5 2016-02-03 02:00:00    0.2     2       0.0
6 2016-02-03 03:00:00    0.1     3       0.0
7 2016-02-03 04:00:00    0.2     4       0.0
8 2016-02-03 05:00:00    0.1     5       0.0
9 2016-02-03 06:00:00    0.4     6       0.0
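
As a self-contained sketch of the same idea, the snippet below applies where to a small illustrative frame and compares it with two other common vectorized spellings, numpy.where and a boolean .loc assignment. The frame and the column names Occupied_np and Occupied_loc are made up for this example; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Illustrative data only (not the frame from the question).
df = pd.DataFrame({'Hour': [21, 22, 0, 6, 9],
                   'Value': [0.6, 0.4, 0.3, 0.4, 0.5]})
cond = df['Hour'] >= 9  # boolean Series, one entry per row

# Series.where keeps Value where cond is True and substitutes 0 elsewhere.
df['Occupied'] = df['Value'].where(cond, 0)

# numpy.where(cond, a, b) takes a where cond is True and b otherwise.
df['Occupied_np'] = np.where(cond, df['Value'], 0)

# Start from 0.0 everywhere, then fill in only the rows that meet the condition.
df['Occupied_loc'] = 0.0
df.loc[cond, 'Occupied_loc'] = df.loc[cond, 'Value']

print(df)
```

All three columns should come out identical here; which form to use is mostly a matter of taste, although where reads closest to the wording of the question.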