Convert Pandas DataFrame Column From String to Int Based on Conditional

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me), citing the original address and author information. Original StackOverflow question: http://stackoverflow.com/questions/31790287/

Date: 2020-08-19 10:32:23  Source: igfitidea

python pandas dataframe

Asked by Adam_G

I have a dataframe df that looks like this:

viz  a1_count  a1_mean     a1_std
n         3        2   0.816497
y         0      NaN        NaN 
n         2       51  50.000000

I want to convert the "viz" column to 0 and 1, based on a conditional. I've tried:

df['viz'] = 0 if df['viz'] == "n" else 1

but I get:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Accepted answer by EdChum

You're comparing a scalar with the entire series, so the result is a boolean series with one value per row. The if/else expression needs a single truth value, which raises the ValueError you saw. A simple method would be to cast the boolean series to int:

In [84]:
df['viz'] = (df['viz'] != 'n').astype(int)
df

Out[84]:
   viz  a1_count  a1_mean     a1_std
0    0         3        2   0.816497
1    1         0      NaN        NaN
2    0         2       51  50.000000
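
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the table in the question (the construction code and the names raised and viz_as_int are my own, not the asker's), and it shows both the failing expression and the element-wise cast:

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (values copied from the table shown there).
df = pd.DataFrame({
    "viz": ["n", "y", "n"],
    "a1_count": [3, 0, 2],
    "a1_mean": [2, np.nan, 51],
    "a1_std": [0.816497, np.nan, 50.0],
})

# The original attempt: `if` needs one truth value, but the comparison
# returns a boolean Series with one entry per row, so pandas raises.
try:
    df["viz"] = 0 if df["viz"] == "n" else 1
    raised = False
except ValueError:
    raised = True

# Element-wise fix: compare first, then cast False/True to 0/1.
viz_as_int = (df["viz"] != "n").astype(int)
```

Because the failing line raises before the assignment happens, df['viz'] still holds the original strings when the cast runs.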

You can also use np.where:

In [86]:
df['viz'] = np.where(df['viz'] == 'n', 0, 1)
df

Out[86]:
   viz  a1_count  a1_mean     a1_std
0    0         3        2   0.816497
1    1         0      NaN        NaN
2    0         2       51  50.000000
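
As a standalone sketch of the np.where approach, the snippet below uses a small stand-in Series instead of the full frame. It also shows Series.map with a dict, which is an alternative I'm adding here rather than part of the answer above; it can be handy when there are more than two labels.

```python
import numpy as np
import pandas as pd

# Stand-in for df['viz'] from the question.
viz = pd.Series(["n", "y", "n"], name="viz")

# np.where picks 0 where the condition holds and 1 everywhere else.
via_where = np.where(viz == "n", 0, 1)

# Series.map looks each label up in the dict; a label missing from the
# dict would come back as NaN instead of silently becoming 1.
via_map = viz.map({"n": 0, "y": 1})
```

Either way, this should be run on the original string column. Once viz already holds 0/1, the comparison with 'n' no longer matches any row and np.where would return 1 for every row.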

Output from the boolean comparison:

In [89]:
df['viz'] != 'n'

Out[89]:
0    False
1     True
2    False
Name: viz, dtype: bool

And then casting to int:

In [90]:
(df['viz'] != 'n').astype(int)

Out[90]:
0    0
1    1
2    0
Name: viz, dtype: int32
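
The int32 in that last output depends on the platform: astype(int) uses NumPy's default integer, which as far as I know has been 32-bit on Windows with older NumPy versions and is 64-bit on most other setups. If the exact width matters, one option is to name it explicitly. A minimal sketch, with a stand-in Series in place of df['viz'] and a made-up variable name flags:

```python
import pandas as pd

# Stand-in for df['viz'] from the question.
viz = pd.Series(["n", "y", "n"], name="viz")

# Naming the width explicitly avoids depending on the platform's default int.
flags = (viz != "n").astype("int64")
```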