Pandas modify column values in place based on boolean array
Original question: http://stackoverflow.com/questions/23400743/
Note: this page reproduces a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Asked by prooffreader
I know how to create a new column with apply or np.where based on the values of another column, but a way of selectively changing the values of an existing column is escaping me; I suspect df.ix is involved? Am I close?
For example, here's a simple dataframe (mine has tens of thousands of rows). I would like to change the value in the 'flag' column (let's say to 'Blue') if the name ends with the letter 'e':
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'name': ['Mick', 'John', 'Christine', 'Stevie', 'Lindsey'],
...                    'flag': ['Purple', 'Red', np.nan, np.nan, np.nan]})[['name', 'flag']]
>>> print(df)
        name    flag
0       Mick  Purple
1       John     Red
2  Christine     NaN
3     Stevie     NaN
4    Lindsey     NaN
[5 rows x 2 columns]
I can make a boolean series from my criteria:
>>> boolean_result = df.name.str.contains('e$')
>>> print(boolean_result)
0    False
1    False
2     True
3     True
4    False
Name: name, dtype: bool
I just need the crucial step to get the following result:
>>> print(result_wanted)
        name    flag
0       Mick  Purple
1       John     Red
2  Christine    Blue
3     Stevie    Blue
4    Lindsey     NaN
Answered by U2EF1
df['flag'][df.name.str.contains('e$')] = 'Blue'
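The one-liner above uses chained indexing (df['flag'][...] = ...). In some pandas versions that form can emit a SettingWithCopyWarning, and with copy-on-write enabled it is not expected to modify df at all. Below is a minimal sketch of the same assignment done through a single .loc call. It is an alternative spelling, not the answer author's code, and the variable name mask is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    'name': ['Mick', 'John', 'Christine', 'Stevie', 'Lindsey'],
    'flag': ['Purple', 'Red', np.nan, np.nan, np.nan],
})[['name', 'flag']]

# Boolean mask: True where the name ends with 'e' ('$' anchors the regex at the end).
mask = df['name'].str.contains('e$')

# Select the rows by mask and the 'flag' column in one .loc call, so the
# assignment writes into df itself rather than into an intermediate object.
df.loc[mask, 'flag'] = 'Blue'

print(df)
```

If the criterion is just a fixed suffix, df['name'].str.endswith('e') should produce the same mask without a regular expression.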

