pandas: Change values in a pandas DataFrame column
Original source: http://stackoverflow.com/questions/46830231/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Change Values in pandas dataframe column
Asked by sataide
I have a dataframe filled with several columns. I need to change the values of a column for data normalization like in the following example:
User_id
751730951
751730951
0
163526844
...and so on
I need to replace every value in the column that is not 0 (a string) with something like "is not empty". I have been trying for hours but still cannot change every value that is not 0 into something else. The replace() function doesn't work well for that. Any good ideas?
EDIT (my solution):
finalResult.loc[finalResult['update_user'] == '0', 'update_user'] = 'empty'
finalResult.loc[finalResult['update_user'] != 'empty', 'update_user'] = 'not empty'
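The two .loc assignments above can also be collapsed into one vectorized step. The following is a minimal sketch, assuming a small made-up finalResult frame with string values (the real data is not shown in the question). numpy.where is a swapped-in alternative here, not what the author used:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real finalResult is not shown in the question.
finalResult = pd.DataFrame(
    {"update_user": ["751730951", "751730951", "0", "163526844"]}
)

# Label '0' as 'empty' and every other value as 'not empty' in one pass.
finalResult["update_user"] = np.where(
    finalResult["update_user"] == "0", "empty", "not empty"
)

print(finalResult["update_user"].tolist())
```

Because both labels are assigned in a single expression, the result does not depend on the order of two separate assignments.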
Answered by Aziz Javed
df.loc[df['mycolumn'] != '0', 'mycolumn'] = 'not empty'
or if the value is an int,
df.loc[df['mycolumn'] != 0, 'mycolumn'] = 'not empty'
df.loc[rows, cols] allows you to get or set a range of values in your DataFrame. The first parameter is rows; in this case I'm using a boolean mask to get all rows that don't have a 0 in mycolumn. The second parameter is the column you want to get/set. Since I'm replacing the same column I queried from, it is also mycolumn.
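To make the get/set behaviour of the boolean mask concrete, here is a small sketch. The sample frame is made up to mirror the User_id values in the question, and the values are assumed to be strings:

```python
import pandas as pd

# Hypothetical sample data mirroring the question's values (as strings).
df = pd.DataFrame({"mycolumn": ["751730951", "751730951", "0", "163526844"]})

# Boolean mask: True for every row whose value is not the string '0'.
mask = df["mycolumn"] != "0"
print(mask.tolist())

# Get: select only the masked rows of one column (returns a new Series).
selected = df.loc[mask, "mycolumn"]
print(selected.tolist())

# Set: overwrite the same rows of the same column in place.
df.loc[mask, "mycolumn"] = "not empty"
print(df["mycolumn"].tolist())
```

The same mask serves both purposes: on the right-hand side of an expression it selects rows, and on the left-hand side of an assignment it decides which rows are overwritten.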
I then simply use the assignment operator to assign the value 'not empty', as you wanted.
New column containing 'not empty'
If you want a new column to contain the 'not empty' value so you're not contaminating your original data in mycolumn, you can do:
df.loc[df['mycolumn'] != 0, 'myNewColumnsName'] = 'not empty'
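One thing worth knowing about this form: rows that do not match the mask end up as NaN in the new column. A minimal sketch on made-up integer data; the 'empty' label used to fill the gaps is an assumption for illustration, not part of the answer:

```python
import pandas as pd

# Hypothetical sample data with integer values.
df = pd.DataFrame({"mycolumn": [751730951, 751730951, 0, 163526844]})

# Create the new column; only rows where mycolumn != 0 receive a value.
df.loc[df["mycolumn"] != 0, "myNewColumnsName"] = "not empty"

# Rows that did not match the mask are left as NaN.
missing = df["myNewColumnsName"].isna().tolist()
print(missing)

# Optionally fill the gaps with a label ('empty' is an assumed choice).
df["myNewColumnsName"] = df["myNewColumnsName"].fillna("empty")
print(df)
```

The original mycolumn values stay untouched, which is the point of writing to a separate column.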
Answered by jezrael
The simplest is to use:
df['User_id'] = df['User_id'].replace('0', 'is not empty')
If 0 is an int:
df['User_id'] = df['User_id'].replace(0, 'is not empty')
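Keep in mind that replace acts on the values that equal its first argument, so the calls above relabel the 0 entries themselves. The sketch below, on made-up string data, shows that result next to a Series.mask call that targets the non-zero values instead. The mask variant is a swapped-in alternative, not part of this answer:

```python
import pandas as pd

# Hypothetical sample data mirroring the question's values (as strings).
df = pd.DataFrame({"User_id": ["751730951", "751730951", "0", "163526844"]})

# replace() swaps only the values equal to '0'.
replaced = df["User_id"].replace("0", "is not empty")
print(replaced.tolist())

# Series.mask() replaces values where the condition is True,
# here every value that is not '0'.
masked = df["User_id"].mask(df["User_id"] != "0", "is not empty")
print(masked.tolist())
```

Which of the two fits depends on whether the label should land on the zeros or on everything else.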
Answered by karen
Suppose we have a Series named user_id holding the data specified in the question. A single line then does what you need:
user_id.where(user_id == 0).fillna('is not empty')
I don't like loc very much since I think it makes the code harder to read.
It might be better than replace because it allows the opposite case:
user_id.where(user_id != 0).fillna('is empty')
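Series.where also accepts the replacement as its second argument, which makes the fillna step unnecessary. A short sketch of both directions, assuming integer values as in the lines above; the sample Series is made up:

```python
import pandas as pd

# Hypothetical sample data with integer values.
user_id = pd.Series([751730951, 751730951, 0, 163526844])

# Keep the zeros, replace every other value.
not_empty = user_id.where(user_id == 0, "is not empty")
print(not_empty.tolist())

# Opposite case: keep the non-zero values, replace the zeros.
is_empty = user_id.where(user_id != 0, "is empty")
print(is_empty.tolist())
```

Passing the replacement directly also leaves any NaN values that survive the condition untouched, whereas a trailing fillna would overwrite them as well.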