Pandas: how can 'replace' work after 'loc'?
Source: http://stackoverflow.com/questions/48314971/
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the source address and author information, and attribute it to the original authors (not me): Stack Overflow
Asked by Jonathan Zhou
I have tried many times, but 'replace' does not seem to work after using 'loc'. For example, I want to apply a regex replacement to 'conlumn_b', but only in the rows where 'conlumn_a' is 'apple'.
Here is my sample code:
df.loc[df['conlumn_a'] == 'apple', 'conlumn_b'].replace(r'^11*', 'XXX',inplace=True, regex=True)
Example:
conlumn_a conlumn_b
apple 123
banana 11
apple 11
orange 33
The result I expect for 'df' is:
conlumn_a conlumn_b
apple 123
banana 11
apple XXX
orange 33
Has anyone run into this problem of needing 'replace' with a regex after 'loc'?
Or do you have any other good solutions?
Thank you so much for your help!
Accepted answer by jezrael
I think you need to filter on both sides of the assignment:
m = df['conlumn_a'] == 'apple'
df.loc[m,'conlumn_b'] = df.loc[m,'conlumn_b'].astype(str).replace(r'^(11+)','XXX',regex=True)
print (df)
conlumn_a conlumn_b
0 apple 123
1 banana 11
2 apple XXX
3 orange 33
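As a self-contained sketch of the same mask-on-both-sides idea, the snippet below rebuilds the sample frame from the question. One assumption that is not in the answer above: the whole column is cast to strings before the assignment, so the strings written back match the column's dtype (recent pandas releases warn about, and may reject, writing strings into an integer column).

```python
import pandas as pd

# Sample data from the question; conlumn_b starts out as integers.
df = pd.DataFrame({
    "conlumn_a": ["apple", "banana", "apple", "orange"],
    "conlumn_b": [123, 11, 11, 33],
})

# Assumption: convert the whole column to strings first, so that the
# replaced values assigned below have the same dtype as the column.
df["conlumn_b"] = df["conlumn_b"].astype(str)

m = df["conlumn_a"] == "apple"

# Read the masked rows on the right, write the same masked rows on the left.
df.loc[m, "conlumn_b"] = df.loc[m, "conlumn_b"].replace(r"^(11+)", "XXX", regex=True)

print(df)
```

Only the second 'apple' row changes; 'banana' keeps its '11' because the mask excludes it.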
Answer by cs95
inplace=True works on the object it is called on.
When you call .loc, you're slicing your dataframe object to return a new one.
>>> id(df)
4587248608
And,
>>> id(df.loc[df['conlumn_a'] == 'apple', 'conlumn_b'])
4767716968
Now, calling an in-place replace on this new slice will apply the replace operation to the new slice itself, and not to the original.
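A minimal sketch of that behaviour, assuming a string-valued conlumn_b so that the regex can match at all. The variable name subset is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "conlumn_a": ["apple", "banana", "apple", "orange"],
    "conlumn_b": ["123", "11", "11", "33"],  # assumption: strings, not ints
})

# Boolean-mask selection with .loc hands back a new Series object.
subset = df.loc[df["conlumn_a"] == "apple", "conlumn_b"]

# The in-place replace modifies that new Series only.
subset.replace(r"^11$", "XXX", regex=True, inplace=True)

print(subset.tolist())            # the slice was changed
print(df["conlumn_b"].tolist())   # the original frame was not
```

In the question's one-liner the slice is never bound to a name, so the modified copy is simply discarded.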
Now, note that you're calling replace on a column of int, and nothing is going to happen, because regular expressions work on strings.
Here's what I offer you as a workaround. Don't use regex at all.
m = df['conlumn_a'] == 'apple'
df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b'].replace(11, 'XXX')
df
conlumn_a conlumn_b
0 apple 123
1 banana 11
2 apple XXX
3 orange 33
Or, if you need regex-based substitution, then:
df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b']\
.astype(str).replace('^11$', 'XXX', regex=True)
Note, though, that this converts your column to an object column.
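The three patterns that appear on this page are not interchangeable. The sketch below compares them with Python's re module, on the assumption that pandas applies regex replacements to strings with the same semantics. The value '113' is a hypothetical extra input that is not in the question's data.

```python
import re

patterns = {
    "question": r"^11*",   # a leading '1' followed by zero or more '1's
    "jezrael": r"^(11+)",  # two or more leading '1's
    "cs95": r"^11$",       # exactly the string '11'
}

# '123' and '11' come from the question; '113' is a hypothetical extra case.
values = ["123", "11", "113"]

results = {
    name: [re.sub(pattern, "XXX", value) for value in values]
    for name, pattern in patterns.items()
}

for name, replaced in results.items():
    print(name, replaced)
# question ['XXX23', 'XXX', 'XXX3']
# jezrael ['123', 'XXX', 'XXX3']
# cs95 ['123', 'XXX', '113']
```

So the pattern from the question would also rewrite the leading '1' of '123', which is not the expected result, while '^11$' only touches values that are exactly '11'.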
Answer by piRSquared
I'm going to borrow from a recent answer of mine. This technique is a general purpose strategy for updating a dataframe in place:
df.update(
df.loc[df['conlumn_a'] == 'apple', 'conlumn_b']
.replace(r'^11$', 'XXX', regex=True)
)
df
conlumn_a conlumn_b
0 apple 123
1 banana 11
2 apple XXX
3 orange 33
Note that all I did was remove the inplace=True and instead wrap the expression in the pd.DataFrame.update method.
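For reference, a runnable sketch of the update approach on the question's sample data. It assumes conlumn_b already holds strings, which the snippet above does not state; the intermediate name replaced is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "conlumn_a": ["apple", "banana", "apple", "orange"],
    "conlumn_b": ["123", "11", "11", "33"],  # assumption: strings, not ints
})

# Build the replaced values for the 'apple' rows only (no inplace=True).
replaced = (
    df.loc[df["conlumn_a"] == "apple", "conlumn_b"]
    .replace(r"^11$", "XXX", regex=True)
)

# update() aligns on the index and the Series name ('conlumn_b') and
# overwrites the matching cells of df in place.
df.update(replaced)

print(df)
```

Rows outside the mask are absent from replaced, so update leaves them untouched.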