pandas: DataFrame.replace() with regex

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/32201222/

Time: 2020-09-13 23:48:30  Source: igfitidea

python string pandas replace floating-point

Asked by Boosted_d16

I have a table which looks like this:

import pandas as pd
df_raw = pd.DataFrame(dict(A = pd.Series(['1.00','-1']), B = pd.Series(['1.0','-45.00','-'])))

    A       B
0   1.00    1.0
1   -1      -45.00
2   NaN     -

I would like to replace '-' with '0.00' using DataFrame.replace(), but it fails because of the negative values '-1' and '-45.00'.

How can I ignore the negative values and replace only '-' with '0.00'?

My code:

import numpy as np
df_raw = df_raw.replace(['-','\*'], ['0.00','0.00'], regex=True).astype(np.float64)

Error:

ValueError: invalid literal for float(): 0.0045.00

Answered by EdChum

Your regex is matching on all '-' characters:

In [48]:
df_raw.replace(['-','\*'], ['0.00','0.00'], regex=True)

Out[48]:
       A          B
0   1.00        1.0
1  0.001  0.0045.00
2    NaN       0.00

If you add anchors so that the pattern only matches a string consisting of that single character, it works as expected:

In [47]:
df_raw.replace(['^-$'], ['0.00'], regex=True)

Out[47]:
      A       B
0  1.00     1.0
1    -1  -45.00
2   NaN    0.00

Here ^ means start of string and $ means end of string, so it will only match on that single character.
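
The effect of the anchors can be checked outside pandas with Python's re module. The following is only a minimal sketch: the list named values is the sample column B from the question, and the names unanchored and anchored are made up for illustration.

```python
import re

# Sample values taken from column B of the question.
values = ['1.0', '-45.00', '-']

# An unanchored pattern replaces every '-' it finds, even inside '-45.00'.
unanchored = [re.sub('-', '0.00', v) for v in values]

# Anchoring with ^ and $ restricts the match to strings that are exactly '-'.
anchored = [re.sub('^-$', '0.00', v) for v in values]

print(unanchored)  # ['1.0', '0.0045.00', '0.00']
print(anchored)    # ['1.0', '-45.00', '0.00']
```

The unanchored result reproduces the '0.0045.00' string from the ValueError in the question.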

Or you can just use replace without regex=True, which only replaces cells that match exactly:

In [29]:
df_raw.replace('-',0)

Out[29]:
      A       B
0  1.00     1.0
1    -1  -45.00
2   NaN       0
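
Putting this together with the float conversion from the question, a minimal end-to-end sketch might look like the following. It assumes the same sample data as the question; the name df_clean is hypothetical, and the string '0.00' is used as the replacement value (rather than the integer 0 shown above) only to match what the question asked for.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question; column A is shorter, so row 2 is NaN.
df_raw = pd.DataFrame({
    'A': pd.Series(['1.00', '-1']),
    'B': pd.Series(['1.0', '-45.00', '-']),
})

# Without regex=True, only cells that are exactly '-' are replaced,
# so '-1' and '-45.00' are left untouched.
df_clean = df_raw.replace('-', '0.00').astype(np.float64)

print(df_clean)
```

Because the negative numbers are no longer altered, the conversion to float64 should go through without the ValueError, and the missing value in column A stays NaN.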