Note: this page is a bilingual (Chinese/English) translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original: StackOverflow, http://stackoverflow.com/questions/25698710/

Date: 2020-08-18 23:30:52  Source: igfitidea

Replace all occurrences of a string in a pandas dataframe (Python)

Tags: python, replace, pandas, dataframe

Asked by nauti

I have a pandas dataframe with about 20 columns.

It is possible to replace all occurrences of a string (here a newline) by manually writing all column names:

df['columnname1'] = df['columnname1'].str.replace("\n","<br>")
df['columnname2'] = df['columnname2'].str.replace("\n","<br>")
df['columnname3'] = df['columnname3'].str.replace("\n","<br>")
...
df['columnname20'] = df['columnname20'].str.replace("\n","<br>")

This unfortunately does not work:

df = df.replace("\n","<br>")

Is there any other, more elegant solution?

Accepted answer by Alex Riley

You can use replace and pass the strings to find/replace as dictionary keys/values:

df.replace({'\n': '<br>'}, regex=True)

For example:

>>> df = pd.DataFrame({'a': ['1\n', '2\n', '3'], 'b': ['4\n', '5', '6\n']})
>>> df
   a    b
0  1\n  4\n
1  2\n  5
2  3    6\n

>>> df.replace({'\n': '<br>'}, regex=True)
   a      b
0  1<br>  4<br>
1  2<br>  5
2  3      6<br>
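
To see why regex=True matters here, the following minimal sketch compares the two calls on a small made-up frame (the data and the variable names whole and inside are illustrative, not from the answer). Without regex=True, replace only matches a cell whose entire value equals the search string; with it, the pattern is applied inside each string.

```python
import pandas as pd

# Hypothetical sample data; the second cell of 'a' is exactly a newline.
df = pd.DataFrame({'a': ['1\n', '\n', '3'], 'b': ['4\n', '5', '6\n']})

# Without regex=True, only cells that are exactly '\n' are replaced.
whole = df.replace('\n', '<br>')

# With regex=True, every occurrence inside a string is replaced.
inside = df.replace({'\n': '<br>'}, regex=True)

print(whole['a'].tolist())
print(inside['a'].tolist())
```

Note that replace returns a new frame by default, so the result has to be assigned back (df = df.replace(...)) for the change to persist.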

Answer by Yichuan Wang

It seems pandas has changed its API to avoid ambiguity when handling regex. Now you should use:

df.replace({'\n': '<br>'}, regex=True)

For example:

>>> df = pd.DataFrame({'a': ['1\n', '2\n', '3'], 'b': ['4\n', '5', '6\n']})
>>> df
   a    b
0  1\n  4\n
1  2\n  5
2  3    6\n

>>> df.replace({'\n': '<br>'}, regex=True)
   a      b
0  1<br>  4<br>
1  2<br>  5
2  3      6<br>
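
Because regex=True treats the dictionary keys as regular expressions, a search string containing metacharacters such as . or ( needs escaping if it is meant literally. A minimal sketch, assuming the standard library's re.escape is acceptable for this; the sample data and the names unescaped and escaped are hypothetical:

```python
import re

import pandas as pd

# Hypothetical sample data containing literal dots.
df = pd.DataFrame({'a': ['1.5', '2x5'], 'b': ['a.b', 'c']})

# Unescaped, '.' matches any character, so every character is replaced.
unescaped = df.replace({'.': ','}, regex=True)

# re.escape turns the key into a pattern that matches a literal dot only.
escaped = df.replace({re.escape('.'): ','}, regex=True)

print(unescaped['a'].tolist())
print(escaped['a'].tolist())
```

A newline has no special meaning in a regular expression, which is why '\n' works unescaped in the answer above.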

Answer by Jasper Kinoti

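This removes all newlines and other whitespace from a column (each cell is first converted to str). To substitute a replacement character instead, edit the string before .join, for example ' '.join to collapse each run of whitespace into a single space: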

df['columnname'] = [''.join(c.split()) for c in df['columnname'].astype(str)]
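
The effect of the join string can be checked on a small hypothetical column. As an assumption going beyond the answer, the last statement sketches a vectorized equivalent of the ' '.join variant using Series.str.replace with a regular expression; the names no_ws, one_space and vectorized are illustrative only.

```python
import pandas as pd

# Hypothetical sample column with mixed spaces and newlines.
df = pd.DataFrame({'columnname': ['a \n b', ' c\n\nd ', 'e']})

# ''.join drops every whitespace character, including newlines.
no_ws = [''.join(c.split()) for c in df['columnname'].astype(str)]

# ' '.join collapses each run of whitespace to one space and trims the ends.
one_space = [' '.join(c.split()) for c in df['columnname'].astype(str)]

# Vectorized alternative (not from the answer): regex collapse, then strip.
vectorized = (
    df['columnname'].astype(str)
    .str.replace(r'\s+', ' ', regex=True)
    .str.strip()
    .tolist()
)

print(no_ws)
print(one_space)
print(vectorized)
```

Unlike df.replace, this approach works on one column at a time, so it would have to be repeated or looped for each string column.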