Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/45871731/

Date: 2020-09-14 04:19:02  Source: igfitidea

Removing special characters in a pandas dataframe

Tags: python, python-3.x, pandas, dataframe, jupyter-notebook

Asked by SKlein

I have found information on how this could be done, but nothing has worked for me. I am trying to replace the special character 'e'. I imported my data from a csv file and I used encoding='latin1' or else I kept getting errors. However, a simple DF['Column'].str.replace('e', '') will not do the trick. I also tried decoding and using the hex value for that character which was recommended on another post, but that still won't work for me. Help is very much appreciated, and I am willing to post code if necessary.

Answered by cs95

Call str.encode followed by str.decode:

df.YourCol.str.encode('utf-8').str.decode('ascii', 'ignore')
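
To see what that chain does, here is a minimal sketch on a made-up column. The sample strings and the keyword form errors="ignore" are my own choices for illustration, not part of the original answer. Each value is encoded to UTF-8 bytes, and decoding those bytes as ASCII with errors ignored silently drops every non-ASCII character.

```python
import pandas as pd

# Hypothetical sample data; "YourCol" is the placeholder name from the answer.
df = pd.DataFrame({"YourCol": ["café", "naïve", "plain"]})

# Encode each string to UTF-8 bytes, then decode as ASCII,
# ignoring (dropping) any byte that is not valid ASCII.
cleaned = df["YourCol"].str.encode("utf-8").str.decode("ascii", errors="ignore")

# The chain returns a new Series, so assign it back to keep the result.
df["YourCol"] = cleaned
print(df)
```

Note that this removes accented characters outright rather than replacing them with an unaccented equivalent, so "café" becomes "caf".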

If you want to do this for multiple columns, you can slice and call df.applymap:

df[col_list].applymap(lambda x: x.encode('utf-8').decode('ascii', 'ignore'))

Remember that these operations are not in-place. So, you'll have to assign those columns back to their rightful place.
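
The assignment step might look like the sketch below. The column names and data are invented for illustration. Instead of applymap, it applies the same str.encode / str.decode chain column by column through DataFrame.apply; this is a swapped-in variant, chosen because applymap is deprecated in recent pandas releases (in favor of DataFrame.map) and because the Series.str methods pass missing values through instead of raising.

```python
import pandas as pd

# Hypothetical data: two text columns to clean and one numeric column to leave alone.
df = pd.DataFrame({
    "name": ["José", "Zoë"],
    "city": ["Málaga", "Zürich"],
    "age": [30, 25],
})
col_list = ["name", "city"]


def strip_non_ascii(col):
    """Drop non-ASCII characters from every string in a Series."""
    return col.str.encode("utf-8").str.decode("ascii", errors="ignore")


# apply() returns a new DataFrame; assigning it back to the same
# column slice is what actually updates df.
df[col_list] = df[col_list].apply(strip_non_ascii)
print(df)
```

Columns outside col_list, such as the numeric one here, are left untouched.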
