Python: Replace all non-alphanumeric characters in a string

Notice: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/12985456/

Date: 2020-08-18 12:19:13  Source: igfitidea

Replace all non-alphanumeric characters in a string

python

Asked by tchadwik

I have a string in which I want to replace every character that isn't a standard letter or digit (a-z or 0-9) with an asterisk. For example, "h^&ell`.,|o w]{+orld" should become "h*ell*o*w*orld". Note that a run of several such characters, like "^&", is replaced with a single asterisk. How would I go about doing this?

Accepted answer by nneonneo

Regex to the rescue!

import re

s = re.sub('[^0-9a-zA-Z]+', '*', s)

Example:

>>> re.sub('[^0-9a-zA-Z]+', '*', 'h^&ell`.,|o w]{+orld')
'h*ell*o*w*orld'
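
The same substitution can be wrapped in a reusable function. This is only a minimal sketch: the names `mask_non_alnum` and `placeholder` are hypothetical and not part of the original answer. Note that the character class is ASCII-only, so accented letters such as "é" are replaced as well.

```python
import re

# Precompiled version of the pattern from the answer above:
# one or more characters that are not ASCII letters or digits.
_NON_ALNUM = re.compile(r'[^0-9a-zA-Z]+')

def mask_non_alnum(text, placeholder='*'):
    """Replace each run of non-alphanumeric characters with one placeholder.

    Hypothetical helper for illustration only.
    """
    # A function is passed as the replacement so that a placeholder
    # containing backslashes is inserted literally.
    return _NON_ALNUM.sub(lambda match: placeholder, text)

print(mask_non_alnum('h^&ell`.,|o w]{+orld'))  # h*ell*o*w*orld
print(mask_non_alnum('a  b', '-'))             # a-b
```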

Answer by baloan

The pythonic way.

print("".join(c if c.isalnum() else "*" for c in s))

This doesn't collapse runs of consecutive non-matching characters though, i.e.

"h^&i" => "h**i", not "h*i" as in the regex solutions.

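If a regex-free approach is still preferred, runs can be collapsed with `itertools.groupby`. This is a sketch under assumptions, not the answer author's method: the name `mask_runs` is hypothetical, and `str.isalnum` is Unicode-aware, so letters such as "é" count as alphanumeric here (unlike the `[^0-9a-zA-Z]` pattern).

```python
from itertools import groupby

def mask_runs(text, placeholder='*'):
    """Replace each run of non-alphanumeric characters with one placeholder.

    Hypothetical helper for illustration only.
    """
    pieces = []
    # groupby yields runs of consecutive characters that share the
    # same str.isalnum() result (True or False).
    for is_alnum, run in groupby(text, key=str.isalnum):
        pieces.append(''.join(run) if is_alnum else placeholder)
    return ''.join(pieces)

print(mask_runs('h^&i'))                  # h*i
print(mask_runs('h^&ell`.,|o w]{+orld'))  # h*ell*o*w*orld
```
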
Answer by Don

Try (in Python 2, where filter() on a str returns a str):

s = filter(str.isalnum, s)

In Python 3:

s = ''.join(filter(str.isalnum, s))

Edit: I realized that the OP wants to replace non-alphanumeric characters with '*' rather than remove them, so my answer does not fit.

Answer by psun

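Use \W, which is equivalent to [^a-zA-Z0-9_] for ASCII matching (on Python 3 str patterns, \W is Unicode-aware unless the re.ASCII flag is set). Check the documentation: https://docs.python.org/2/library/re.html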

import re
s = 'h^&ell`.,|o w]{+orld'
replaced_string = re.sub(r'\W+', '*', s)
# replaced_string is now 'h*ell*o*w*orld'

Update: This solution also leaves underscores untouched, because \W does not match "_". If you want only letters and digits to be kept, the solution by nneonneo is more appropriate.
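
The difference between the two patterns can be seen on a string containing an underscore and a non-ASCII letter. This is an illustrative sketch for Python 3; the sample string is made up for the comparison and does not come from the original question.

```python
import re

# Sample input (an assumption for illustration); \u00e9 is the letter "é".
s = 'snake_case & caf\u00e9'

# Explicit ASCII class: the underscore and "é" are both replaced.
print(re.sub(r'[^0-9a-zA-Z]+', '*', s))        # snake*case*caf*

# \W on a Python 3 str is Unicode-aware: "_" and "é" are both kept.
print(re.sub(r'\W+', '*', s))                  # snake_case*café

# \W with re.ASCII behaves like [^a-zA-Z0-9_]: "_" kept, "é" replaced.
print(re.sub(r'\W+', '*', s, flags=re.ASCII))  # snake_case*caf*
```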
