JavaScript: How to remove everything except letters, numbers, spaces, exclamation marks, and question marks from a string?

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/12343451/

Time: 2020-08-23 07:45:43  Source: igfitidea

How to remove everything but letters, numbers, space, exclamation and question mark from string?

Tags: javascript, replace, utf-8

Asked by Tomasz Smykowski

How can I remove everything from a string except letters, numbers, spaces, exclamation marks, and question marks?

It's important that the method supports international languages (UTF-8).

Answered by sachleen

You can use a regular expression:

myString.replace(/[^\w\s!?]/g,'');

This will remove everything except word characters, whitespace, exclamation marks, and question marks.

Character class: \w stands for "word character", which in JavaScript is [A-Za-z0-9_]. Notice the inclusion of the underscore and digits.

\s stands for "whitespace character". It includes space, tab, carriage return, and newline ([ \t\r\n]), along with other whitespace such as form feed, vertical tab, and Unicode spaces.

If you don't want the underscore, you can use just [A-Za-z0-9].

myString.replace(/[^A-Za-z0-9\s!?]/g,'');
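
As a quick check of the difference between the two patterns, the sketch below runs both on a made-up sample string. The variable names and the sample text are illustrative and not part of the original answer.

```javascript
// Hypothetical sample input; any string works.
const sample = "Hello_World! Is it 100% ok?";

// \w keeps underscores along with letters and digits.
const withWordClass = sample.replace(/[^\w\s!?]/g, '');

// The explicit class removes underscores as well.
const withExplicitClass = sample.replace(/[^A-Za-z0-9\s!?]/g, '');

console.log(withWordClass);     // Hello_World! Is it 100 ok?
console.log(withExplicitClass); // HelloWorld! Is it 100 ok?
```

Note that replace returns a new string and leaves the original unchanged, so the result has to be assigned somewhere.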

For Unicode characters, you can add a range such as \u0000-\u0080 to the expression. Because the class is negated, characters within that range are excluded from removal, so you have to specify the ranges for the characters you don't want removed. You can see all the codes on Unicode Map. Just add the individual characters or ranges you want to keep.

For example:

myString.replace(/[^A-Za-z0-9\s!?\u0000-\u0080\u0082]/g,'');

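This will keep all the previously mentioned characters, plus the range \u0000-\u0080 and the character \u0082. It will remove \u0081. Note that \u0000-\u0080 spans all of ASCII, so this particular pattern also keeps ASCII punctuation; treat it as an illustration of the range syntax and substitute the ranges you actually need.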
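
Listing ranges by hand gets tedious once several scripts are involved. As an alternative that the original answer does not mention, engines supporting ES2018 Unicode property escapes can match letters and digits from any script with \p{L} and \p{N} under the u flag. The following is a minimal sketch under that assumption; the function name keepLettersDigits and the sample string are hypothetical.

```javascript
// Assumes an ES2018+ engine: Unicode property escapes require the u flag.
// keepLettersDigits is an illustrative name, not part of any library.
function keepLettersDigits(input) {
  // \p{L}: any letter, \p{N}: any number, \s: whitespace; ! and ? are kept literally.
  return input.replace(/[^\p{L}\p{N}\s!?]/gu, '');
}

console.log(keepLettersDigits("Привет, мир! 你好? #1"));
// Привет мир! 你好? 1
```

Text that uses combining marks (for example, decomposed accents) would lose those marks with this class; adding \p{M} inside the brackets is one way to keep them.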

Answered by Kelvin

Both answers posted so far left out the question mark. I would comment on them, but don't have enough rep yet.

David is correct: sachleen's regex will leave underscores behind. rcdmk's regex, modified as follows, will do the trick, although things might get a lot more complicated if you care about international characters.

var result = text.replace(/[^a-zA-Z0-9\s!?]+/g, '');

This will leave behind newlines and tabs as well as spaces. If you want to get rid of newlines and tabs too, change it to:

var result = text.replace(/[^a-zA-Z0-9 !?]+/g, '');
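
To make the difference between \s and a literal space concrete, here is a small sketch on a made-up string containing a newline and a tab. The sample text and variable names are illustrative only.

```javascript
// Hypothetical input with a newline, a tab, and one unwanted character (#).
const text = "Line one!\n\tLine #two?";

// \s in the class keeps tabs and newlines along with spaces.
const keepsWhitespace = text.replace(/[^a-zA-Z0-9\s!?]+/g, '');

// A literal space in the class keeps spaces only.
const keepsSpacesOnly = text.replace(/[^a-zA-Z0-9 !?]+/g, '');

// JSON.stringify makes the invisible characters visible in the output.
console.log(JSON.stringify(keepsWhitespace)); // "Line one!\n\tLine two?"
console.log(JSON.stringify(keepsSpacesOnly)); // "Line one!Line two?"
```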

Answered by Xeiad Ahmid Whd Amerr

text = "A(B){C};:a.b*!c??!1<>2@#3"
result = text.replace(/[^a-zA-Z0-9]/g, '')

This should return ABCabc123.

First, we define text as A B C a b c 1 2 3 with random characters mixed in, then set result to:

text.replace(...), where the parameters are:

/.../g: the regular expression. Inside the brackets, ^ means negation, so the pattern matches every character except:

a-z (lowercase letters), A-Z (uppercase letters), and 0-9 (digits)

g means global: remove all matches, not just the first one

The second parameter is the replacement string. We set it to an empty string so that only the specified characters are kept. If a single space " " is specified instead, each removed character becomes a space, so the result reads roughly as "A B C a b c 1 2 3", with extra spaces wherever several characters in a row were removed.
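
The effect of the g flag and of the replacement string can be checked with a short sketch on the same input. The third variant, which adds + to the pattern so each run of removed characters becomes a single space, is an addition for illustration and not part of the original answer.

```javascript
const text = "A(B){C};:a.b*!c??!1<>2@#3";

// Without g, only the first match, "(", is removed.
const firstOnly = text.replace(/[^a-zA-Z0-9]/, '');

// With g, every non-alphanumeric character is removed.
const all = text.replace(/[^a-zA-Z0-9]/g, '');

// With + and a space as replacement, each run collapses into one space.
const spaced = text.replace(/[^a-zA-Z0-9]+/g, ' ');

console.log(firstOnly); // AB){C};:a.b*!c??!1<>2@#3
console.log(all);       // ABCabc123
console.log(spaced);    // A B C a b c 1 2 3
```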

Answered by Ricardo Souza

You can try a regular expression like:

var cleaned = someString.replace(/[^a-zA-Z0-9! ]+/g, "");

Note that this pattern also removes question marks; add ? inside the brackets to keep them.