Regular expression for all printable characters in JavaScript

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/12052825/

Date: 2020-10-26 15:11:12  Source: igfitidea

Regular expression for all printable characters in JavaScript

Tags: regex, javascript-events, javascript

Asked by AurA

I'm looking for a regular expression that validates all printable characters. The regex needs to be used in JavaScript only. I have gone through this post, but it mostly talks about .NET, Java and C, not JavaScript.

You have to allow only these printable characters:

a-z, A-Z, 0-9, and the thirty-two symbols: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ and space

I need a JavaScript regex that validates that every input character is one of the above and rejects the rest.

Answered by Tim Pietzcker

If you want to match all printable characters in the UTF-8 set (as indicated by your comment on Aug 21), you're going to have a hard time doing this yourself. JavaScript's native regexes have abysmal Unicode support. But you can use XRegExp with the regex ^\P{C}*$.
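
Since this answer was written, ES2018 added Unicode property escapes to native regexes, so in engines that support them the same pattern can probably be used without a library. The following is a minimal sketch under that assumption; the variable name printableUnicode is only illustrative, and the `u` flag is required for \P{C} to be understood.

```javascript
// Assumes an ES2018+ engine (Unicode property escapes need the `u` flag).
// \P{C} matches any code point that is NOT in the Unicode "Other" category
// (control, format, surrogate, private-use, unassigned).
const printableUnicode = /^\P{C}*$/u;

console.log(printableUnicode.test("demásiado tarde")); // true
console.log(printableUnicode.test("line\nbreak"));     // false: \n is a control char (Cc)
console.log(printableUnicode.test("a\u200Cb"));        // false: U+200C is a format char (Cf)
```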

If you only want to match those few ASCII letters you mentioned in the edit to your post from Aug 22, then the regex is trivial:

/^[a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]*$/i
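
As a rough illustration of how this pattern could be used for both parts of the question (validate, then discard the rest), here is a small sketch. The names allowed, disallowed and stripDisallowed are hypothetical, and the negated class is simply the same character list with ^ in front. Note that, as written, the class does not list a backslash, so input containing one is rejected.

```javascript
// The pattern from the answer above, unchanged.
const allowed = /^[a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]*$/i;

// Hypothetical helper: the same character list, negated, to drop everything else.
const disallowed = /[^a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]/gi;

function stripDisallowed(input) {
  return input.replace(disallowed, "");
}

console.log(allowed.test("Hello, World! [42]")); // true
console.log(allowed.test("naïve"));              // false: ï is not in the list
console.log(allowed.test("C:\\temp"));           // false: the class has no backslash
console.log(stripDisallowed("naïve\tcafé"));     // "navecaf"
```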

Answered by Ωmega

For non-Unicode input, use the regex pattern ^[^\x00-\x1F\x80-\x9F]+$
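
A quick way to see what this pattern accepts is to run it against a few sample strings. This is only a sketch; the variable name nonControl is made up for the example. Two details worth noticing: the + quantifier rejects the empty string, and the excluded ranges do not cover U+007F (DEL), so that character passes.

```javascript
// The pattern from above: anything except the C0 (0x00-0x1F) and C1 (0x80-0x9F) control ranges.
const nonControl = /^[^\x00-\x1F\x80-\x9F]+$/;

console.log(nonControl.test("plain text")); // true
console.log(nonControl.test("tab\there"));  // false: \t is 0x09
console.log(nonControl.test(""));           // false: + needs at least one character
console.log(nonControl.test("\x7F"));       // true: DEL is outside the excluded ranges
```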

If you want to work with Unicode, first read "JavaScript + Unicode regexes".

I would then suggest using the regex pattern ^[^\p{Cc}\p{Cf}\p{Zl}\p{Zp}]*$

  • \p{Cc} or \p{Control}: an ASCII 0x00..0x1F or Latin-1 0x80..0x9F control character.
  • \p{Cf} or \p{Format}: invisible formatting indicator.
  • \p{Zl} or \p{Line_Separator}: line separator character U+2028.
  • \p{Zp} or \p{Paragraph_Separator}: paragraph separator character U+2029.

For more information see http://www.regular-expressions.info/unicode.html
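
In engines that implement ES2018 Unicode property escapes, the category-based pattern above can be tried directly by adding the `u` flag. The sketch below assumes such an engine; pattern and samples are illustrative names, and each sample contains one character from a category listed above.

```javascript
// Assumes ES2018+ property escapes; without the `u` flag \p{...} is not recognised.
const pattern = /^[^\p{Cc}\p{Cf}\p{Zl}\p{Zp}]*$/u;

const samples = {
  "plain text": "héllo wörld",
  "tab (Cc)": "a\tb",
  "soft hyphen (Cf)": "a\u00ADb",
  "line separator (Zl)": "a\u2028b",
  "paragraph separator (Zp)": "a\u2029b",
};

for (const [label, text] of Object.entries(samples)) {
  console.log(label + ": " + pattern.test(text));
}
// Only "plain text" yields true; every other sample yields false.
```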

Answered by RevelationX

Looks like JavaScript has changed to some degree since this question was posted?

I'm using this one:

var regex = /^[\u0020-\u007e\u00a0-\u00ff]*$/;
console.log(regex.test("!\"#$%&'()*+,-./:;<=>?@[] ^_`{|}~")); // should output "true"
console.log(regex.test("Iñtërnâtiônàlizætiøn")); // should output "true"
console.log(regex.test("\u2713")); // should output "false": U+2713 (✓) is outside both ranges

Answered by Wiktor Stribiżew

To validate that a string consists only of printable ASCII characters, use a simple regex like

/^[ -~]+$/

It matches

  • ^ - the start-of-string anchor
  • [ -~]+ - one or more (due to the + quantifier) characters within the range from space to tilde in the ASCII table
  • $ - the end-of-string anchor

For Unicode printable chars, use the \PC Unicode category (matching any char except those in the "Other" category, such as control chars) from XRegExp, as has already been mentioned:

^\PC+$

See regex demos:

// ASCII only
var ascii_print_rx = /^[ -~]+$/;
console.log(ascii_print_rx.test("It's all right.")); // true
console.log(ascii_print_rx.test('\f ')); // false, \f is an ASCII form feed char
console.log(ascii_print_rx.test("demásiado tarde")); // false, no Unicode printable char support
// Unicode support (XRegExp is loaded by the script tag below)
// The backslash is doubled because the pattern is passed as a string literal
console.log(XRegExp.test('demásiado tarde', XRegExp("^\\PC+$"))); // true
console.log(XRegExp.test('\u200C ', XRegExp("^\\PC+$"))); // false, \u200C is a Unicode zero-width non-joiner
console.log(XRegExp.test('\f ', XRegExp("^\\PC+$"))); // false, \f is an ASCII form feed char
<script src="http://cdnjs.cloudflare.com/ajax/libs/xregexp/3.1.1/xregexp-all.min.js"></script>