JavaScript regular expression to match a JSON string

Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/32155133/

Time: 2020-08-23 07:41:22  Source: igfitidea

Regex to match a JSON String

Tags: javascript, json, regex, unicode

Asked by Sietse

I am building a JSON validator from scratch, but I am stuck on the string part. I was hoping to build a regex that matches the following sequence from JSON.org:

[Figure: JSON.org string sequence diagram]

My regex so far is:

/^\"((?=\\)\\(\"|\/|\\|b|f|n|r|t|u[0-9a-f]{4}))*\"$/

It matches a backslash followed by one of the escape characters, and it matches an empty string. But I'm not sure how to handle the Unicode part.

Is there a regex that matches any Unicode character except " or \ or a control character? And will it match a newline or a horizontal tab?

The last question arises because the regex matches the string "\t" but not "    " (shown here as four spaces, but intended to be a literal tab character). If it does not cover that case, I will need to extend the regex, which is not a problem, but my guess is that the horizontal tab is a Unicode character.

Thanks to Jaeger Kor, I now have the following regex:

/^\"((?=\\)\\(\"|\/|\\|b|f|n|r|t|u[0-9a-f]{4})|[^\\"]*)*\"$/

It appears to be correct, but is there a way to check for control characters, or is that unnecessary because they are listed under the non-printable characters on regular-expressions.info? The input to validate is always text from a textarea.

Update: the regex is as follows, in case anyone needs it:

/^("(((?=\\)\\(["\\\/bfnrt]|u[0-9a-fA-F]{4}))|[^"\\\0-\x1F\x7F]+)*")$/
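As a rough illustration of how the updated pattern behaves, here is a minimal sketch in JavaScript. It assumes the pattern is meant to read with doubled backslashes and `\0-\x1F\x7F` in the final character class; the helper name `isJsonStringLiteral` is hypothetical and not part of the original question.

```javascript
// Sketch: the updated pattern as a JavaScript regex literal.
// Assumption: the doubled backslashes and the \0 in the last class are as intended.
const JSON_STRING = /^("(((?=\\)\\(["\\\/bfnrt]|u[0-9a-fA-F]{4}))|[^"\\\0-\x1F\x7F]+)*")$/;

// Hypothetical helper name, used only for this illustration.
function isJsonStringLiteral(text) {
  return JSON_STRING.test(text);
}

console.log(isJsonStringLiteral('""'));        // true: empty string literal
console.log(isJsonStringLiteral('"a\\tb"'));   // true: escaped tab (backslash + t)
console.log(isJsonStringLiteral('"a\tb"'));    // false: raw tab is a control character
console.log(isJsonStringLiteral('"\\u00e9"')); // true: \u followed by four hex digits
console.log(isJsonStringLiteral('"\\x"'));     // false: \x is not a JSON escape
```

The nested quantifier `(...+)*` can backtrack heavily on long inputs that fail to match, so this sketch is best suited to short strings. Also note that the pattern rejects U+007F, which the JSON grammar itself does not require to be escaped.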

Accepted answer by Jaeger Kor

For your exact question, create a character class:

# Matches any character that isn't a \ or "
/[^\\"]/

Then you can add * on the end to match zero or more of them, or alternatively + to match one or more:

/[^\\"]*/

or

/[^\\"]+/
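To make the difference between the quantifiers concrete, here is a small sketch in JavaScript. It assumes the intended class is `[^\\"]` (anything except a backslash or a double quote); the `^` and `$` anchors and the variable names are additions for this illustration only.

```javascript
// Assumed class: any character except backslash and double quote.
// The anchors force the whole input to match, which makes the results easy to read.
const zeroOrMore = /^[^\\"]*$/;
const oneOrMore = /^[^\\"]+$/;

console.log(zeroOrMore.test(""));      // true: * accepts the empty string
console.log(oneOrMore.test(""));       // false: + needs at least one character
console.log(oneOrMore.test("abc"));    // true
console.log(oneOrMore.test('a"b'));    // false: contains a double quote
console.log(oneOrMore.test("a\\b"));   // false: contains a backslash
console.log(oneOrMore.test("a\tb\n")); // true: tab and newline are not excluded
```

The last line shows that this class on its own still accepts a raw tab and a newline, so control characters have to be excluded explicitly if they should be rejected.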

There is also the regex below, found at https://regex101.com/ under the library tab when searching for json:

/(?(DEFINE)
# Note that everything is atomic, JSON does not need backtracking if it's valid
# and this prevents catastrophic backtracking
(?<json>(?>\s*(?&object)\s*|\s*(?&array)\s*))
(?<object>(?>\{\s*(?>(?&pair)(?>\s*,\s*(?&pair))*)?\s*\}))
(?<pair>(?>(?&STRING)\s*:\s*(?&value)))
(?<array>(?>\[\s*(?>(?&value)(?>\s*,\s*(?&value))*)?\s*\]))
(?<value>(?>true|false|null|(?&STRING)|(?&NUMBER)|(?&object)|(?&array)))
(?<STRING>(?>"(?>\\(?>["\\\/bfnrt]|u[a-fA-F0-9]{4})|[^"\\\0-\x1F\x7F]+)*"))
(?<NUMBER>(?>-?(?>0|[1-9][0-9]*)(?>\.[0-9]+)?(?>[eE][+-]?[0-9]+)?))
)
\A(?&json)\z/x

This should match any valid JSON; you can also test it at the website above.
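The full-document pattern relies on PCRE features (`(?(DEFINE)...)`, `(?&name)` subroutine calls, and atomic groups) that JavaScript's built-in RegExp does not support, so it cannot be pasted into a JavaScript regex literal. One possible way to cross-check a hand-written validator in JavaScript is to compare its verdict against `JSON.parse`. The sketch below is only that, a cross-check; the function name `parsesAsJson` is hypothetical.

```javascript
// Sketch: use the built-in parser as a reference verdict for test inputs.
// parsesAsJson is a hypothetical helper name used only for this illustration.
function parsesAsJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    // JSON.parse throws a SyntaxError on invalid input.
    return false;
  }
}

console.log(parsesAsJson('{"a":[1,2,{"b":null}]}')); // true
console.log(parsesAsJson('{"a":1,}'));               // false: trailing comma
console.log(parsesAsJson('"a\\tb"'));                // true: escaped tab
console.log(parsesAsJson('"a\tb"'));                 // false: raw tab inside a string
```

One difference to keep in mind: `JSON.parse` accepts a bare string or number at the top level, whereas the `json` rule in the pattern above only accepts an object or an array.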

EDIT:

Link to the regex