.NET regex for validating URIs
Disclaimer: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/30847/
Regex to validate URIs
Asked by alumb
How do you produce a regex that matches only valid URIs? The description of URIs can be found here: http://en.wikipedia.org/wiki/URI_scheme. It doesn't need to extract any parts, just test whether a URI is valid.
(preferred format is .Net RegularExpression) (.Net Version 1.1)
- Doesn't need to check for a known protocol, just a valid one.
Current Solution:
^([a-zA-Z0-9+.-]+):(//([a-zA-Z0-9-._~!$&'()*+,;=:]*)@)?([a-zA-Z0-9-._~!$&'()*+,;=]+)(:(\d*))?(/?[a-zA-Z0-9-._~!$&'()*+,;=:/]+)?(\?[a-zA-Z0-9-._~!$&'()*+,;=:/?@]+)?(#[a-zA-Z0-9-._~!$&'()*+,;=:/?@]+)?$
Accepted answer by Daren Thomas
This site looks promising: http://snipplr.com/view/6889/regular-expressions-for-uri-validationparsing/
They propose the following regex:
/^([a-z0-9+.-]+):(?://(?:((?:[a-z0-9-._~!$&'()*+,;=:]|%[0-9A-F]{2})*)@)?((?:[a-z0-9-._~!$&'()*+,;=]|%[0-9A-F]{2})*)(?::(\d*))?(/(?:[a-z0-9-._~!$&'()*+,;=:@/]|%[0-9A-F]{2})*)?|(/?(?:[a-z0-9-._~!$&'()*+,;=:@]|%[0-9A-F]{2})+(?:[a-z0-9-._~!$&'()*+,;=:@/]|%[0-9A-F]{2})*)?)(?:\?((?:[a-z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*))?(?:#((?:[a-z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*))?$/i
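As a rough illustration of how the proposed regex could be used for a plain yes/no check, here is a minimal Python sketch. The language choice and the names `URI_PATTERN` and `is_valid_uri` are assumptions made for illustration, not part of the answer. The leading `/` and trailing `/i` of the original are regex-literal delimiters plus a case-insensitivity flag, written here as `re.IGNORECASE`.

```python
import re

# The regex from the answer, split across lines for readability only.
# URI_PATTERN and is_valid_uri are hypothetical names for this sketch.
URI_PATTERN = re.compile(
    r"^([a-z0-9+.-]+):"                                            # scheme ":"
    r"(?://"
    r"(?:((?:[a-z0-9-._~!$&'()*+,;=:]|%[0-9A-F]{2})*)@)?"          # userinfo "@"
    r"((?:[a-z0-9-._~!$&'()*+,;=]|%[0-9A-F]{2})*)"                 # host
    r"(?::(\d*))?"                                                 # ":" port
    r"(/(?:[a-z0-9-._~!$&'()*+,;=:@/]|%[0-9A-F]{2})*)?"            # path after authority
    r"|"
    r"(/?(?:[a-z0-9-._~!$&'()*+,;=:@]|%[0-9A-F]{2})+"              # path without authority
    r"(?:[a-z0-9-._~!$&'()*+,;=:@/]|%[0-9A-F]{2})*)?"
    r")"
    r"(?:\?((?:[a-z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*))?"      # "?" query
    r"(?:#((?:[a-z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*))?$",     # "#" fragment
    re.IGNORECASE,
)


def is_valid_uri(text):
    """Return True if the whole string matches the pattern above."""
    return URI_PATTERN.fullmatch(text) is not None


print(is_valid_uri("http://example.com/path?x=1#frag"))  # True
print(is_valid_uri("urn:isbn:0451450523"))               # True
print(is_valid_uri("http://exa mple.com"))               # False (space is not allowed)
```

The check is purely syntactic: any scheme built from the allowed characters is accepted, which fits the question's requirement of testing for a valid protocol rather than a known one.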
Answer by bdukes
Does Uri.IsWellFormedUriString work for you?
Answer by jcsahnwaldt says GoFundMonica
The URI specification says:
The following line is the regular expression for breaking-down a well-formed URI reference into its components.
^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
(I guess that's the same regex as in the STD66 link given in another answer.)
But breaking down is not validating. To correctly validate a URI, one would have to translate the BNF for URIs to a regex. While some BNFs cannot be expressed as regular expressions, I think with this one it could be done. But it shouldn't be done: it would be a huge mess. It's better to use a library function.
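To make the difference between breaking down and validating concrete, the following Python sketch applies the specification's regex to one URI and to one string that is clearly not a URI. The language and the helper name `split_uri_reference` are assumptions for illustration. Groups 2, 4, 5, 7 and 9 hold the scheme, authority, path, query and fragment, as described in RFC 3986 Appendix B.

```python
import re

# The component-splitting regex quoted from the URI specification.
URI_SPLIT = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri_reference(text):
    """Return (scheme, authority, path, query, fragment); absent parts are None.

    Hypothetical helper for this sketch. Every part of the regex is
    optional, so the match itself always succeeds.
    """
    match = URI_SPLIT.match(text)
    return match.group(2, 4, 5, 7, 9)


print(split_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related"))
# ('http', 'www.ics.uci.edu', '/pub/ietf/uri/', None, 'Related')

print(split_uri_reference("^^ not a URI ^^"))
# (None, None, '^^ not a URI ^^', None, None)
```

The second call still matches and simply reports the whole input as the path, which is why this regex cannot serve as a validity test on its own.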
Answer by papercowboy
The best and most definitive guide to this I have found is here: http://jmrware.com/articles/2009/uri_regexp/URI_regex.html (In answer to your question, see the URI table entry.)
All of these rules from RFC3986 are reproduced in Table 2 along with a regular expression implementation for each rule.
A JavaScript implementation of this is available here: https://github.com/jhermsmeier/uri.regex
For reference, the URI regex is repeated below:
# RFC-3986 URI component: URI
[A-Za-z][A-Za-z0-9+\-.]* : # scheme ":"
(?: // # hier-part
(?: (?:[A-Za-z0-9\-._~!$&'()*+,;=:]|%[0-9A-Fa-f]{2})* @)?
(?:
\[
(?:
(?:
(?: (?:[0-9A-Fa-f]{1,4}:) {6}
| :: (?:[0-9A-Fa-f]{1,4}:) {5}
| (?: [0-9A-Fa-f]{1,4})? :: (?:[0-9A-Fa-f]{1,4}:) {4}
| (?: (?:[0-9A-Fa-f]{1,4}:){0,1} [0-9A-Fa-f]{1,4})? :: (?:[0-9A-Fa-f]{1,4}:) {3}
| (?: (?:[0-9A-Fa-f]{1,4}:){0,2} [0-9A-Fa-f]{1,4})? :: (?:[0-9A-Fa-f]{1,4}:) {2}
| (?: (?:[0-9A-Fa-f]{1,4}:){0,3} [0-9A-Fa-f]{1,4})? :: [0-9A-Fa-f]{1,4}:
| (?: (?:[0-9A-Fa-f]{1,4}:){0,4} [0-9A-Fa-f]{1,4})? ::
) (?:
[0-9A-Fa-f]{1,4} : [0-9A-Fa-f]{1,4}
| (?: (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) \.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
)
| (?: (?:[0-9A-Fa-f]{1,4}:){0,5} [0-9A-Fa-f]{1,4})? :: [0-9A-Fa-f]{1,4}
| (?: (?:[0-9A-Fa-f]{1,4}:){0,6} [0-9A-Fa-f]{1,4})? ::
)
| [Vv][0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+
)
\]
| (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
| (?:[A-Za-z0-9\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})*
)
(?: : [0-9]* )?
(?:/ (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})* )*
| /
(?: (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})+
(?:/ (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})* )*
)?
| (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})+
(?:/ (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})* )*
|
)
(?:\? (?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})* )? # [ "?" query ]
(?:\# (?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})* )? # [ "#" fragment ]
Answer by Lostfields
The best regex I came up with according to RFC 3986 (https://tools.ietf.org/html/rfc3986) was the following:
// named groups
/^(?<scheme>[a-z][a-z0-9+.-]+):(?<authority>\/\/(?<user>[^@]+@)?(?<host>[a-z0-9.\-_~]+)(?<port>:\d+)?)?(?<path>(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])+(?:\/(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])*)*|(?:\/(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])+)*)?(?<query>\?(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@]|[/?])+)?(?<fragment>\#(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@]|[/?])+)?$/i
// unnamed groups
/^([a-z][a-z0-9+.-]+):(\/\/([^@]+@)?([a-z0-9.\-_~]+)(:\d+)?)?((?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])+(?:\/(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])*)*|(?:\/(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])+)*)?(\?(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@]|[/?])+)?(\#(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@]|[/?])+)?$/i
capture groups
- scheme
- authority
- userinfo
- host
- port
- path
- query
- fragment
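The named-group version can be tried out with a short Python sketch. This is an illustration under assumptions: the language is a choice made here, `LOSTFIELDS_PATTERN` and `split_uri` are hypothetical names, and `(?<name>...)` has been rewritten as `(?P<name>...)` because Python's `re` module uses that spelling for named groups. The pattern is otherwise unchanged, including its group name `user` for the userinfo part.

```python
import re

# Named-group regex from the answer; only the group syntax was adapted
# from (?<name>...) to Python's (?P<name>...).
LOSTFIELDS_PATTERN = re.compile(
    r"^(?P<scheme>[a-z][a-z0-9+.-]+):"
    r"(?P<authority>\/\/(?P<user>[^@]+@)?(?P<host>[a-z0-9.\-_~]+)(?P<port>:\d+)?)?"
    r"(?P<path>(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])+"
    r"(?:\/(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])*)*"
    r"|(?:\/(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@])+)*)?"
    r"(?P<query>\?(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@]|[/?])+)?"
    r"(?P<fragment>\#(?:[a-z0-9-._~]|%[a-f0-9]|[!$&'()*+,;=:@]|[/?])+)?$",
    re.IGNORECASE,
)


def split_uri(text):
    """Return a dict of the named groups, or None if the text does not match."""
    match = LOSTFIELDS_PATTERN.match(text)
    return match.groupdict() if match else None


parts = split_uri("https://user@example.com:8080/a/b?x=1#top")
print(parts["scheme"], parts["host"], parts["port"], parts["path"])
# https example.com :8080 /a/b
```

As the output shows, the groups keep their delimiters (`user@`, `:8080`, `?x=1`, `#top`), so a caller that wants bare values would need to strip them.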
Answer by Mark Biek
Are there some specific URIs you care about or are you trying to find a single regex that validates STD66?
I was going to point you to this regex for parsing a URI. You could then, in theory, check to see if all of the elements you care about are there.
But I think bdukes's answer is better.


