Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/8667070/

Time: 2020-08-24 06:58:45  Source: igfitidea

Javascript regular expression to validate URL

javascript html regex validation

Asked by Muhammad Imran Tariq

I am validating URLs with the following regular expression. I also want to validate google.com, but it returns false. What can be changed in the regular expression below so that it validates google.com?

console.log(learnRegExp('http://www.google-com.123')); // false
console.log(learnRegExp('https://www.google-com.com')); // true
console.log(learnRegExp('http://google-com.com')); // true
console.log(learnRegExp('http://google.com')); //true
console.log(learnRegExp('google.com')); //false

function learnRegExp(url) {
  // Use a named parameter; function.arguments is deprecated and unavailable in strict mode.
  return /^(ftp|https?):\/\/+(www\.)?[a-z0-9\-\.]{3,}\.[a-z]{3}$/.test(url);
}

Answered by Christian David

This validates URLs in general.

console.log('http://www.google-com.123.com', validateUrl('http://www.google-com.123.com')); // true 
console.log('http://www.google-com.123', validateUrl('http://www.google-com.123')); // false 
console.log('https://www.google-com.com', validateUrl('https://www.google-com.com')); // true 
console.log('http://google-com.com', validateUrl('http://google-com.com')); // true 
console.log('http://google.com', validateUrl('http://google.com')); //true 
console.log('google.com', validateUrl('google.com')); //false
console.log('http://www.gfh.', validateUrl('http://www.gfh.')); //false
console.log('http://www.gfh.c', validateUrl('http://www.gfh.c')); //false
console.log('http://www.gfh:800000', validateUrl('http://www.gfh:800000')); //false
console.log('www.google.com ', validateUrl('www.google.com ')); //false
console.log('http://google', validateUrl('http://google')); //false
console.log('//cdnblabla.cloudfront.net/css/app.css', validateUrl('//cdnblabla.cloudfront.net/css/app.css')); //true

function validateUrl(value) {
  return /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:[/?#]\S*)?$/i.test(value);
}

Should Match

["//www.google.com", "//cdnblabla.cloudfront.net/css/app.css", "http://✪df.ws/123", "http://userid:password@example.com:8080", "http://userid:password@example.com:8080/", "http://userid@example.com", "http://userid@example.com/", "http://userid@example.com:8080", "http://userid@example.com:8080/", "http://userid:password@example.com", "http://userid:password@example.com/", "http://142.42.1.1/", "http://142.42.1.1:8080/", "http://➡.ws/䨹", "http://⌘.ws", "http://⌘.ws/", "http://foo.com/blah_(wikipedia)#cite-1", "http://foo.com/blah_(wikipedia)_blah#cite-1", "http://foo.com/unicode_(✪)_in_parens", "http://foo.com/(something)?after=parens", "http://☺.damowmow.com/", "http://code.google.com/events/#&product=browser", "http://j.mp", "ftp://foo.bar/baz", "http://foo.bar/?q=Test%20URL-encoded%20stuff", "http://مثال.إختبار", "http://例子.测试"].map(function(url) {
  console.log(url, validateUrl(url));
});

Should Fail

["http://", "http://.", "http://..", "http://../", "http://?", "http://??", "http://??/", "http://#", "http://##", "http://##/", "http://foo.bar?q=Spaces should be encoded", "//", "//a", "///a", "///", "http:///a", "foo.com", "rdar://1234", "h://test", "http:// shouldfail.com", ":// should fail", "http://foo.bar/foo(bar)baz quux", "ftps://foo.bar/", "http://-error-.invalid/", "http://-a.b.co", "http://a.b-.co", "http://0.0.0.0", "http://10.1.1.0", "http://10.1.1.255", "http://224.1.1.1", "http://1.1.1.1.1", "http://123.123.123", "http://3628126748", "http://.www.foo.bar/", "http://www.foo.bar./", "http://.www.foo.bar./", "http://10.1.1.1", "http://10.1.1.254"].map(function(url) {
  console.log(url, validateUrl(url));
});

How it works

// protocol identifier
"(?:(?:(?:https?|ftp):)?//)"
// user:pass authentication
"(?:\S+(?::\S*)?@)?"
"(?:"
// IP address exclusion
// private & local networks
"(?!(?:10|127)(?:\.\d{1,3}){3})"
"(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})"
"(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})"
// IP address dotted notation octets
// excludes loopback network 0.0.0.0
// excludes reserved space >= 224.0.0.0
// excludes network & broadcast addresses
// (first & last IP address of each class)
"(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])"
"(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}"
"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))"
"|"
// host name
"(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)"
// domain name
"(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*"
// TLD identifier
"(?:\.(?:[a-z\u00a1-\uffff]{2,})))"
// port number
"(?::\d{2,5})?"
// resource path
"(?:[/?#]\S*)?"

All of this comes from this gist; I hope it covers all your needs.
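To make the breakdown above concrete, here is a minimal sketch that reassembles the commented pieces into a single pattern. The names `parts`, `urlPattern`, and `validateUrlFromParts` are illustrative assumptions and do not come from the gist; `String.raw` is used only so that each piece can be written with the same backslashes as in the breakdown.

```javascript
// Sketch only: the pieces are copied from the breakdown, the names are hypothetical.
const parts = [
  String.raw`^`,
  // protocol identifier (scheme is optional, "//" is required)
  String.raw`(?:(?:(?:https?|ftp):)?//)`,
  // user:pass authentication
  String.raw`(?:\S+(?::\S*)?@)?`,
  String.raw`(?:`,
  // IP address exclusion: private & local networks
  String.raw`(?!(?:10|127)(?:\.\d{1,3}){3})`,
  String.raw`(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})`,
  String.raw`(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})`,
  // IP address dotted notation octets
  String.raw`(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])`,
  String.raw`(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}`,
  String.raw`(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))`,
  String.raw`|`,
  // host name
  String.raw`(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)`,
  // domain name
  String.raw`(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*`,
  // TLD identifier
  String.raw`(?:\.(?:[a-z\u00a1-\uffff]{2,}))`,
  String.raw`)`,
  // port number
  String.raw`(?::\d{2,5})?`,
  // resource path
  String.raw`(?:[/?#]\S*)?`,
  String.raw`$`,
];

// The "i" flag mirrors the /i on the regex literal in validateUrl.
const urlPattern = new RegExp(parts.join(''), 'i');

function validateUrlFromParts(value) {
  return urlPattern.test(value);
}

console.log(validateUrlFromParts('http://google.com')); // true
console.log(validateUrlFromParts('google.com'));        // false (no "//" prefix)
console.log(validateUrlFromParts('http://10.1.1.1'));   // false (private network)
```

Keeping one piece per array entry is a design choice for readability: a single part (for example the port rule) can be changed without re-reading the whole one-line pattern.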

Answered by Rorshack

This is perfect for me. I hope it will be perfect for someone else! :)

/^((https?):\/\/)?([w|W]{3}\.)+[a-zA-Z0-9\-\.]{3,}\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$/

Answered by Acn

/(http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/

Answered by ArVan

Here you go: you need to make the "ftp/http(s)://" prefix optional. Use "?" for this.

function learnRegExp(url) {
  // Use a named parameter; function.arguments is deprecated and unavailable in strict mode.
  return /((ftp|https?):\/\/)?(www\.)?[a-z0-9\-\.]{3,}\.[a-z]{3}$/.test(url);
}

Answered by T I

The ^... symbol means "begins with", so the final log makes sense: the string does not begin with ftp or http(s). With ...$ you are also saying that the string must end with three letters, which again is where it fails (line 2): it does not end like this. Some minor adjustments and you should be there.
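As a rough illustration of the anchoring point, the sketch below compares a pattern with an optional scheme but no leading `^` against the same pattern with `^` restored. The names `unanchored`, `anchored`, and `checkUrl` are hypothetical, and the three-letter TLD rule is kept from the question rather than recommended.

```javascript
// Optional scheme, no leading ^: the match may start anywhere in the string.
const unanchored = /((ftp|https?):\/\/)?(www\.)?[a-z0-9\-\.]{3,}\.[a-z]{3}$/;

// Same pattern with ^ restored: the optional scheme must sit at the very start.
const anchored = /^((ftp|https?):\/\/)?(www\.)?[a-z0-9\-\.]{3,}\.[a-z]{3}$/;

// Hypothetical helper, only here to keep the calls short.
function checkUrl(pattern, url) {
  return pattern.test(url);
}

console.log(checkUrl(anchored, 'google.com'));                // true
console.log(checkUrl(anchored, 'http://google.com'));         // true
console.log(checkUrl(anchored, 'http://www.google-com.123')); // false (does not end in three letters)
console.log(checkUrl(unanchored, '!!! google.com'));          // true (leading junk is ignored)
console.log(checkUrl(anchored, '!!! google.com'));            // false
```

Making the scheme optional is what lets google.com pass; keeping both `^` and `$` is what stops arbitrary text before the host from passing as well.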

Answered by Luciano Graziani

Why not have multiple regex for each case?

  1. Valid alphanumeric url: /^https?:\/\/([\w\d\-]+\.)+\w{2,}(\/.+)?$/

    Works with anything from http://sub_do-main.a.co to things like https://a.a.a.a.aa/my-awesome_url?asd=12. You can try it at: https://regex101.com/r/oXFuGy/2

  2. IPv4, short and fast but not 100% accurate (it is used by validator.js): /^(\d{1,3}(\.|$)){4}/. It allows things like 999.999.999.999.

  3. IPv4, larger, slower but 100% accurate: ^(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2}))\.){3}(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2})))$. Found it here: https://stackoverflow.com/a/50650510/2862917.
  4. IPv6 (I don't know whether this is the best, most accurate, or fastest approach): ^(([\da-fA-F]{0,4}:){1,7}[\da-fA-F]{0,4})$. Found it here: "Regular expression that matches valid IPv6 addresses", whose first answer explains why it is better to stop using regex to validate IPs (at least v6 ones).
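To show the trade-off between patterns 2 and 3 in the list above, here is a small sketch that runs both IPv4 patterns over the same inputs, along with pattern 1 on the two example URLs. The names `alnumUrl`, `quickIPv4`, `strictIPv4`, and `compareIPv4` are assumptions made for this example.

```javascript
// Patterns copied from the list above; the variable names are hypothetical.
const alnumUrl = /^https?:\/\/([\w\d\-]+\.)+\w{2,}(\/.+)?$/;
const quickIPv4 = /^(\d{1,3}(\.|$)){4}/;
const strictIPv4 = /^(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2}))\.){3}(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2})))$/;

// Returns the verdict of both IPv4 patterns for one candidate string.
function compareIPv4(ip) {
  return { quick: quickIPv4.test(ip), strict: strictIPv4.test(ip) };
}

console.log(compareIPv4('192.168.0.1'));     // { quick: true, strict: true }
console.log(compareIPv4('999.999.999.999')); // { quick: true, strict: false }
console.log(compareIPv4('256.1.1.1'));       // { quick: true, strict: false }

console.log(alnumUrl.test('http://sub_do-main.a.co'));                  // true
console.log(alnumUrl.test('https://a.a.a.a.aa/my-awesome_url?asd=12')); // true
```

The quick pattern only checks the shape (four groups of one to three digits), while the strict one also bounds each octet at 255.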