JavaScript: What is a good regular expression to match a URL?
Notice: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/3809401/
Asked by bigbob
Currently I have an input box which will detect the URL and parse the data.
So right now, I am using:
var urlR = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
var url = content.match(urlR);
The problem is, when I enter a URL like www.google.com, it's not working. When I enter http://www.google.com, it is working.
I am not very fluent in regular expressions. Can anyone help me?
Answered by Daveo
Regex if you want to ensure URL starts with HTTP/HTTPS:
https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)
If you do not require HTTP protocol:
[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)
To try this out see http://regexr.com?37i6s, or for a version which is less restrictive http://regexr.com/3e6m0.
Example JavaScript implementation:
var expression = /[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)?/gi;
var regex = new RegExp(expression);
var t = 'www.google.com';

// match() returns null when nothing in the string looks like a URL
if (t.match(regex)) {
  alert("Successful match");
} else {
  alert("No match");
}
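Note that the patterns above are unanchored, so a match succeeds whenever any substring of the input looks like a URL. As a minimal sketch (the variable names and the anchored helper are illustrative and not part of the answer), wrapping a pattern in ^...$ checks the whole input instead:

```javascript
// Patterns copied from the answer above; variable and helper names are illustrative.
var withProtocol = /https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)/;
var withoutProtocol = /[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)/;

// Hypothetical helper: build a copy of the pattern that must match the entire string.
function anchored(re) {
  return new RegExp('^(?:' + re.source + ')$', 'i');
}

var sample = 'see www.google.com now';
console.log(withoutProtocol.test(sample));                          // true: finds the URL inside the text
console.log(anchored(withoutProtocol).test(sample));                // false: the surrounding words are rejected
console.log(anchored(withoutProtocol).test('www.google.com'));      // true
console.log(anchored(withProtocol).test('www.google.com'));         // false: no http:// or https://
console.log(anchored(withProtocol).test('http://www.google.com'));  // true
```

The unanchored form suits detecting URLs inside free text; the anchored form suits validating a single input field.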
Answered by foufos
(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})
Will match the following cases
http://www.foufos.gr
https://www.foufos.gr
http://foufos.gr
http://www.foufos.gr/kino
http://werer.gr
www.foufos.gr
www.mp3.com
www.t.co
http://t.co
http://www.t.co
https://www.t.co
www.aa.com
http://aa.com
http://www.aa.com
https://www.aa.com
Will NOT match the following
www.foufos
www.foufos-.gr
www.-foufos.gr
foufos.gr
http://www.foufos
http://foufos
www.mp3#.com
var expression = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})/gi;
var regex = new RegExp(expression);
var check = [
  'http://www.foufos.gr',
  'https://www.foufos.gr',
  'http://foufos.gr',
  'http://www.foufos.gr/kino',
  'http://werer.gr',
  'www.foufos.gr',
  'www.mp3.com',
  'www.t.co',
  'http://t.co',
  'http://www.t.co',
  'https://www.t.co',
  'www.aa.com',
  'http://aa.com',
  'http://www.aa.com',
  'https://www.aa.com',
  'www.foufos',
  'www.foufos-.gr',
  'www.-foufos.gr',
  'foufos.gr',
  'http://www.foufos',
  'http://foufos',
  'www.mp3#.com'
];
check.forEach(function(entry) {
  if (entry.match(regex)) {
    $("#output").append("<div>Success: " + entry + "</div>");
  } else {
    $("#output").append("<div>Fail: " + entry + "</div>");
  }
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div id="output"></div>
Check it in rubular - NEW version
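Because this pattern is unanchored and ends in [^\s]{2,}, it can also pull URLs out of running text, though it will swallow any punctuation that directly follows a URL. A minimal sketch of that use, assuming a hypothetical helper named extractUrls and a trailing-punctuation trim that is not part of the answer:

```javascript
// Pattern copied from the answer above, with the g flag so match() returns every hit.
var urlPattern = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})/gi;

// Hypothetical helper: return all URL-like substrings, minus trailing punctuation
// that [^\s]{2,} would otherwise capture (for example the comma in "foufos.gr,").
function extractUrls(text) {
  var hits = text.match(urlPattern) || []; // match() yields null when there are no hits
  return hits.map(function (hit) {
    return hit.replace(/[.,;:!?]+$/, '');
  });
}

console.log(extractUrls('Visit www.foufos.gr, then http://t.co today.'));
// logs [ 'www.foufos.gr', 'http://t.co' ]
```

The trim is a judgment call: it also strips punctuation that legitimately ends a URL, so adjust the character set to the kind of text being scanned.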
Answered by Michael Connor
These are the droids you're looking for. This is taken from validator.js, which is the library you should really use to do this. But if you want to roll your own, who am I to stop you? If you want pure regex, you can just take out the length check. I think it's a good idea to test the length of the URL, though, if you really want to determine compliance with the spec.
function isURL(str) {
  // Written as a regex literal so that \S, \d and \. reach the regex engine intact;
  // inside a quoted string they would have to be doubled (\\S, \\d, \\.).
  var urlRegex = /^(?!mailto:)(?:(?:http|https|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?:(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[0-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))|localhost)(?::\d{2,5})?(?:(\/|\?|#)[^\s]*)?$/i;
  // Length cap first, then the pattern check
  return str.length < 2083 && urlRegex.test(str);
}
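If a regex is not a hard requirement, the platform's own parser is another option. The sketch below is an assumption-laden alternative, not what validator.js does: it relies on the standard URL constructor (available in modern browsers and Node.js), and the function name looksLikeUrl and the allowed-protocol list are made up for illustration. The length cap mirrors the one in the function above.

```javascript
// Hypothetical helper: let the built-in URL parser decide, then restrict the scheme.
function looksLikeUrl(str) {
  if (str.length >= 2083) {
    return false; // same length cap as the isURL function above
  }
  try {
    var parsed = new URL(str); // throws a TypeError for input it cannot parse as an absolute URL
    return ['http:', 'https:', 'ftp:'].indexOf(parsed.protocol) !== -1;
  } catch (e) {
    return false;
  }
}

console.log(looksLikeUrl('http://www.google.com'));      // true
console.log(looksLikeUrl('www.google.com'));             // false: no scheme, so parsing fails
console.log(looksLikeUrl('mailto:someone@example.com')); // false: scheme not in the allowed list
```

The parser is more lenient than the regex in some respects (for example, it does not require a dot in the host name), so the two approaches do not accept exactly the same inputs.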
Answered by Amar Palsapure
Another possible solution; the solution above failed for me when parsing query string parameters.
// Backslashes are doubled because the pattern is passed to RegExp as a string
var regex = new RegExp("^(http[s]?:\\/\\/(www\\.)?|ftp:\\/\\/(www\\.)?|www\\.){1}([0-9A-Za-z-\\.@:%_\\+~#=]+)+((\\.[a-zA-Z]{2,3})+)(/(.)*)?(\\?(.)*)?");
if (regex.test("http://google.com")) {
  alert("Successful match");
} else {
  alert("No match");
}
In this solution, feel free to modify [-0-9A-Za-z\.@:%_\+~#= to match the domain or subdomain name. Query string parameters are also handled.
If you are not using RegEx, replace \\ with \ in the expression.
Hope this helps.
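The advice about \\ and \ comes down to string escaping: inside a quoted string, a single backslash before a character such as . is consumed by the string literal and never reaches the regex engine. A small sketch with made-up variable names shows the difference:

```javascript
// In a string literal "\." is just ".", so the regex sees an unescaped dot.
var lost = new RegExp("\.com$");   // behaves like /.com$/ : the dot matches any character
var kept = new RegExp("\\.com$");  // behaves like /\.com$/ : the dot is literal
var literal = /\.com$/;            // regex literal: single backslashes are enough

console.log(lost.test('xcom'));              // true: "x" satisfies the unescaped dot
console.log(kept.test('xcom'));              // false: a real "." is required before "com"
console.log(kept.source === literal.source); // true: both compile to the same pattern
```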
Answered by Roman
Try this:
(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?
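A quick way to see what this pattern accepts is to run it against a few inputs. The sample strings below are illustrative; note that the pattern requires a scheme and that its host part, (\S+), is quite permissive:

```javascript
// Pattern copied from the answer above; the sample inputs are made up for illustration.
var pattern = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;

var samples = [
  'http://www.google.com',      // true
  'ftp://example.org/file.txt', // true
  'www.google.com',             // false: no scheme
  'http://foufos'               // true: (\S+) does not insist on a dot or a TLD
];

samples.forEach(function (sample) {
  console.log(sample + ' -> ' + pattern.test(sample));
});
```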
Answered by Eric Heikkinen
I was trying to put together some JavaScript to validate a domain name (e.g. google.com) and, if it validates, enable a submit button. I thought I would share my code for those who are looking to accomplish something similar. It expects a domain without any http:// or www. prefix. The script uses a stripped-down regular expression from above for domain matching, which isn't strict about fake TLDs.
$(function () {
  $('#whitelist_add').keyup(function () {
    if ($(this).val() == '') { // Check to see if there is any text entered
      // If there is no text within the input, disable the button
      $('.whitelistCheck').attr('disabled', 'disabled');
    } else {
      // Domain name regular expression (backslashes doubled because it is a string)
      var regex = new RegExp("^([0-9A-Za-z-\\.@:%_\\+~#=]+)+((\\.[a-zA-Z]{2,3})+)(/(.)*)?(\\?(.)*)?");
      if (regex.test($(this).val())) {
        // Domain looks OK
        //alert("Successful match");
        $('.whitelistCheck').removeAttr('disabled');
      } else {
        // Domain is NOT OK
        //alert("No match");
        $('.whitelistCheck').attr('disabled', 'disabled');
      }
    }
  });
});
HTML FORM:
<form action="domain_management.php" method="get">
<input type="text" name="whitelist_add" id="whitelist_add" placeholder="domain.com">
<button type="submit" class="btn btn-success whitelistCheck" disabled='disabled'>Add to Whitelist</button>
</form>