Original question: http://stackoverflow.com/questions/24924072/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Website/URL Validation Regex in Java
Asked by Hao Ting
I need a regex to match URLs that start with "http://", "https://", or "www.", as well as bare domains such as "google.com".
The code I tried is:
//Pattern to check if this is a valid URL address
Pattern p = Pattern.compile("(http://|https://)(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?");
Matcher m;
m=p.matcher(urlAddress);
But this code only matches URLs such as "http://www.google.com".
I know this may be a duplicate question, but I have tried all of the regexes provided and none of them suits my requirement. Will someone please help me? Thank you.
Accepted answer by Avinash Raj
Answer by raghavsood33
A Java-compatible version of @Avinash's answer would be:
//Pattern to check if this is a valid URL address
Pattern p = Pattern.compile("^(http://|https://)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$");
Matcher m;
m=p.matcher(urlAddress);
boolean matches = m.matches();
Answer by Raj Hassani
You can use the Apache Commons Validator library (org.apache.commons.validator.UrlValidator; newer releases deprecate it in favor of org.apache.commons.validator.routines.UrlValidator) to validate a URL:
String[] schemes = {"http", "https"}; // accepted URL schemes
UrlValidator urlValidator = new UrlValidator(schemes);
Then call:
boolean valid = urlValidator.isValid(urlAddress); // urlAddress is the string to check
Then there is no need for a regex.
Answer by KnechtRootrecht
If you use Java, I recommend this regex (I wrote it myself):
^(https?:\/\/)?(www\.)?([\w]+\.)+[\w]{2,63}\/?$
"^(https?:\\/\\/)?(www\\.)?([\\w]+\\.)+[\\w]{2,63}\\/?$" // as a Java string
To explain:
- ^ = start of the input
- (https?:\/\/)? = "http://" or "https://" may occur.
- (www\.)? = "www." may occur.
- ([\w]+\.)+ = a word (\w, i.e. [a-zA-Z0-9_]) followed by a dot has to occur one or more times. Extend the character class here if you need special characters such as ü in your URL, and remember to use IDN.toASCII(url) if you do. For the characters that are legal in general, see https://kb.ucla.edu/articles/what-characters-can-go-into-a-valid-http-url
- [\w]{2,63} = a word of 2 to 63 characters has to occur exactly once. A TLD (top-level domain, for example .com) cannot be shorter than 2 or longer than 63 characters.
- \/? = a "/" character may occur. (Some people or servers put a / at the end.)
- $ = end of the input
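The pieces explained above can be tried out with a small sketch like the following. The class name UrlRegexDemo and the helper isUrl are made up for illustration, and the sketch assumes the TLD character class is plain [\w]; it only shows how the pattern behaves with Matcher.matches(), not a complete URL validator.

```java
import java.util.regex.Pattern;

public class UrlRegexDemo {
    // The pattern from this answer, written as a Java string (backslashes doubled).
    static final Pattern URL =
            Pattern.compile("^(https?:\\/\\/)?(www\\.)?([\\w]+\\.)+[\\w]{2,63}\\/?$");

    // Hypothetical helper: true if the whole input matches the pattern.
    static boolean isUrl(String candidate) {
        return URL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.google.com", "https://google.com", "www.google.com",
            "google.com", "google.com/", "google", "test..com"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isUrl(s));
        }
    }
}
```

Since the pattern has no part for paths, an input such as "http://google.com/path" is rejected; only an optional trailing slash is allowed.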
If you extend it with special characters, it could look like this:
^(https?:\/\/)?(www\.)?([\w\Q$-_+!*'(),%\E]+\.)+[\w]{2,63}\/?$
"^(https?:\\/\\/)?(www\\.)?([\\w\\Q$-_+!*'(),%\\E]+\\.)+[\\w]{2,63}\\/?$" // as a Java string
Avinash Raj's answer is not fully correct.
^(http:\/\/|https:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$
The dots are not escaped, which means each of them matches any character. My version is also simpler, and I have never heard of a domain like "test..com" (which that pattern actually matches).
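To make the effect of the unescaped dots concrete, here is a minimal sketch that runs both patterns over a few inputs. The class name DotEscapeDemo and its two helper methods are assumptions for illustration, and the escaped pattern is assumed to use a plain [\w] class for the TLD.

```java
import java.util.regex.Pattern;

public class DotEscapeDemo {
    // Pattern with unescaped dots: every "." matches any character.
    static final Pattern UNESCAPED = Pattern.compile(
            "^(http:\\/\\/|https:\\/\\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$");

    // Pattern with escaped dots: "\\." matches only a literal dot.
    static final Pattern ESCAPED = Pattern.compile(
            "^(https?:\\/\\/)?(www\\.)?([\\w]+\\.)+[\\w]{2,63}\\/?$");

    static boolean matchesUnescaped(String s) {
        return UNESCAPED.matcher(s).matches();
    }

    static boolean matchesEscaped(String s) {
        return ESCAPED.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"google.com", "test..com", "googlecom"}) {
            System.out.println(s + ": unescaped=" + matchesUnescaped(s)
                    + ", escaped=" + matchesEscaped(s));
        }
    }
}
```

The unescaped pattern accepts "test..com" and even "googlecom", which contains no dot at all, while the escaped pattern rejects both and still accepts "google.com".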
Demo: https://regex101.com/r/vM7wT6/279
Edit: Since I saw that some people need a regex that also matches server directories, I wrote this:
^(https?:\/\/)?([\w\Q$-_+!*'(),%\E]+\.)+(\w{2,63})(:\d{1,4})?([\w\Q/$-_+!*'(),%\E]+\.?[\w])*\/?$
This may not be the best one, since I didn't spend much time on it, but maybe it helps someone. You can see how it works here: https://regex101.com/r/vM7wT6/700
It also matches URLs like "hello.to/test/whatever.cgi".
Answer by darkwinter
pattern="w{3}\.[a-z]+\.?[a-z]{2,3}(|\.[a-z]{2,3})"
This only accepts addresses such as www.google.com and www.google.co.in.
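A quick way to check that claim in Java is sketched below. The class name WwwPatternDemo and the helper accepts are hypothetical, and the sketch assumes the pattern is applied to the whole input with Matcher.matches().

```java
import java.util.regex.Pattern;

public class WwwPatternDemo {
    // The pattern from this answer, with backslashes doubled for a Java string.
    static final Pattern WWW =
            Pattern.compile("w{3}\\.[a-z]+\\.?[a-z]{2,3}(|\\.[a-z]{2,3})");

    // Hypothetical helper: true if the whole input matches.
    static boolean accepts(String address) {
        return WWW.matcher(address).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {
                "www.google.com", "www.google.co.in", "google.com", "http://www.google.com"}) {
            System.out.println(s + " -> " + accepts(s));
        }
    }
}
```

Addresses without the leading "www." or with a scheme prefix are rejected, so this pattern covers only part of what the question asks for.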
Answer by Anabel Berumen
// I use this
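static boolean esURL(String cadena){
    // "\\b" is the regex word boundary; a single "\b" in a Java string literal is a backspace character.
    // "https?://" accepts both "http://" and "https://"; "www\\." matches a literal "www.".
    return cadena.matches("\\b(https?://|ftp://|file://|www\\.)[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");
}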