Java regex to match all subdomains of a matched domain

Notice: this content comes from a popular StackOverFlow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverFlow. Original address: http://stackoverflow.com/questions/19272892/

Time: 2020-08-12 15:36:54  Source: igfitidea

Regex to match all subdomains of a matched domain

java regex

Asked by Exception

I have a regex to match the subdomains of a web page, like below:

 "^https://[^/?]+\.(sub1|sub2\.)domain\.com"

What would be the regex to accept any subdomain of domain.com?

Edit:

My question was incomplete. My regex was meant to accept only

 https://[any number of subdomains].sub1domain.com

or

 https://[any number of subdomains].sub2domain.com

Sorry for posting an incomplete question.

Accepted answer by sp00m

This one should suit your needs:

https?://([a-z0-9]+[.])*sub[12]domain[.]com
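
A minimal sketch of how this pattern might be used from Java. The class and method names are hypothetical and the case-insensitive flag is an added assumption, not part of the answer. Note that the pattern also accepts http, and because of the * it accepts the bare host with no subdomain at all.

```java
import java.util.regex.Pattern;

class Sub12DomainMatcher {
    // Pattern text taken from the answer; CASE_INSENSITIVE is an assumption
    // added here because host names are not case sensitive.
    private static final Pattern ALLOWED = Pattern.compile(
            "https?://([a-z0-9]+[.])*sub[12]domain[.]com",
            Pattern.CASE_INSENSITIVE);

    // matches() requires the whole input to match, so no ^ or $ is needed.
    static boolean isAllowed(String url) {
        return ALLOWED.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("https://a.b.sub1domain.com")); // true
        System.out.println(isAllowed("https://sub2domain.com"));     // true
        System.out.println(isAllowed("https://a.sub3domain.com"));   // false
    }
}
```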

Answer by BAD_SEED

Something like:

(http|https)://(.*).domain.com

At this point the second capture group (i.e. \2 or the $2 variable) is what you need. Note that this regex does not validate the URL.

Proof: https://www.debuggex.com/r/3KYGmAnlnBq3C_fT
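
In Java the second group can be read with Matcher.group(2). The following is only a sketch: the class and method names are made up for illustration, and the dots are escaped here (an adjustment to the pattern as posted) so that they match a literal '.' only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubdomainExtractor {
    // Group 1 is the scheme, group 2 is everything between "://" and ".domain.com".
    // Escaping the dots is an assumption about the intent of the answer.
    private static final Pattern URL =
            Pattern.compile("(http|https)://(.*)\\.domain\\.com");

    // Returns the text captured by group 2, or null when the input does not match.
    static String subdomainPart(String url) {
        Matcher m = URL.matcher(url);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(subdomainPart("https://a.b.domain.com")); // a.b
        System.out.println(subdomainPart("https://domain.com"));     // null
    }
}
```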

Answer by Rhand

Assuming the subdomains contain only digits and lowercase letters and you do not want to accept sub-subdomains:

[0-9a-z]*\.domain\.com

Update:

https://.*\.sub[12]domain\.com

matches

https://sub1.sub2.sub1domain.com 
https://sub1.sub1domain.com 

but not

https://sub1domain.com 
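
The three cases above can be checked with a short Java sketch. The class name is hypothetical, and the pattern is written with the character class [12] and with backslashes doubled for a Java string literal.

```java
import java.util.regex.Pattern;

class UpdatedPatternCheck {
    // ".*\\." forces at least one dot-terminated part before sub1domain/sub2domain.
    private static final Pattern P =
            Pattern.compile("https://.*\\.sub[12]domain\\.com");

    static boolean accepts(String url) {
        return P.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("https://sub1.sub2.sub1domain.com")); // true
        System.out.println(accepts("https://sub1.sub1domain.com"));      // true
        System.out.println(accepts("https://sub1domain.com"));           // false
    }
}
```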

Answer by AlexR

You would use

"^https://[^/?]+\.([^.]+)\.domain\.com"

which boils down to matching

"[^.]+"

for any subdomain. Group 1 captures only the last label of the subdomain (for www.xxx.domain.com it captures "xxx").
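
A small sketch of the capture behaviour, with hypothetical class and method names. One consequence worth noticing: because the pattern requires [^/?]+\. before the captured label, a host with a single subdomain level such as xxx.domain.com does not match at all.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastLabelCapture {
    private static final Pattern P =
            Pattern.compile("^https://[^/?]+\\.([^.]+)\\.domain\\.com");

    // Returns the label directly in front of ".domain.com", or null if there is no match.
    static String lastLabel(String url) {
        Matcher m = P.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(lastLabel("https://www.xxx.domain.com")); // xxx
        System.out.println(lastLabel("https://xxx.domain.com"));     // null
    }
}
```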

Answer by Josh

Try http://([^.]+\\.)+sub[12]domain\\.com (the backslashes are doubled as they would be inside a Java string literal). A great place for testing regexes with minimal setup pain is RegexPlanet.
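
To illustrate the escaping: in Java source the regex \. has to be typed as "\\.". The sketch below uses hypothetical names and escapes the dot before com as well. The pattern as given covers http only, so the scheme would need to be changed for https URLs.

```java
import java.util.regex.Pattern;

class EscapingDemo {
    // The string literal "\\." is the two-character regex \. (a literal dot).
    static final String REGEX = "http://([^.]+\\.)+sub[12]domain\\.com";

    static boolean accepts(String url) {
        // Pattern.matches compiles the regex and requires a full match.
        return Pattern.matches(REGEX, url);
    }

    public static void main(String[] args) {
        System.out.println(accepts("http://a.b.sub1domain.com")); // true
        System.out.println(accepts("http://sub1domain.com"));     // false
    }
}
```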

Answer by fred02138

I'm assuming that you don't want the subdomains to differ simply by a number. Use this regex:

(^https:\/\/(?:[\w\-\_]+\.)+(?:subdomain1|subdomain2)\.com)

The single capture group is the full URL. Simply replace subdomain1 and subdomain2 with your actual subdomains.
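
If the list of accepted domains changes often, the alternation could be built at runtime instead of edited by hand. This is a sketch under assumptions, not the answer's code: the class and method names are hypothetical, the list holds full domain names such as sub1domain.com, and [\w\-\_] is shortened to [\w-] because \w already includes the underscore.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class AllowedDomains {
    // Builds ^https://(?:[\w-]+\.)+(?:d1|d2|...) from the given domain names.
    static Pattern build(List<String> domains) {
        String alternation = domains.stream()
                .map(Pattern::quote) // quote so that '.' in a name is taken literally
                .collect(Collectors.joining("|"));
        return Pattern.compile("^https://(?:[\\w-]+\\.)+(?:" + alternation + ")");
    }

    public static void main(String[] args) {
        Pattern p = build(List.of("sub1domain.com", "sub2domain.com"));
        // Like the original, the pattern is anchored only at the start, so use find().
        System.out.println(p.matcher("https://a.b.sub1domain.com").find()); // true
        System.out.println(p.matcher("https://a.sub3domain.com").find());   // false
    }
}
```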

I tested this on regex101.com

Answer by SkateScout

Here is a regex that matches any number of subdomains, also allows IDN domains, and enforces the limit of 63 characters or fewer per label. It also checks that - is not in the first or last position of a label.

https?://([a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?[.])*sub[12][.]domain[.]com/
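
The per-label rule can be looked at in isolation. The sketch below uses hypothetical names and writes the middle quantifier as {0,61} so that two-character labels also pass. Since only a-z, 0-9 and - are allowed, IDN labels would have to be supplied in their ASCII (xn--) form.

```java
import java.util.regex.Pattern;

class DnsLabelCheck {
    // One leading alphanumeric, optionally up to 61 alphanumerics or hyphens
    // followed by a closing alphanumeric: 1 to 63 characters in total.
    private static final Pattern LABEL =
            Pattern.compile("[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?");

    static boolean isValidLabel(String label) {
        return LABEL.matcher(label).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidLabel("xn--bcher-kva")); // true
        System.out.println(isValidLabel("-ab"));           // false
        System.out.println(isValidLabel("ab-"));           // false
    }
}
```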