PHP: How do you strip out the domain name from a URL in PHP?
Original question: http://stackoverflow.com/questions/176284/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
How do you strip out the domain name from a URL in PHP?
Asked by Robert Elwell
I'm looking for a method (or function) to strip out the domain.ext part of any URL that's fed into the function. The domain extension can be anything (.com, .co.uk, .nl, .whatever), and the URL that's fed into it can be anything from http://www.domain.com to www.domain.com/path/script.php?=whatever
What's the best way to go about doing this?
Answered by Robert Elwell
parse_url turns a URL into an associative array:
php > $foo = "http://www.example.com/foo/bar?hat=bowler&accessory=cane";
php > $blah = parse_url($foo);
php > print_r($blah);
Array
(
[scheme] => http
[host] => www.example.com
[path] => /foo/bar
[query] => hat=bowler&accessory=cane
)
Answered by DavidM
You can use parse_url() to do this:
$url = 'http://www.example.com';
$domain = parse_url($url, PHP_URL_HOST);
$domain = str_replace('www.','',$domain);
In this example, $domain ends up as example.com whether or not the URL included www. It also works for hosts under suffixes such as .co.uk.
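One caveat with the snippet above: str_replace() removes 'www.' wherever it appears in the host, and parse_url() finds no host when the input has no scheme (as with www.domain.com/path from the question). A minimal sketch that handles both cases follows. The function name host_without_www is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): return the host of a URL
// with a single leading "www." removed, or null if no host can be found.
function host_without_www(string $url): ?string
{
    // parse_url() only reports a host when a scheme or a leading "//" is present,
    // so prepend a scheme for inputs such as "www.domain.com/path".
    if (!preg_match('#^([a-z][a-z0-9+.-]*:)?//#i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }

    // Anchor the pattern so only a "www." at the very start is stripped.
    return preg_replace('/^www\./i', '', $host);
}
```

For example, host_without_www('www.domain.com/path/script.php?=whatever') gives domain.com, and a host such as awww.example.com is left untouched.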
Answered by firstresponder
You can also write a regular expression to get exactly what you want.
Here is my attempt at it:
$pattern = '/\w+\..{2,3}(?:\..{2,3})?(?:$|(?=\/))/i';
$url = 'http://www.example.com/foo/bar?hat=bowler&accessory=cane';
if (preg_match($pattern, $url, $matches) === 1) {
echo $matches[0];
}
The output is:
example.com
This pattern also takes into consideration domains such as 'example.com.au'.
Note: I have not consulted the relevant RFC.
Answered by Mark Shust at M.academy
Here are a couple of simple functions to get the root domain (example.com) from a normal or long domain (test.sub.domain.com) or URL (http://www.example.com).
/**
* Get root domain from full domain
* @param string $domain
*/
public function getRootDomain($domain)
{
$domain = explode('.', $domain);
$tld = array_pop($domain);
$name = array_pop($domain);
$domain = "$name.$tld";
return $domain;
}
/**
* Get domain name from url
* @param string $url
*/
public function getDomainFromUrl($url)
{
$domain = parse_url($url, PHP_URL_HOST);
$domain = $this->getRootDomain($domain);
return $domain;
}
Answered by z3ro
Solved this...
Say we're calling dev.mysite.com and we want to extract 'mysite.com'
$requestedServerName = $_SERVER['SERVER_NAME']; // = dev.mysite.com
$thisSite = explode('.', $requestedServerName); // site name now an array
array_shift($thisSite); //chop off the first array entry eg 'dev'
$thisSite = join('.', $thisSite); //join it back together with dots ;)
echo $thisSite; //outputs 'mysite.com'
This works with dev.mysite.co.uk too, because it only drops the first label. It does assume the host always starts with a subdomain: a bare mysite.com would come out as 'com'.
Answered by Oleksandr Fediashov
There is only one correct way to extract domain parts: use the Public Suffix List (a database of TLDs). I recommend the TLDExtract package; here is sample code:
$extract = new LayerShifter\TLDExtract\Extract();
$result = $extract->parse('www.domain.com/path/script.php?=whatever');
$result->getSubdomain(); // will return (string) 'www'
$result->getHostname(); // will return (string) 'domain'
$result->getSuffix(); // will return (string) 'com'
Answered by Mohamad Hamouday
This function should work:
function Delete_Domain_From_Url($Url = false)
{
if($Url)
{
$Url_Parts = parse_url($Url);
$Url = isset($Url_Parts['path']) ? $Url_Parts['path'] : '';
$Url .= isset($Url_Parts['query']) ? "?".$Url_Parts['query'] : '';
}
return $Url;
}
To use it:
$Url = "https://stackoverflow.com/questions/176284/how-do-you-strip-out-the-domain-name-from-a-url-in-php";
echo Delete_Domain_From_Url($Url);
# Output:
#/questions/176284/how-do-you-strip-out-the-domain-name-from-a-url-in-php
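The function above drops any #fragment along with the domain. If the fragment should survive, a variant along these lines might do. It is only a sketch, and the name path_query_fragment is hypothetical.

```php
<?php
// Hypothetical variant (not from the original answer): keep path, query string
// and fragment, dropping scheme, host, port and credentials.
function path_query_fragment(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false) {
        // parse_url() returns false for seriously malformed URLs.
        return '';
    }

    $out = $parts['path'] ?? '';
    if (isset($parts['query'])) {
        $out .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $out .= '#' . $parts['fragment'];
    }
    return $out;
}
```

Returning an empty string for malformed input is a design choice for this sketch; throwing an exception would be just as reasonable.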
Answered by livingtech
I spent some time thinking about whether it makes sense to use a regular expression for this, but in the end I think not.
firstresponder's regexp came close to convincing me it was the best way, but it didn't work on anything missing a trailing slash (so http://example.com, for instance). I fixed that with the following: '/\w+\..{2,3}(?:\..{2,3})?(?=[\/\W])/i', but then I realized that it matches twice for URLs like 'http://example.com/index.htm'. Oops. That wouldn't be so bad (just use the first one), but it also matches twice on something like this: 'http://abc.ed.fg.hij.kl.mn/', and the first match isn't the right one. :(
A co-worker suggested just getting the host (via parse_url()), and then just taking the last two or three array bits (explode() on '.'). The two or three would be based on a list of domains, like 'co.uk', etc. Making up that list becomes the hard part.
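The co-worker's suggestion can be sketched roughly as follows. The suffix list here is a tiny, made-up sample for illustration (an assumption, not a real data source), and the function name registrable_domain is hypothetical. A real list would have to come from something like the Public Suffix List.

```php
<?php
// Illustrative and deliberately incomplete list of two-label suffixes.
// This is an assumption for the sketch, not an authoritative list.
const TWO_LABEL_SUFFIXES = ['co.uk', 'org.uk', 'com.au', 'co.nz'];

// Hypothetical helper: keep the last two labels of a host, or the last three
// when the final two labels form a known two-label suffix.
function registrable_domain(string $host): string
{
    $labels = explode('.', strtolower($host));

    // Nothing to trim for hosts with one or two labels.
    if (count($labels) <= 2) {
        return implode('.', $labels);
    }

    $lastTwo = implode('.', array_slice($labels, -2));
    $keep = in_array($lastTwo, TWO_LABEL_SUFFIXES, true) ? 3 : 2;

    return implode('.', array_slice($labels, -$keep));
}
```

Pass it the host from parse_url($url, PHP_URL_HOST), not the full URL. With the sample list, dev.mysite.co.uk becomes mysite.co.uk and abc.ed.fg.hij.kl.mn becomes kl.mn.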

