PHP: parse the domain from a URL

Question

Asked by zuk1

I need to build a function which parses the domain from a URL.

So, with

http://google.com/dhasjkdas/sadsdds/sdda/sdads.html

or

http://www.google.com/dhasjkdas/sadsdds/sdda/sdads.html

it should return google.com

with

http://google.co.uk/dhasjkdas/sadsdds/sdda/sdads.html

it should return google.co.uk.

Answer 1

Answered by Owen

Check out parse_url():

$url = 'http://google.com/dhasjkdas/sadsdds/sdda/sdads.html';
$parse = parse_url($url);
echo $parse['host']; // prints 'google.com'

parse_url doesn't handle badly mangled URLs very well, but it is fine if you generally expect well-formed URLs.
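
As a rough sketch of how to guard against input that parse_url cannot make sense of: the wrapper below is hypothetical (the name hostOrNull is not from the answer above) and only assumes the documented behaviour that parse_url returns false for seriously malformed URLs and null when the requested component is absent.

```php
<?php
// Hypothetical wrapper (name is an assumption): returns the host,
// or null when parse_url() cannot find one.
function hostOrNull(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    // parse_url() gives false for seriously malformed URLs and null when
    // the host component is absent (for example, when the scheme is missing).
    return (is_string($host) && $host !== '') ? $host : null;
}

var_dump(hostOrNull('http://google.com/dhasjkdas/sadsdds/sdda/sdads.html')); // string(10) "google.com"
var_dump(hostOrNull('google.com/dhasjkdas'));                                // NULL
```

Returning null instead of letting false or an undefined index leak out keeps the calling code to a single check.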

Answer 2

Answered by Alix Axel

$domain = str_ireplace('www.', '', parse_url($url, PHP_URL_HOST));

This would return google.com for both http://google.com/... and http://www.google.com/...
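
One caveat worth sketching: str_ireplace removes every occurrence of 'www.', not only a leading one. The helper below is a hypothetical variant (the name stripLeadingWww is an assumption) that anchors the match to the start of the host.

```php
<?php
// Hypothetical helper (name is an assumption): drop one leading "www." label,
// case-insensitively, and leave the rest of the host untouched.
function stripLeadingWww(string $host): string
{
    return preg_replace('/^www\./i', '', $host) ?? $host;
}

$url = 'http://www.google.com/dhasjkdas/sadsdds/sdda/sdads.html';
echo stripLeadingWww(parse_url($url, PHP_URL_HOST)), "\n"; // google.com

// str_ireplace() also touches "www." at the end of a longer label:
echo str_ireplace('www.', '', 'awww.example.com'), "\n";   // aexample.com
echo stripLeadingWww('awww.example.com'), "\n";            // awww.example.com
```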

Answer 3

Answered by philfreo

From http://us3.php.net/manual/en/function.parse-url.php#93983

for some odd reason, parse_url returns the host (ex. example.com) as the path when no scheme is provided in the input url. So I've written a quick function to get the real host:

function getHost($Address) { 
   $parseUrl = parse_url(trim($Address)); 
   return trim($parseUrl['host'] ? $parseUrl['host'] : array_shift(explode('/', $parseUrl['path'], 2))); 
} 

getHost("example.com"); // Gives example.com 
getHost("http://example.com"); // Gives example.com 
getHost("www.example.com"); // Gives www.example.com 
getHost("http://example.com/xyz"); // Gives example.com

Answer 4

Answered by Shaun

The code that was meant to work 100% didn't cut it for me. I patched the example a little, but found code that wasn't helping and other problems with it, so I changed it into a couple of functions (to avoid requesting the list from Mozilla every time, and to remove the cache system). This has been tested against a set of 1000 URLs and seemed to work.

function domain($url)
{
    global $subtlds;
    $slds = "";
    $url = strtolower($url);

    $host = parse_url('http://'.$url,PHP_URL_HOST);

    preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
    foreach($subtlds as $sub){
        if (preg_match('/\.'.preg_quote($sub).'$/', $host, $xyz)){
            preg_match("/[^\.\/]+\.[^\.\/]+\.[^\.\/]+$/", $host, $matches);
        }
    }

    return @$matches[0];
}

function get_tlds() {
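    $address = 'http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1';
    $subtlds = array(); // initialise so array_merge() below never receives null
    $content = file($address);
    if ($content === false) $content = array(); // download failed: only the built-in list below is used
    foreach ($content as $num => $line) {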
        $line = trim($line);
        if($line == '') continue;
        if(@substr($line[0], 0, 2) == '/') continue;
        $line = @preg_replace("/[^a-zA-Z0-9\.]/", '', $line);
        if($line == '') continue;  //$line = '.'.$line;
        if(@$line[0] == '.') $line = substr($line, 1);
        if(!strstr($line, '.')) continue;
        $subtlds[] = $line;
        //echo "{$num}: '{$line}'"; echo "<br>";
    }

    $subtlds = array_merge(array(
            'co.uk', 'me.uk', 'net.uk', 'org.uk', 'sch.uk', 'ac.uk', 
            'gov.uk', 'nhs.uk', 'police.uk', 'mod.uk', 'asn.au', 'com.au',
            'net.au', 'id.au', 'org.au', 'edu.au', 'gov.au', 'csiro.au'
        ), $subtlds);

    $subtlds = array_unique($subtlds);

    return $subtlds;    
}

Then use it like this:

$subtlds = get_tlds();
echo domain('www.example.com'); //outputs: example.com
echo domain('www.example.uk.com'); //outputs: example.uk.com
echo domain('www.example.fr'); //outputs: example.fr

I know I should have turned this into a class, but didn't have time.
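
For what it's worth, a class-based version might look roughly like the sketch below. The class name SuffixMatcher and the tiny hardcoded suffix list are illustrative assumptions, not part of the code above; a real list would come from the full suffix file, and this sketch only handles two-label suffixes. It assumes PHP 7.4 or later for the typed property.

```php
<?php
// Hypothetical class (name and suffix list are assumptions for illustration).
final class SuffixMatcher
{
    /** @var array<string, true> two-label suffixes, used as a lookup set */
    private array $suffixes;

    public function __construct(array $suffixes)
    {
        $this->suffixes = array_fill_keys(array_map('strtolower', $suffixes), true);
    }

    public function domain(string $urlOrHost): ?string
    {
        // Add a scheme when it is missing so parse_url() fills in the host.
        if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $urlOrHost)) {
            $urlOrHost = 'http://' . $urlOrHost;
        }
        $host = parse_url($urlOrHost, PHP_URL_HOST);
        if (!is_string($host) || $host === '') {
            return null;
        }
        $labels = explode('.', strtolower($host));
        $count = count($labels);
        if ($count < 2) {
            return null;
        }
        // If the last two labels form a known suffix, keep three labels.
        $lastTwo = $labels[$count - 2] . '.' . $labels[$count - 1];
        $keep = (isset($this->suffixes[$lastTwo]) && $count >= 3) ? 3 : 2;
        return implode('.', array_slice($labels, -$keep));
    }
}

$matcher = new SuffixMatcher(['co.uk', 'org.uk', 'com.au', 'uk.com']);
echo $matcher->domain('www.example.com'), "\n";         // example.com
echo $matcher->domain('www.example.uk.com'), "\n";      // example.uk.com
echo $matcher->domain('http://news.bbc.co.uk/x'), "\n"; // bbc.co.uk
```

Keeping the suffixes as array keys makes each lookup a single isset() call instead of one regular expression per suffix.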

Answer 5

Answered by nikmauro

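function get_domain($url = SITE_URL) // SITE_URL is a constant defined elsewhere in the application
{
    // Match one label followed by a 2-6 character tail of letters and dots at the end of the host.
    if (!preg_match("/[a-z0-9\-]{1,63}\.[a-z\.]{2,6}$/", (string) parse_url($url, PHP_URL_HOST), $_domain_tld)) {
        return null; // no host, or nothing matched
    }
    return $_domain_tld[0];
}

// Note: the tail also matches dots, so when the last two labels fit in
// 6 characters (as 'cdl.gr' does) the match starts one label earlier.
get_domain('http://www.cdl.gr'); //www.cdl.gr
get_domain('http://cdl.gr'); //cdl.gr
get_domain('http://www2.cdl.gr'); //www2.cdl.gr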

Answer 6

Answered by Oleksandr Fediashov

If you want to extract the host from the string http://google.com/dhasjkdas/sadsdds/sdda/sdads.html, parse_url() is an acceptable solution.

But if you want to extract the domain or its parts, you need a package that uses the Public Suffix List. Yes, you can use string functions around parse_url(), but they will sometimes produce incorrect results.
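
As a small illustration of that failure mode, here is a sketch of one common string-based approach: keeping the last two labels of the host. The function name naiveDomain is an assumption made up for this example.

```php
<?php
// Hypothetical naive helper (name is an assumption): keep the last two
// dot-separated labels of the host.
function naiveDomain(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    return implode('.', array_slice(explode('.', $host), -2));
}

echo naiveDomain('http://www.google.com/path'), "\n";   // google.com
echo naiveDomain('http://www.google.co.uk/path'), "\n"; // co.uk (a public suffix, not the registrable domain)
```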

I recommend TLDExtract for domain parsing; here is sample code that shows the difference:

$extract = new LayerShifter\TLDExtract\Extract();

# For 'http://google.com/dhasjkdas/sadsdds/sdda/sdads.html'

$url = 'http://google.com/dhasjkdas/sadsdds/sdda/sdads.html';

parse_url($url, PHP_URL_HOST); // will return 'google.com'

$result = $extract->parse($url);
$result->getFullHost(); // will return 'google.com'
$result->getRegistrableDomain(); // will return 'google.com'
$result->getSuffix(); // will return 'com'

# For 'http://search.google.com/dhasjkdas/sadsdds/sdda/sdads.html'

$url = 'http://search.google.com/dhasjkdas/sadsdds/sdda/sdads.html';

parse_url($url, PHP_URL_HOST); // will return 'search.google.com'

$result = $extract->parse($url);
$result->getFullHost(); // will return 'search.google.com'
$result->getRegistrableDomain(); // will return 'google.com'

Answer 7

Answered by fatih

I've found that @philfreo's solution (referenced from php.net) gives good results, but in some cases it triggers PHP "Notice" and "Strict Standards" messages. Here is a fixed version of that code.

function getHost($url) { 
   $parseUrl = parse_url(trim($url)); 
   if(isset($parseUrl['host']))
   {
       $host = $parseUrl['host'];
   }
   else
   {
        $path = explode('/', $parseUrl['path']);
        $host = $path[0];
   }
   return trim($host); 
} 

echo getHost("http://example.com/anything.html");           // example.com
echo getHost("http://www.example.net/directory/post.php");  // www.example.net
echo getHost("https://example.co.uk");                      // example.co.uk
echo getHost("www.example.net");                            // www.example.net
echo getHost("subdomain.example.net/anything");             // subdomain.example.net
echo getHost("example.net");                                // example.net

Answer 8

Answered by Kristoffer Bohmann

Please consider replacing the accepted solution with the following:

parse_url() will always include any sub-domain(s), so this function doesn't parse domain names very well. Here are some examples:

$url = 'http://www.google.com/dhasjkdas/sadsdds/sdda/sdads.html';
$parse = parse_url($url);
echo $parse['host']; // prints 'www.google.com'

echo parse_url('https://subdomain.example.com/foo/bar', PHP_URL_HOST);
// Output: subdomain.example.com

echo parse_url('https://subdomain.example.co.uk/foo/bar', PHP_URL_HOST);
// Output: subdomain.example.co.uk

Instead, you may consider this pragmatic solution. It will cover many, but not all domain names -- for instance, lower-level domains such as 'sos.state.oh.us' are not covered.

function getDomain($url) {
    $host = parse_url($url, PHP_URL_HOST);

    if(filter_var($host,FILTER_VALIDATE_IP)) {
        // IP address returned as domain
        return $host; //* or replace with null if you don't want an IP back
    }

    $domain_array = explode(".", str_replace('www.', '', $host));
    $count = count($domain_array);
    if( $count>=3 && strlen($domain_array[$count-2])==2 ) {
        // SLD (example.co.uk)
        return implode('.', array_splice($domain_array, $count-3,3));
    } else if( $count>=2 ) {
        // TLD (example.com)
        return implode('.', array_splice($domain_array, $count-2,2));
    }
}

// Your domains
    echo getDomain('http://google.com/dhasjkdas/sadsdds/sdda/sdads.html'); // google.com
    echo getDomain('http://www.google.com/dhasjkdas/sadsdds/sdda/sdads.html'); // google.com
    echo getDomain('http://google.co.uk/dhasjkdas/sadsdds/sdda/sdads.html'); // google.co.uk

// TLD
    echo getDomain('https://shop.example.com'); // example.com
    echo getDomain('https://foo.bar.example.com'); // example.com
    echo getDomain('https://www.example.com'); // example.com
    echo getDomain('https://example.com'); // example.com

// SLD
    echo getDomain('https://more.news.bbc.co.uk'); // bbc.co.uk
    echo getDomain('https://www.bbc.co.uk'); // bbc.co.uk
    echo getDomain('https://bbc.co.uk'); // bbc.co.uk

// IP
    echo getDomain('https://1.2.3.45');  // 1.2.3.45

Finally, Jeremy Kendall's PHP Domain Parser allows you to parse the domain name from a URL. League URI Hostname Parser will also do the job.

Answer 9

Answered by Michael

$domain = parse_url($url, PHP_URL_HOST);
echo implode('.', array_slice(explode('.', $domain), -2, 2));

Answer 10

Answered by Oleg Matei

You can pass PHP_URL_HOST to the parse_url function as the second parameter:

$url = 'http://google.com/dhasjkdas/sadsdds/sdda/sdads.html';
$host = parse_url($url, PHP_URL_HOST);
print $host; // prints 'google.com'
