PHP: Get the domain name from a subdomain

Question

Asked by zuk1

I need to write a function to parse variables which contain domain names. It's best explained with an example. The variable could contain any of these:

here.example.com
example.com
example.org
here.example.org

But when passed through my function all of these must return either example.com or example.co.uk, the root domain name basically. I'm sure I've done this before but I've been searching Google for about 20 minutes and can't find anything. Any help would be appreciated.

EDIT: Ignore the .co.uk, presume that all domains going through this function have a 3 letter TLD.

Answer 1

Answered by Sampson

From the Stack Overflow question archive:

print get_domain("http://somedomain.co.uk"); // outputs 'somedomain.co.uk'

function get_domain($url)
{
  $pieces = parse_url($url);
  $domain = isset($pieces['host']) ? $pieces['host'] : '';
  if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
    return $regs['domain'];
  }
  return false;
}

Answer 2

Answered by suncat100

This is a fast, simple solution with no external calls and no checks against predefined arrays. Unlike the most popular answer, it also works for new domains like "www.domain.gallery".

function get_domain($host){
  $myhost = strtolower(trim($host));
  $count = substr_count($myhost, '.');
  if($count === 2){
    if(strlen(explode('.', $myhost)[1]) > 3) $myhost = explode('.', $myhost, 2)[1];
  } else if($count > 2){
    $myhost = get_domain(explode('.', $myhost, 2)[1]);
  }
  return $myhost;
}

domain.com -> domain.com
sub.domain.com -> domain.com
www.domain.com -> domain.com
www.sub.sub.domain.com -> domain.com
domain.co.uk -> domain.co.uk
sub.domain.co.uk -> domain.co.uk
www.domain.co.uk -> domain.co.uk
www.sub.sub.domain.co.uk -> domain.co.uk
domain.photography -> domain.photography
www.domain.photography -> domain.photography
www.sub.domain.photography -> domain.photography

Answer 3

Answered by Gumbo

I would do something like the following:

// hierarchical array of top level domains
$tlds = array(
    'com' => true,
    'uk' => array(
        'co' => true,
        // …
    ),
    // …
);
$domain = 'here.example.co.uk';
// split domain
$parts = explode('.', $domain);
$tmp = $tlds;
// traverse the tree in reverse order, from right to left
foreach (array_reverse($parts) as $key => $part) {
    if (isset($tmp[$part])) {
        $tmp = $tmp[$part];
    } else {
        break;
    }
}
// build the result
var_dump(implode('.', array_slice($parts, - $key - 1)));
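
The tree above is written by hand. As a rough sketch, it could also be built from a flat list of suffixes. The function name `build_tld_tree` and the input list are assumptions for illustration, and the leaves here are empty arrays rather than `true`; the `isset()` check in the traversal loop behaves the same with either.

```php
<?php
// Hypothetical helper: turn a flat list such as ['com', 'co.uk'] into a
// nested array keyed by label, read from right to left.
function build_tld_tree(array $suffixes): array
{
    $tree = [];
    foreach ($suffixes as $suffix) {
        $node = &$tree;
        foreach (array_reverse(explode('.', $suffix)) as $label) {
            if (!isset($node[$label]) || !is_array($node[$label])) {
                $node[$label] = [];
            }
            // Descend one level; $node now refers to the child entry.
            $node = &$node[$label];
        }
        unset($node); // drop the reference before the next suffix
    }
    return $tree;
}

$tlds = build_tld_tree(['com', 'co.uk']);
// $tlds is now ['com' => [], 'uk' => ['co' => []]]
```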

Answer 4

Answered by Nikola Petkanski

I ended up using the database Mozilla has.

Here's my code:

fetch_mozilla_tlds.php contains the caching algorithm. This line is the important one:

$mozillaTlds = file('http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1');

The main file used inside the application is this:

function isTopLevelDomain($domain)
{
    $domainParts = explode('.', $domain);
    if (count($domainParts) == 1) {
        return false;
    }

    $previousDomainParts = $domainParts;
    array_shift($previousDomainParts);

    $tld = implode('.', $previousDomainParts);

    return isDomainExtension($tld);
}

function isDomainExtension($domain)
{
    $tlds = getTLDs();

    /**
     * direct hit
     */
    if (in_array($domain, $tlds)) {
        return true;
    }

    if (in_array('!'. $domain, $tlds)) {
        return false;
    }

    $domainParts = explode('.', $domain);

    if (count($domainParts) == 1) {
        return false;
    }

    $previousDomainParts = $domainParts;

    array_shift($previousDomainParts);
    array_unshift($previousDomainParts, '*');

    $wildcardDomain = implode('.', $previousDomainParts);

    return in_array($wildcardDomain, $tlds);
}

function getTLDs()
{
    static $mozillaTlds = array();

    if (empty($mozillaTlds)) {
        require 'fetch_mozilla_tlds.php';
        /* @var $mozillaTlds array */
    }

    return $mozillaTlds;
}

UPDATE:
The database has evolved and is now available at its own website - http://publicsuffix.org/
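
The functions above only answer yes or no. A minimal sketch of going from a hostname to its registrable domain with the same kind of rules might look like the following. The name `registrable_domain` and the `$rules` parameter are assumptions for illustration: the caller passes in rules that were already loaded (plain, `*.` wildcard and `!` exception entries), and nothing is fetched here. It is a simplification of the published matching algorithm, not a full implementation, and it needs PHP 7.1+ for the nullable return type.

```php
<?php
// Hypothetical helper: $rules is a list of suffix rules such as
// 'com', 'co.uk', '*.ck', '!www.ck', supplied by the caller.
function registrable_domain(string $host, array $rules): ?string
{
    $labels  = explode('.', strtolower(trim($host, '.')));
    $ruleSet = array_flip($rules);
    $n       = count($labels);

    // Walk candidates from longest to shortest so the most specific rule wins.
    for ($i = 0; $i < $n; $i++) {
        $candidate = implode('.', array_slice($labels, $i));

        // Exception rule: the candidate itself is registrable.
        if (isset($ruleSet['!' . $candidate])) {
            return $candidate;
        }

        $wildcard = '*.' . implode('.', array_slice($labels, $i + 1));
        $isSuffix = isset($ruleSet[$candidate])
            || ($i + 1 < $n && isset($ruleSet[$wildcard]));

        if ($isSuffix) {
            // A public suffix needs one more label on its left.
            return $i > 0 ? implode('.', array_slice($labels, $i - 1)) : null;
        }
    }

    // No rule matched: fall back to treating the last label as the suffix.
    return $n >= 2 ? implode('.', array_slice($labels, -2)) : null;
}
```

With the rules `['com', 'uk', 'co.uk', '*.ck', '!www.ck']`, this returns `example.co.uk` for `here.example.co.uk`, `a.b.ck` for `a.b.ck`, `www.ck` for `www.ck`, and `null` for `co.uk`.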

Answer 5

Answered by Xaroth

Almost certainly, what you're looking for is this:

https://github.com/Synchro/regdom-php

It's a PHP library that uses the (as nearly complete as is practical) list of TLDs collected at publicsuffix.org/list/ and wraps it up in a spiffy little function.

Once the library is included, it's as easy as:

$registeredDomain = getRegisteredDomain( $domain );

Answer 6

Answered by TigerTiger

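$full_domain = $_SERVER['SERVER_NAME'];
// Keep only the second capture group (the last two or more labels); an empty
// replacement string would discard the whole host.
$just_domain = preg_replace("/^(.*\.)?([^.]*\..*)$/", '$2', $_SERVER['HTTP_HOST']);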

Answer 7

Answered by Francisco Luz

This is a short way of accomplishing that:

$host = $_SERVER['HTTP_HOST'];
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
echo "domain name is: {$matches[0]}\n";
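
To see what this pattern does with different inputs, here is a small sketch that wraps the same regex in a function. The name `last_two_labels` is made up for illustration. It fits the question's simplified case of single-label TLDs, but two-part suffixes such as .co.uk come out wrong.

```php
<?php
// Illustrative helper (not from the answer): return the last two
// dot-separated labels of a host, or null if there is no dot.
function last_two_labels(string $host): ?string
{
    return preg_match('/[^.\/]+\.[^.\/]+$/', $host, $matches) ? $matches[0] : null;
}

foreach (['here.example.com', 'example.org', 'here.example.co.uk'] as $host) {
    echo $host, ' -> ', last_two_labels($host), "\n";
}
// here.example.com -> example.com
// example.org -> example.org
// here.example.co.uk -> co.uk
```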

Answer 8

Answered by Ric

This isn't foolproof and should only really be used if you know the domain isn't going to be anything obscure, but it's easier to read than most of the other options:

$justDomain = $_SERVER['SERVER_NAME'];
switch(substr_count($justDomain, '.')) {
    case 1:
        // 2 parts. Must not be a subdomain. Do nothing.
        break;

    case 2:
        // 3 parts. Either a subdomain or a 2-part suffix
        // If the 2nd part is over 3 characters, assume it to be the main domain part, which means we have a subdomain.
        // This isn't foolproof, but should be ok for most domains.
        // Something like domainname.parliament.nz would cause problems, though. As would www.abc.com
        $parts = explode('.', $justDomain);
        if(strlen($parts[1]) > 3) {
            unset($parts[0]);
            $justDomain = implode('.', $parts);
        }
        break;

    default:
        // 4+ parts. Must be a subdomain.
        $parts = explode('.', $justDomain, 2);
        $justDomain = $parts[1];
        break;
}

// $justDomain should now exclude any subdomain part.

Answer 9

Answered by tim4dev

As a variant of Jonathan Sampson's answer:

function get_domain($url)   {   
    if ( !preg_match("/^http/", $url) )
        $url = 'http://' . $url;
    if ( $url[strlen($url)-1] != '/' )
        $url .= '/';
    $pieces = parse_url($url);
    $domain = isset($pieces['host']) ? $pieces['host'] : ''; 
    if ( preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs) ) { 
        $res = preg_replace('/^www\./', '', $regs['domain'] );
        return $res;
    }   
    return false;
}

Answer 10

Answered by Chad

This script generates a Perl file containing a single function, get_domain, built from the ETLD file. Say you have hostnames like img1, img2, img3, ... in .photobucket.com: for each of those, get_domain $host returns photobucket.com. Note that this isn't the fastest function on earth, so in my main log parser that uses it, I keep a hash of host-to-domain mappings and only run it for hosts that aren't in the hash yet.

#!/bin/bash

cat << 'EOT' > suffixes.pl
#!/bin/perl

sub get_domain {
  $_ = shift;
EOT

wget -O - http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1 \
  | iconv -c -f UTF-8 -t ASCII//TRANSLIT \
  | egrep -v '/|^$' \
  | sed -e 's/^\!//' -e "s/\"/'/g" \
  | awk '{ print length($0),$0 | "sort -rn"}' | cut -d" " -f2- \
  | while read SUFF; do
      STAR=`echo $SUFF | cut -b1`
      if [ "$STAR" = '*' ]; then
        SUFF=`echo $SUFF | cut -b3-`
        echo "  return \"$1\.$2\.$SUFF\" if /([a-zA-Z0-9\-]+)\.([a-zA-Z0-9\-]+)\.$SUFF$/;"
      else
        echo "  return \"$1\.$SUFF\" if /([a-zA-Z0-9\-]+)\.$SUFF$/;"
      fi
    done >> suffixes.pl

cat << 'EOT' >> suffixes.pl
}

1;
EOT
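
The host-to-domain cache mentioned above can be sketched in PHP as well, to match the rest of this thread. The wrapper name `make_cached_resolver` is hypothetical, and `$resolver` stands for whichever lookup function you use. The slow lookup runs once per distinct host and later calls are served from an array.

```php
<?php
// Hypothetical wrapper: $resolver is any callable mapping a hostname to its domain.
function make_cached_resolver(callable $resolver): callable
{
    $cache = [];
    return function (string $host) use (&$cache, $resolver) {
        // Only call the (slow) resolver for hosts we have not seen yet.
        if (!array_key_exists($host, $cache)) {
            $cache[$host] = $resolver($host);
        }
        return $cache[$host];
    };
}

// Usage with a stand-in resolver that keeps the last two labels.
$lookup = make_cached_resolver(function (string $host) {
    return implode('.', array_slice(explode('.', $host), -2));
});
echo $lookup('img1.photobucket.com'), "\n"; // photobucket.com
```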
