PHP: How long can a TLD possibly be?

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same license, cite the original address and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/9238640/

Date: 2020-08-26 06:24:45  Source: igfitidea

How long can a TLD possibly be?

Tags: php, regex, email-validation, tld

Asked by HellaMad

I'm working on an email validation regex in PHP and I need to know how long the TLD could possibly be and still be valid. I did a few searches but couldn't find much information on the topic. So how long can a TLD possibly be?

Accepted answer by tripleee

DNS allows for a maximum of 63 characters for an individual label.
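
As a rough illustration of that limit, here is a minimal JavaScript sketch that checks whether a string could be a TLD label on length and character grounds alone. The function name `isPlausibleTldLabel` is made up for this example, and the letters-digits-hyphens rule is the conventional host name rule rather than something stated in the answer. Passing this check does not mean the TLD actually exists.

```javascript
// Hypothetical helper: a syntactic check only, not a lookup against the real TLD list.
const MAX_LABEL_LENGTH = 63; // DNS limit for a single label

function isPlausibleTldLabel(label) {
  if (typeof label !== "string") return false;
  if (label.length < 1 || label.length > MAX_LABEL_LENGTH) return false;
  // Assumption: letters, digits and hyphens only, with no leading or trailing hyphen.
  return /^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/i.test(label);
}

console.log(isPlausibleTldLabel("com"));          // true
console.log(isPlausibleTldLabel("a".repeat(63))); // true
console.log(isPlausibleTldLabel("a".repeat(64))); // false
```

In a regex, the same upper bound would appear as a quantifier such as `{1,63}` on the final label.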

Answer by Dan Dascalescu

The longest TLD currently in existence is 24 characters long, and subject to change. The maximum TLD length specified by RFC 1034 is 63 octets.

To get the length of the longest existing TLD:

wget -qO - http://data.iana.org/TLD/tlds-alpha-by-domain.txt | tail -n+2 | wc -L

Here's what that command does:

  1. Get the latest list of actual existing TLDs from IANA
  2. Strip the first line, which is a long-ish comment
  3. Launch wc to report the length of the longest line

Alternative using curl, thanks to Stefan:

curl -s http://data.iana.org/TLD/tlds-alpha-by-domain.txt | tail -n+2 | wc -L

Answer by aviad

-EDIT-

According to RFC 2606, .localhost is a reserved domain name and its length is 9 characters. That is the longest I am aware of.

-END OF EDIT-

However, I think that you should care about email address length and not only TLD length. Below is a quote from this article. The email address length is 254 characters:

There appears to be some confusion over the maximum valid email address size. Most people believe it to be 320 characters (64 characters for the username + 255 characters for the domain + 1 character for the @ symbol). Other sources suggest 129 (64 + 1 + 64) or 384 (128+1+255, assuming the username doubles in length in the future).

This confusion means you should heed the 'robustness principle' ("developers should carefully write software that adheres closely to extant RFCs but accept and parse input from peers that might not be consistent with those RFCs." - Wikipedia) when writing software that deals with email addresses. Furthermore, some software may be crippled by naive assumptions, e.g. thinking that 50 characters is adequate (examples). Your 200 character email address may be technically valid but that will not help you if most websites or applications reject it.

The actual maximum email length is currently 254 characters:

"The original version of RFC 3696 did indeed say 320 was the maximum length, but John Klensin (ICANN) subsequently accepted this was wrong."

"This arises from the simple arithmetic of maximum length of a domain (255 characters) + maximum length of a mailbox (64 characters) + the @ symbol = 320 characters. Wrong. This canard is actually documented in the original version of RFC 3696. It was corrected in the errata. There's actually a restriction from RFC 5321 on the path element of an SMTP transaction of 256 characters. But this includes angled brackets around the email address, so the maximum length of an email address is 254 characters."
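
The arithmetic in the quote can be turned into a simple pre-check that runs before any regex. The JavaScript sketch below is only an illustration: the name `hasPlausibleLength` is invented here, it assumes plain ASCII addresses (so that string length equals octet count), and it checks lengths only, not the address grammar.

```javascript
// Hypothetical helper: checks lengths only, not the full address syntax.
const MAX_ADDRESS_LENGTH = 254;   // 256-character path limit minus the two angle brackets
const MAX_LOCAL_PART_LENGTH = 64; // the part before the "@"

function hasPlausibleLength(address) {
  const at = address.lastIndexOf("@");
  // There must be some text on both sides of the "@".
  if (at < 1 || at === address.length - 1) return false;
  // "at" is the index of "@", which equals the length of the local part.
  return address.length <= MAX_ADDRESS_LENGTH && at <= MAX_LOCAL_PART_LENGTH;
}

console.log(hasPlausibleLength("user@example.com"));                // true
console.log(hasPlausibleLength("a".repeat(65) + "@example.com"));   // false
```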

Answer by axiomer

The longest TLD using only Latin letters is .MUSEUM (source), but there are also internationalized TLDs with non-Latin characters. The longest of them is XN--CLCHC0EA0B2G2A9GCD. Also, it will soon be possible to reserve your own TLD for a high price, so even longer ones may appear.

Answer by Chathura Edirisinghe

Since I'm a .NET developer, the following is a JavaScript way of determining the longest TLD currently available. It reports the length of the longest TLD, which you can then use in your regex.

Please try the following code snippet:

function getTLD() {
    var length = 0;
    var longest;
    var request = new XMLHttpRequest();

    request.open('GET', 'http://data.iana.org/TLD/tlds-alpha-by-domain.txt', true);
    // Register the handler before sending so that no state change is missed
    request.onreadystatechange = function () {
        if (request.readyState === 4 && request.status === 200) {
            var type = request.getResponseHeader('Content-Type');
            // indexOf returns -1 (not 1) when "text" is absent
            if (type && type.indexOf("text") !== -1) {
                var tldArr = request.responseText.split('\n');
                tldArr.splice(0, 1); // drop the leading comment line

                for (var i = 0; i < tldArr.length; i++) {
                    if (tldArr[i].length > length) {
                        length = tldArr[i].length;
                        longest = tldArr[i];
                    }
                }

                // The request is asynchronous, so the result is only available
                // inside this callback; a return here would not reach getTLD's caller
                console.log("Longest >> " + longest + " >> " + length);
            }
        }
    };
    request.send(null);
}
<button onclick="getTLD()">Get TLD</button>
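
Because the request above is asynchronous, it may be easier to keep the parsing in a separate function that takes the list text and returns the longest entry. The following is only a sketch: `longestTld` is a made-up name, the sample text is invented to mimic the file format, and it assumes the comment header line begins with `#`.

```javascript
// Hypothetical helper: takes the text of the TLD list and returns the longest entry.
function longestTld(listText) {
  let longest = "";
  for (const rawLine of listText.split("\n")) {
    const line = rawLine.trim();
    // Skip blank lines and the comment header (assumed to start with "#").
    if (line === "" || line.startsWith("#")) continue;
    if (line.length > longest.length) longest = line;
  }
  return longest;
}

// Made-up sample in the same one-TLD-per-line format.
const sample = "# sample data, not the real list\nCOM\nMUSEUM\nXN--CLCHC0EA0B2G2A9GCD\nORG\n";
const result = longestTld(sample);
console.log("Longest >> " + result + " >> " + result.length);
// prints "Longest >> XN--CLCHC0EA0B2G2A9GCD >> 22"
```

The response text from the request can be passed to this function inside the callback.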

Answer by Jan Kyu Peblik

A TLD can be any length at all. New TLDs happen all the time. In the future there will be more TLDs not regulated by the entity currently regulating the majority of TLDs. We also won't use email in the future as we presently do. That said:

You don't need to validate an email address ever. If you want to slow people down and have an idea as to whether they're actually human, include a CAPTCHA. If you need to confirm working email, send an email with a validation link they can open. If you aren't throttling submissions of things that can generate things like emails being sent for verification, it won't matter whether you're confirming the address is technically valid anyway, it will be abused at that point regardless.

Answer by Meisner

This is PHP code to get an up-to-date, vertical-bar-separated list of UTF-8 TLDs that can be used directly in a regular expression:

<?php 
  function getTLDs($separator){
    $tlds=file('http://data.iana.org/TLD/tlds-alpha-by-domain.txt');
    array_shift($tlds); // remove heading comment
    usort($tlds,function($a,$b){ return strlen($b)-strlen($a); }); // sort from longest to shortest
    return implode($separator,array_map(function($e){ return idn_to_utf8(trim(strtolower($e))); },$tlds));
  }
  echo getTLDs('|');
?>

To match a host name you could use it like this:

$tlds=getTLDs('|');
if (preg_match("{([\da-z\.-]+)\.($tlds)}u",$address)) {
  // ...
}
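
For comparison, the same idea can be sketched in JavaScript. This is an illustration under assumptions, not a port of the PHP above: `buildHostRegex` is a made-up name, the three-entry TLD list is invented for the demo, and the pattern is anchored with `^` and `$` so that the whole string has to be a host name, which the PHP pattern does not require.

```javascript
// Hypothetical helper: builds an anchored host-name pattern from a list of TLDs.
function buildHostRegex(tlds) {
  const alternatives = tlds
    .map((tld) => tld.trim().toLowerCase())
    .filter((tld) => tld !== "")
    .sort((a, b) => b.length - a.length) // longest first, as in the PHP above
    .map((tld) => tld.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")); // escape regex metacharacters
  return new RegExp("^([\\da-z.-]+)\\.(" + alternatives.join("|") + ")$", "i");
}

const hostRegex = buildHostRegex(["com", "museum", "org"]); // made-up short list
console.log(hostRegex.test("example.museum"));  // true
console.log(hostRegex.test("example.invalid")); // false
```

With the real list, the array would come from the downloaded file with the comment line removed.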