
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/41925/

Time: 2020-09-08 06:54:09  Source: igfitidea

Is there a standard for storing normalized phone numbers in a database?

database

Asked by Eric Z Beard

What is a good data structure for storing phone numbers in database fields? I'm looking for something that is flexible enough to handle international numbers, and also something that allows the various parts of the number to be queried efficiently.

Edit: Just to clarify the use case here: I currently store numbers in a single varchar field, and I leave them just as the customer entered them. Then, when the number is needed by code, I normalize it. The problem is that if I want to query a few million rows to find matching phone numbers, it involves a function, like:

where dbo.f_normalizenum(num1) = dbo.f_normalizenum(num2)

which is terribly inefficient. Also, queries that look for things like the area code become extremely tricky when it's just a single varchar field.

[Edit]

People have made lots of good suggestions here, thanks! As an update, here is what I'm doing now: I still store numbers exactly as they were entered, in a varchar field, but instead of normalizing things at query time, I have a trigger that does all that work as records are inserted or updated. So I have ints or bigints for any parts that I need to query, and those fields are indexed to make queries run faster.
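
The "normalize once, on write" idea in this update can be sketched in JavaScript as below. This is only an illustration, not the actual trigger: normalizeDigits, buildPhoneRecord, savePhone and phoneIndex are hypothetical names, and a Map stands in for the indexed column.

```javascript
// Hypothetical helper: reduce a phone number, as typed, to its digits only.
// It stands in for dbo.f_normalizenum, whose real rules are not shown in the question.
function normalizeDigits(raw) {
    return raw.replace(/\D/g, '');
}

// Build the record at insert/update time: keep the raw text, add the normalized key.
function buildPhoneRecord(raw) {
    return { raw: raw, digits: normalizeDigits(raw) };
}

// A Map keyed on the normalized digits plays the role of the indexed column.
const phoneIndex = new Map();

function savePhone(raw) {
    const record = buildPhoneRecord(raw);
    phoneIndex.set(record.digits, record);
    return record;
}

savePhone('(212) 555-1234');

// A lookup normalizes only the search term, never the stored rows.
const match = phoneIndex.get(normalizeDigits('212.555.1234'));
console.log(match.raw);
```

The point of the design is that the expensive normalization runs once per write instead of once per row per query.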

Accepted answer by Adam Davis

First, beyond the country code, there is no real standard. About the best you can do is recognize, by the country code, which nation a particular phone number belongs to and deal with the rest of the number according to that nation's format.

Generally, however, phone equipment and such is standardized, so you can almost always break a given phone number into the following components:

  • C: Country code, 1-10 digits (right now 4 or fewer, but that may change)
  • A: Area code (province/state/region), 0-10 digits (you may actually want separate region and area fields rather than one area code)
  • E: Exchange (prefix, or switch) code, 0-10 digits
  • L: Line number, 1-10 digits

With this method you can potentially separate numbers such that you can find, for instance, people that might be close to each other because they have the same country, area, and exchange codes. With cell phones, though, that is no longer something you can count on.

Further, inside each country there are differing standards. You can always depend on a (AAA) EEE-LLLL in the US, but in another country you may have exchanges in the cities (AAA) EE-LLL, and simply line numbers in the rural areas (AAA) LLLL. You will have to start at the top in a tree of some form, and format them as you have information. For example, country code 0 has a known format for the rest of the number, but for country code 5432 you might need to examine the area code before you understand the rest of the number.
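
As a rough sketch of the US case only, the (AAA) EEE-LLLL layout can be split into the C/A/E/L components like this. splitUsNumber is a hypothetical name, and the function assumes a 10-digit US number with an optional leading 1; every other country would need its own branch of the tree described above.

```javascript
// Split a US number into country, area, exchange and line parts.
// Assumes the (AAA) EEE-LLLL layout; other countries need their own rules.
function splitUsNumber(raw) {
    let digits = raw.replace(/\D/g, '');

    // Drop a leading country code 1 if the caller included it.
    if (digits.length === 11 && digits[0] === '1') {
        digits = digits.slice(1);
    }
    if (digits.length !== 10) {
        return null; // not in the expected US layout
    }
    return {
        country: '1',
        area: digits.slice(0, 3),
        exchange: digits.slice(3, 6),
        line: digits.slice(6)
    };
}

console.log(splitUsNumber('(212) 555-1234'));
```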

You may also want to handle vanity numbers such as (800) Lucky-Guy, which requires recognizing that, if it's a US number, there's one too many digits (and you may need the full representation for advertising or other purposes) and that in the US the letters map to the numbers differently than in Germany.
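
For the US keypad only, the letter-to-digit step might look like the sketch below. KEYPAD_GROUPS and vanityToDigits are hypothetical names; the letter groups are the usual US keypad layout, and no other country's mapping is attempted here.

```javascript
// US keypad letter groups; the position in the array plus 2 gives the digit.
const KEYPAD_GROUPS = ['ABC', 'DEF', 'GHI', 'JKL', 'MNO', 'PQRS', 'TUV', 'WXYZ'];

function vanityToDigits(raw) {
    let out = '';
    for (const ch of raw.toUpperCase()) {
        if (ch >= '0' && ch <= '9') {
            out += ch; // digits pass through unchanged
            continue;
        }
        const group = KEYPAD_GROUPS.findIndex(function (letters) {
            return letters.includes(ch);
        });
        if (group !== -1) {
            out += String(group + 2);
        }
        // Anything else (spaces, dashes, parentheses) is dropped.
    }
    return out;
}

const vanityDigits = vanityToDigits('(800) Lucky-Guy');
console.log(vanityDigits, vanityDigits.length);
```

The result has 11 digits, one more than a US number needs, which is the extra digit mentioned above. Keeping the original text alongside the digits preserves the advertised spelling.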

You may also want to store the entire number separately as a text field (with internationalization) so you can go back later and re-parse numbers as things change, or as a backup in case someone submits a bad method to parse a particular country's format and loses information.

Answer by Bjorn Reppen

KISS - I'm getting tired of many of the US web sites. They have some cleverly written code to validate postal codes and phone numbers. When I type my perfectly valid Norwegian contact info I find that quite often it gets rejected.

Leave it as a string, unless you have some specific need for something more advanced.

Answer by Rich

The Wikipedia page on E.164 should tell you everything you need to know.

Answer by unintentionally left blank

Here's my proposed structure; I'd appreciate feedback:

The phone database field should be a varchar(42) with the following format:

CountryCode - Number x Extension

So, for example, in the US, we could have:

1-2125551234x1234

This would represent a US number (country code 1) with area-code/number (212) 555 1234 and extension 1234.
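
To make the format concrete, here is a minimal sketch of reading and writing such a field. parseStoredPhone and formatStoredPhone are hypothetical names, and the regular expression assumes exactly the layout described: digits, a dash, digits, then an optional "x" and extension.

```javascript
// Parse "CountryCode-NumberxExtension" into its parts; the extension is optional.
function parseStoredPhone(stored) {
    const m = /^(\d{1,10})-(\d+)(?:x(\d+))?$/.exec(stored);
    if (m === null) {
        return null; // not in the proposed format
    }
    return {
        countryCode: m[1],
        number: m[2],
        extension: m[3] === undefined ? null : m[3]
    };
}

// Build the stored form back from its parts.
function formatStoredPhone(countryCode, number, extension) {
    return countryCode + '-' + number + (extension ? 'x' + extension : '');
}

console.log(parseStoredPhone('1-2125551234x1234'));
console.log(formatStoredPhone('1', '2125551234', null));
```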

Separating out the country code with a dash makes the country code clear to someone who is perusing the data. This is not strictly necessary because country codes are "prefix codes" (you can read them left to right and you will always be able to unambiguously determine the country). But, since country codes have varying lengths (between 1 and 4 characters at the moment) you can't easily tell at a glance the country code unless you use some sort of separator.

I use an "x" to separate the extension because otherwise it really wouldn't be possible (in many cases) to figure out which was the number and which was the extension.

In this way you can store the entire number, including country code and extension, in a single database field, that you can then use to speed up your queries, instead of joining on a user-defined function as you have been painfully doing so far.

Why did I pick a varchar(42)? Well, first off, international phone numbers will be of varied lengths, hence the "var". I am storing a dash and an "x", so that explains the "char", and anyway, you won't be doing integer arithmetic on the phone numbers (I guess), so it makes little sense to try to use a numeric type. As for the length of 42, I used the maximum possible length of all the fields added up, based on Adam Davis' answer (4 fields of up to 10 digits each), and added 2 for the dash and the "x".

Answer by jcoby

Look up E.164. Basically, you store the phone number as a code starting with the country prefix and an optional pbx suffix. Display is then a localization issue. Validation can also be done, but it's also a localization issue (based on the country prefix).

For example, +12125551212+202 would be formatted in the en_US locale as (212) 555-1212 x202. It would have a different format in en_GB or de_DE.
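
A minimal sketch of that display step for en_US only is shown below. formatForEnUs is a hypothetical name, and the pattern assumes the stored form used in the example: "+1", ten digits, then an optional "+" and extension. Other locales would need their own formatter.

```javascript
// Format a stored "+1NNNNNNNNNN+EXT" value the way en_US users expect to read it.
function formatForEnUs(stored) {
    const m = /^\+1(\d{3})(\d{3})(\d{4})(?:\+(\d+))?$/.exec(stored);
    if (m === null) {
        return stored; // not a +1 number in the expected form: show it unchanged
    }
    const base = '(' + m[1] + ') ' + m[2] + '-' + m[3];
    return m[4] ? base + ' x' + m[4] : base;
}

console.log(formatForEnUs('+12125551212+202'));
```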

There is quite a bit of info out there about ITU-T E.164, but it's pretty cryptic.

Answer by Mike Fielden

I personally like the idea of storing a normalized varchar phone number (e.g. 9991234567) and then, of course, formatting that phone number inline as you display it.

This way all the data in your database is "clean" and free of formatting.

Answer by Thomas Owens

Perhaps storing the phone number sections in different columns, allowing for blank or null entries?

Answer by cmcculloh

Ok, so based on the info on this page, here is a start on an international phone number validator:

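function validatePhone(phoneNumber) {
    // Strip common formatting: parentheses, dots, dashes, spaces, "+" and the "x" extension marker.
    var stripped = phoneNumber.replace(/[().\- +x]/g, '');

    if (stripped === '') {
        return false; // empty, or nothing but punctuation
    }
    if (!/^\d+$/.test(stripped)) {
        return false; // parseInt() would accept "123abc", so require digits only
    }
    // Upper bound: four components of up to 10 digits each.
    return stripped.length <= 40;
}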

Loosely based on a script from this page: http://www.webcheatsheet.com/javascript/form_validation.php

Answer by Brian West

The standard for formatting numbers is E.164. You should always store numbers in this format. You should never allow the extension number in the same field as the phone number; those should be stored separately. As for numeric vs. alphanumeric, it depends on what you're going to be doing with that data.

Answer by Alex Klaus

Storage

Store phones in RFC 3966 format (like +1-202-555-0252 or +1-202-555-7166;ext=22). The main differences from E.164 are:

  • No limit on the length
  • Support of extensions

To optimise performance of view operations, store the phone in the National/International format next to the RFC 3966 field.
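
As a rough sketch, a stored value in the style shown above can be split into the number and its extension like this. parseTelString is a hypothetical name; it only understands the ";ext=" parameter from the examples and does not implement the full RFC 3966 grammar.

```javascript
// Split a value such as "+1-202-555-7166;ext=22" into number, digits and extension.
function parseTelString(value) {
    const parts = value.split(';');
    const number = parts[0].trim();
    let extension = null;

    // Look through the parameters after the number for an extension.
    for (const param of parts.slice(1)) {
        if (param.startsWith('ext=')) {
            extension = param.slice(4);
        }
    }
    return {
        number: number,                           // as stored, with visual separators
        digits: '+' + number.replace(/\D/g, ''),  // compact form, handy for matching
        extension: extension
    };
}

console.log(parseTelString('+1-202-555-7166;ext=22'));
```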

Don't store the country code in a separate field unless you have a serious reason for that. Why? Because you shouldn't ask for the country code on the UI.

Mostly, people enter phone numbers as they hear them. E.g. if the local format starts with 0 or 8, it'd be annoying for the user to transform the number in their head (like, "OK, don't type '0', choose the country and type the rest of what the person said in this field").

Parsing

Google has your back: you can validate and parse any phone number using their libphonenumber library. There are ports to almost any language.

So let the user just enter "0449053501" or "04 4905 3501" or "(04) 4905 3501". The tool will figure out the rest for you.
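
To give a feel for the kind of work such a tool does, here is a hand-rolled sketch. It is not libphonenumber's API. COUNTRY_RULES and toInternational are hypothetical names, the example numbers are assumed to be Australian (calling code 61, trunk prefix 0), and the default region is assumed to come from application settings rather than from the user.

```javascript
// Hypothetical per-country rules: calling code and national trunk prefix.
// Only Australia is filled in, on the assumption that the example numbers are Australian.
const COUNTRY_RULES = {
    AU: { callingCode: '61', trunkPrefix: '0' }
};

// Turn a nationally formatted number into international form for a default region.
function toInternational(raw, countryKey) {
    const rules = COUNTRY_RULES[countryKey];
    if (!rules) {
        return null; // no rules known for this region
    }
    let digits = raw.replace(/\D/g, '');

    // The trunk prefix is dialed domestically but dropped in the international form.
    if (digits.startsWith(rules.trunkPrefix)) {
        digits = digits.slice(rules.trunkPrefix.length);
    }
    return '+' + rules.callingCode + digits;
}

console.log(toInternational('(04) 4905 3501', 'AU'));
```

All three spellings from the answer produce the same stored value, which is what makes later matching cheap. A real library also validates lengths and number ranges, which this sketch does not.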

See the official demo to get a feel for how much it helps.
