Original source: http://stackoverflow.com/questions/310540/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Best practices for storing postal addresses in a database (RDBMS)?
Asked by John
Are there any good references for best practices for storing postal addresses in an RDBMS? It seems there are lots of tradeoffs that can be made, each with pros and cons to be evaluated -- surely this has been done time and time again? Maybe someone has at least written down some lessons learned somewhere?
Examples of the tradeoffs I am talking about are storing the zipcode as an integer vs a char field, should house number be stored as a separate field or part of address line 1, should suite/apartment/etc numbers be normalized or just stored as a chunk of text in address line 2, how do you handle zip +4 (separate fields or one big field, integer vs text)? etc.
I'm primarily concerned with U.S. addresses at this point, but I imagine there are also some best practices for preparing yourself for the eventuality of going global (e.g. naming fields appropriately, like region instead of state or postal code instead of zip code, etc.).
Answered by Samm Cooper
For more international use, one schema to consider is the one used by Drupal Address Field. It's based on the xNAL standard, and seems to cover most international cases. A bit of digging into that module will reveal some nice pearls for interpreting and validating addresses internationally. It also has a nice set of administrative areas (province, state, oblast, etc.) with ISO codes.
Here's the gist of the schema, copied from the module page:
country => Country (always required, 2 character ISO code)
name_line => Full name (default name entry)
first_name => First name
last_name => Last name
organisation_name => Company
administrative_area => State / Province / Region (ISO code when available)
sub_administrative_area => County / District (unused)
locality => City / Town
dependent_locality => Dependent locality (unused)
postal_code => Postal code / ZIP Code
thoroughfare => Street address
premise => Apartment, Suite, Box number, etc.
sub_premise => Sub premise (unused)
A few lessons I've learned:
- Don't store anything numerically.
- Store country and administrative area as ISO codes where possible.
- When you don't know, be lax about requiring fields. Some countries may not use fields you take for granted, even basic things like locality and thoroughfare.
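The lessons above can be sketched as a small record type. This is only an illustration, not code from the Drupal module: the class name PostalAddress and the shape check are assumptions, the field names are a subset of the schema listed above, and the sample addresses are made up. Everything is text, and only country is required.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PostalAddress:
    """Hypothetical record using field names from the schema above.

    Every field is text; only `country` is required.
    """
    country: str                               # 2-character ISO code
    administrative_area: Optional[str] = None  # state / province / region
    locality: Optional[str] = None             # city / town
    postal_code: Optional[str] = None          # kept as text, never an int
    thoroughfare: Optional[str] = None         # street address
    premise: Optional[str] = None              # apartment, suite, box number
    name_line: Optional[str] = None
    organisation_name: Optional[str] = None

    def __post_init__(self) -> None:
        # Shape check only: two ASCII letters. It does not confirm that the
        # code is actually assigned to a country.
        code = self.country.strip().upper()
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"country must be a 2-letter code, got {self.country!r}")
        self.country = code


# Made-up sample data.
us = PostalAddress(country="us", administrative_area="MA",
                   locality="Boston", postal_code="02134",
                   thoroughfare="1 Example Street")
# A record with no locality or thoroughfare is still accepted.
minimal = PostalAddress(country="NO", postal_code="0150")
```

Keeping the optional fields nullable means a country that has no locality or thoroughfare does not force users to invent one.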
Answered by Andrew Ferrier
As an 'international' user, there is nothing more frustrating than dealing with a website that is oriented around only US-format addresses. It's a little rude at first, but becomes a serious problem when the validation is also over-zealous.
If you are concerned with going global, the only advice I have is to keep things free-form. Different countries have different conventions - in some, the house number comes before the street name, in some it comes after. Some have states, some regions, some counties, some combinations of those. Here in the UK, the zipcode is not a zipcode, it's a postcode containing both letters and numbers.
I'd advise simply ~10 lines of variable-length strings, together with a separate field for a postcode (and be careful how you describe that to cope with national sensibilities). Let the user/customer decide how to write their addresses.
Answered by splattne
If you need comprehensive information about how other countries use postal addresses, here's a very good reference link (Columbia University):
Frank's Compulsive Guide to Postal Addresses
Effective Addressing for International Mail
Answered by Paul Fisher
You should definitely consider storing house number as a character field rather than a number, because of special cases such as "half-numbers", or my current address, which is something like "129A" — but the A is not considered as an apartment number for delivery services.
Answered by Paul Fisher
I've done this (rigorously model address structures in a database), and I would never do it again. You can't imagine how crazy the exceptions are that you'll have to take into account as a rule.
I vaguely recall some issue with Norwegian postal codes (I think), which were all 4 positions long, except for Oslo, which had 18 or so.
I'm positively sure that from the moment we started using the geographically correct ZIP codes for all of our own national addresses, quite a few people started complaining that their mail arrived too late. Turned out those people were living near a borderline between postal areas, and despite the fact that someone really lived in postal area, say, 1600, in reality his mail should be addressed to postal area 1610, because in reality it was that neighbouring postal area that actually served him, so sending his mail to his correct postal area would take that mail a couple of days longer to arrive, because of the unwanted intervention that was required in the correct postal office to forward it to the incorrect postal area ...
(We ended up registering those people with an address abroad in the country with ISO-code 'ZZ'.)
Answered by Jonathan Leffler
You should certainly consult "Is this a good way to model address information in a relational database", but your question is not a direct duplicate of that.
There are surely a lot of pre-existing answers (check out the example data models at DatabaseAnswers, for example). Many of the pre-existing answers are defective under some circumstances (not picking on DB Answers at all).
One major issue to consider is the scope of the addresses. If your database must deal with international addresses, you have to be more flexible than if you only have to deal with addresses in one country.
In my view, it is often (which does not mean always) sensible to both record the 'address label image' of the address and separately analyze the content. This allows you to deal with differences between countries, for example in where the postal code is placed. Sure, you can write an analyzer and a formatter that handle the eccentricities of different countries (for instance, US addresses have 2 or 3 lines; by contrast, British addresses can have considerably more; one address I write to periodically has 9 lines). But it can be easier to have the humans do the analysis and formatting and let the DBMS just store the data.
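One way to picture "label image plus separately analyzed content" is the sketch below. The class name StoredAddress and its fields are hypothetical, and the sample address is made up; the point is only that the label lines are kept exactly as written while a few parsed values sit beside them for searching.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredAddress:
    """Hypothetical record: verbatim label lines plus optional parsed parts."""
    label_lines: List[str]        # the address exactly as a human formatted it
    country: str                  # stored separately so it can be queried
    postal_code: str = ""         # stored separately, as text
    parts: Dict[str, str] = field(default_factory=dict)  # any other analysis

    def label(self) -> str:
        # Printing needs no per-country formatter: emit the lines as stored.
        return "\n".join(self.label_lines)


# Made-up sample data.
addr = StoredAddress(
    label_lines=["Flat 2", "Example House", "10 Example Lane",
                 "Exampleton", "EX1 2AB", "UNITED KINGDOM"],
    country="GB",
    postal_code="EX1 2AB",
)
print(addr.label())
```

The number of label lines is not fixed, so a 2-line and a 9-line address fit the same record.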
Answered by seanb
Unless you are going to do maths on the street numbers or zip / postal codes, you are just inviting future pain by storing them as numerics.
You might save a few bytes here and there, and maybe get a faster index, but what do you do when the US postal service, or that of whatever other country you are dealing with, decides to introduce letters into the codes?
The cost of disk space is going to be a lot cheaper than the cost of fixing it later on... y2k anybody?
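A quick way to see the pain is to round-trip a few values through an integer. The helper name survives_integer_storage and the sample values are illustrative assumptions, not taken from any real dataset.

```python
def survives_integer_storage(value: str) -> bool:
    """Return True if `value` comes back unchanged after an int round trip."""
    try:
        return str(int(value)) == value
    except ValueError:
        # Letters, spaces, or hyphens cannot be converted at all.
        return False


# Illustrative samples: two 5-digit ZIP codes, a ZIP+4, an alphanumeric
# postal code, and a house number with a letter suffix.
samples = ["90210", "02134", "12345-6789", "K1A 0B1", "129A"]
results = {s: survives_integer_storage(s) for s in samples}

for value, ok in results.items():
    print(f"{value!r}: {'ok' if ok else 'damaged or rejected'}")
```

Only the first sample survives: the second loses its leading zero, and the rest cannot be converted to an integer at all.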
Answered by Gaz_Edge
I've found that listing all possible fields, from the smallest discrete unit to the largest, is the easiest way. Users will fill in the fields they see fit. My address table looks like this:
*********************************
Field Type
*********************************
address_id (PK) int
unit string
building string
street string
city string
region string
country string
address_code string
*********************************
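As a rough sketch, the table above could be created like this. SQLite (through Python's sqlite3 module) and the TEXT column type are assumptions, since the answer only says "string", and the inserted row is made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE address (
        address_id   INTEGER PRIMARY KEY,
        unit         TEXT,
        building     TEXT,
        street       TEXT,
        city         TEXT,
        region       TEXT,
        country      TEXT,
        address_code TEXT
    )
""")

# Users fill in only the fields that apply; the rest stay NULL.
conn.execute(
    "INSERT INTO address (street, city, country, address_code) "
    "VALUES (?, ?, ?, ?)",
    ("129A Example Road", "Exampleton", "GB", "EX1 2AB"),
)

row = conn.execute(
    "SELECT unit, street, address_code FROM address"
).fetchone()
print(row)
```

Every column except the key is nullable, which matches the idea that no single field can be assumed to exist in every country.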
Answered by Ken Gentle
Adding to what @Jonathan Leffler and @Paul Fisher have said:
If you ever anticipate having postal addresses for Canada or Mexico added to your requirements, storing postal-code as a string is a must. Canada has alpha-numeric postal codes and I don't remember what Mexico's look like off the top of my head.
Answered by Shanmu
This might be overkill, but if you need a solution that would work with multiple countries and you need to programmatically process parts of the address:
这可能有点矫枉过正,但如果您需要一个适用于多个国家/地区的解决方案,并且您需要以编程方式处理部分地址:
you could have country specific address handling using two tables: One generic table with 10 VARCHAR2 columns, 10 Number columns, another table which maps these fields to prompts and has a country column tying an address structure to a country.
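A cut-down sketch of that two-table idea follows. It assumes SQLite through Python's sqlite3 module rather than a database with VARCHAR2, uses three text columns and two numeric columns instead of ten of each, and every table name, column name, and prompt here is hypothetical.

```python
import sqlite3
from typing import Dict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Generic storage: the columns carry no meaning of their own.
    CREATE TABLE generic_address (
        address_id INTEGER PRIMARY KEY,
        country    TEXT NOT NULL,
        text1 TEXT, text2 TEXT, text3 TEXT,
        num1 NUMERIC, num2 NUMERIC
    );
    -- Per-country mapping from generic column to the prompt shown to users.
    CREATE TABLE address_field_map (
        country     TEXT NOT NULL,
        column_name TEXT NOT NULL,
        prompt      TEXT NOT NULL,
        PRIMARY KEY (country, column_name)
    );
""")
conn.executemany(
    "INSERT INTO address_field_map VALUES (?, ?, ?)",
    [("US", "text1", "Street address"), ("US", "text2", "City"),
     ("US", "text3", "ZIP code"),
     ("GB", "text1", "Street address"), ("GB", "text2", "Post town"),
     ("GB", "text3", "Postcode")],
)


def prompts_for(country: str) -> Dict[str, str]:
    """Return {generic column name: prompt} for one country."""
    rows = conn.execute(
        "SELECT column_name, prompt FROM address_field_map "
        "WHERE country = ? ORDER BY column_name",
        (country,),
    )
    return dict(rows)


print(prompts_for("GB"))
```

Adding a country then means adding rows to the mapping table, not changing the schema.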