MySQL: Best Practice / Standard for storing an Address in a SQL Database
Source: http://stackoverflow.com/questions/3094126/ (Stack Overflow). Content is provided under the CC BY-SA 4.0 license; you are free to use and share it, but you must attribute it to the original authors.
Asked by Douglas
I am wondering if there is some sort of "standard" for storing US addresses in a database? It seems this is a common task, and there should be some sort of a standard.
What I am looking for is a specific schema of how the database tables should work and interact, already in third normal form, including data types (MySQL). A good UML document would work.
Maybe I'm just being lazy, but this is a very common task, and I am sure someone has published an efficient way to do this somewhere. I just don't know where to look and Google isn't helping. Please point me to the resource. Thanks.
EDIT
Although this is more of a general question, I would like to clarify my specific needs.
Addresses will be used to specify road addresses of locations of events. These addresses will need to be in a format that can be best broken down and searched, and also used by any third-party applications I may end up linking my data source to.
Also, data will be geocoded (longitude, latitude) on entry and stored separately, so it must fit the (as yet undecided) protocol of whatever geocoder, application, or library does that.
Accepted answer by joe snyder
http://www.upu.int has the format standards for international addresses. Publication 28 at http://usps.com has the U.S. format standards.
The USPS wants the following unpunctuated address components concatenated on a single line:
* house number
* predirectional (N, SE, etc)
* street
* suffix (AVE, BLVD, etc)
* postdirectional (SW, E, etc)
* unit (APT, STE, etc)
* apartment/suite number
Eg, 102 N MAIN ST SE APT B.
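The concatenation described above can be sketched in Python. This is only an illustration: the `DeliveryLine` class and its field names are assumptions made for this example, not part of Publication 28, and the punctuation stripping is deliberately simplistic compared with what CASS software does.

```python
from dataclasses import dataclass


@dataclass
class DeliveryLine:
    """Parsed pieces of a US street address (field names are illustrative)."""
    house_number: str
    street: str
    predirectional: str = ""
    suffix: str = ""
    postdirectional: str = ""
    unit: str = ""
    unit_number: str = ""

    def to_line(self) -> str:
        # Order follows the component list above; empty parts are skipped.
        parts = [
            self.house_number,
            self.predirectional,
            self.street,
            self.suffix,
            self.postdirectional,
            self.unit,
            self.unit_number,
        ]
        # Strip periods and commas, uppercase, and join with single spaces.
        cleaned = [p.replace(".", "").replace(",", "").strip().upper() for p in parts]
        return " ".join(p for p in cleaned if p)


example = DeliveryLine(
    house_number="102", predirectional="N", street="Main", suffix="St",
    postdirectional="SE", unit="Apt", unit_number="B",
)
print(example.to_line())  # 102 N MAIN ST SE APT B
```

Keeping the parsed pieces and deriving the single line from them is one way to get both searchable components and a printable address line, at the cost of having to parse correctly on input.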
If you keep the entire address line as a single field in your database, input and editing are easy, but searches can be more difficult (e.g., in SOUTH EAST LANE, is the street EAST, as in S EAST LN, or is it LANE, as in SE LANE ST?).
If you keep the address parsed into separate fields, searches for components like street name or apartments become easier, but you have to append everything together for output, you need CASS software to parse correctly, and PO boxes, rural route addresses, and APO/FPO addresses have special parsings.
A physical location with multiple addresses at that location is either a multiunit building, in which case letters/numbers after units like APT and STE designate the address, or it's a Commercial Mail Receiving Agency (eg, UPS store) and a maildrop/private mailbox number is appended (like 100 MAIN ST STE B PMB 102), or it's a business with one USPS delivery point and mail is routed after USPS delivery (which usually requires a separate mailstop field which the company might need but the USPS won't want on the address line).
A contact with more than one physical address is usually a business or person with a street address and a PO box. Note that it's common for each address to have a different ZIP code.
It's quite typical that one business transaction might have a shipping address and a billing address (again, with different ZIP codes). The information I keep for EACH address is:
* name prefix (DR, MS, etc)
* first name and initial
* last name
* name suffix (III, PHD, etc)
* mail stop
* company name
* address (one line only per Pub 28 for USA)
* city
* state/province
* ZIP/postal code
* country
I typically print mail stops somewhere between the person's name and company because the country contains the state/ZIP which contains the city which contains the address which contains the company which contains the mail stop which contains the person. I use CASS software to validate and standardize addresses when entered or edited.
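As a rough sketch of the field list and print order described above, the following Python example stores one address row and formats a label from most specific (person) to least specific (country). The table name, column names, and column sizes are assumptions, the sample row is made up, and SQLite (via the standard `sqlite3` module) stands in for MySQL only so the example runs on its own.

```python
import sqlite3

# Column names and sizes are illustrative assumptions, not a published standard.
SCHEMA = """
CREATE TABLE address (
    id             INTEGER PRIMARY KEY,
    name_prefix    VARCHAR(10),
    first_name     VARCHAR(50),
    last_name      VARCHAR(50),
    name_suffix    VARCHAR(10),
    mail_stop      VARCHAR(50),
    company_name   VARCHAR(100),
    address_line   VARCHAR(100),
    city           VARCHAR(50),
    state_province VARCHAR(50),
    postal_code    VARCHAR(20),
    country        VARCHAR(50)
)
"""


def format_label(row: sqlite3.Row) -> str:
    """Build label lines from most specific (person) to least specific (country)."""
    name = " ".join(
        row[k] for k in ("name_prefix", "first_name", "last_name", "name_suffix") if row[k]
    )
    last_line = " ".join(
        row[k] for k in ("city", "state_province", "postal_code") if row[k]
    )
    lines = [name, row["mail_stop"], row["company_name"],
             row["address_line"], last_line, row["country"]]
    # Skip any line whose fields were all empty or NULL.
    return "\n".join(line for line in lines if line)


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(SCHEMA)
# Made-up sample data for illustration only.
conn.execute(
    "INSERT INTO address (name_prefix, first_name, last_name, mail_stop, company_name,"
    " address_line, city, state_province, postal_code, country)"
    " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("DR", "JANE Q", "DOE", "MS 12", "EXAMPLE CORP",
     "100 MAIN ST STE B", "SPRINGFIELD", "IL", "62701", "USA"),
)
row = conn.execute("SELECT * FROM address").fetchone()
print(format_label(row))
```

The mail stop is printed between the person's name and the company, matching the containment order in the paragraph above. Postal codes are kept as text so that leading zeros and non-US formats survive.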
Answer by Everette Mills
First, as a person who spends most of their professional day working with addresses, I can say they are hard to manage from a data perspective.
If you ask 5 people what address they live at, you will find that you get 5 different answers. While you and I can tell that 123 Main Street Apt 1 and Apt 1 123 Main Street are the same address, the database program will have a challenge.
If you are using United States-centric addresses, CASS-certified software from almost any vendor will standardize your addresses reasonably well. I would recommend a simple format as follows:
- Address 1
- Address 2
- Address 3
- City
- State
- Zip
- Zip+4 (I would carry this so lookups are easier when checking for duplicates)
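One way the format above might look as a table, with the ZIP and ZIP+4 columns indexed for duplicate checks, is sketched below in Python. The table name, column names, sizes, the index, and the `find_possible_duplicates` helper are all assumptions for illustration; SQLite stands in for MySQL so the example is self-contained, and the lookup assumes addresses were already standardized (for example by CASS software) before being stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names and sizes are assumptions; ZIP codes are text so leading zeros survive.
conn.execute("""
CREATE TABLE address (
    id       INTEGER PRIMARY KEY,
    address1 VARCHAR(100) NOT NULL,
    address2 VARCHAR(100),
    address3 VARCHAR(100),
    city     VARCHAR(50),
    state    CHAR(2),
    zip      CHAR(5),
    zip4     CHAR(4)
)
""")
# Index on the ZIP columns so duplicate checks only scan a small candidate set.
conn.execute("CREATE INDEX idx_address_zip ON address (zip, zip4)")


def find_possible_duplicates(conn, address1, zip_code, zip4):
    """Return ids of stored rows with the same ZIP+4 and first address line."""
    rows = conn.execute(
        "SELECT id FROM address WHERE zip = ? AND zip4 = ? AND address1 = ?",
        (zip_code, zip4, address1),
    ).fetchall()
    return [r[0] for r in rows]


# Made-up sample row for illustration only.
conn.execute(
    "INSERT INTO address (address1, city, state, zip, zip4) VALUES (?, ?, ?, ?, ?)",
    ("123 MAIN ST APT 1", "SPRINGFIELD", "IL", "62701", "0001"),
)
print(find_possible_duplicates(conn, "123 MAIN ST APT 1", "62701", "0001"))
print(find_possible_duplicates(conn, "123 MAIN ST APT 1", "62701", "9999"))
```

An exact match on the first address line only works if every address went through the same standardization step; without it, "123 Main Street Apt 1" and "Apt 1 123 Main Street" would still be treated as different rows.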
However, if you want a universal address format, I would look at the ADIS standard from IdeaAlliance. This standard can be used to break down (parse) addresses from almost any country into the relevant parts. They can then be put back together using templates/components based on the Universal Postal Union standards (UPU S42 Standard on International Postal Address Components and Templates).
The big plus of this format is that addresses that don't exist in a postal database like CASS can be entered and stored as separate parts.
Answer by Jonathan Leffler
Very similar questions have been asked before.
Addresses are messy - at best.
It partly depends on what you want to do with the addresses. If you're going to use them to mail things to people, then you simply need to record the image that will appear on the address label in a convenient form. If you're going to analyze the address, you have to work a lot harder.
Remember that the first time you have to deal with someone outside the US, all previous rules go astray. You may be strictly US-only, but beware.
Answer by Thomas
First, the "best" means of storing an address depends greatly on how it will be used. Is it just for reference, or for searches on, say, city? Do you plan on addressing envelopes? Are you going to integrate with a shipping system like FedEx or UPS? Will you store non-US addresses? Once you get into the realm of integrating with something that ships, you should start looking at CASS. This is a specification for handling USPS addresses. There are CASS-certified applications out there that will store and verify addresses. Thus, the second best practice would be to avoid reinventing the wheel and see if there is a system out there that will solve your problem, especially if you are going to go international. You want to leverage the fact that someone else has worked out all the details of how to properly and efficiently store addresses for many countries around the world, instead of having to do that investigation yourself.
Answer by Chris
I've had to try to do this before, and I found this document that gives you some pointers. I ended up shelving my schema since my application does have to deal with international addresses.
Answer by Mike
I looked into this a while ago, but for international addresses. I didn't find much in the way of a consensus. However, for the US, I found the succinctly named United States Thoroughfare, Landmark, and Postal Address Data Standard (Draft):
http://www.fgdc.gov/standards/projects/FGDC-standards-projects/street-address/index_html
I don't think that they actually provide any specific database schema ideas, but it might be a good starting point.