C#: Where is a good address parser
Original question: http://stackoverflow.com/questions/518210/
License: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Where is a good Address Parser
Question
I'm looking for a good tool that can take a full mailing address, formatted for display or use with a mailing label, and convert it into a structured object.
So for instance:
// Start with a formatted address in a single string
string f = "18698 E. Main Street\r\nBig Town, AZ, 86011";
// Parse into address
Address addr = new Address(f);
addr.Street; // 18698 E. Main Street
addr.Locality; // Big Town
addr.Region; // AZ
addr.PostalCode; // 86011
Now I could do this using RegEx. But the tricky part is keeping it general enough to handle any address in the world!
I'm sure there has to be something out there that can do it.
If anyone noticed, this is actually the format of the opensocial.Address object.
Answer by Steve B.
You could try Experian Address Verification. It has its issues, but it pretty much works as advertised.
Answer by aleemb
As @duffymo said, there is no trivial solution, so the next best thing might be to reconsider the design. If it's a user form, make a compromise and let the user fill it in. If you are parsing existing data retroactively, use a very strict regex to parse the addresses that meet some criterion (for example, the country is US). Then make a second pass at the ones that are left over, and so on. I have taken this approach, and it's the only reliable approach.
Another design problem with a generic regex approach is that it will generate false positives for bad addresses. If you are sending snail mail to these people, it will end up bouncing, and you'll have more work on your hands trying to sort out which ones came back, or you'll keep sending mail to erroneous addresses.
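The strict first pass described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Address class mirrors the fields from the question, StrictUsAddressParser and its pattern are hypothetical names made up here, and the regex covers just one narrow US layout (street line, then "City, ST, ZIP"). Anything that does not match exactly is set aside for a later pass instead of being guessed at.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical result type; property names follow the question's example.
public sealed class Address
{
    public string Street { get; set; } = "";
    public string Locality { get; set; } = "";
    public string Region { get; set; } = "";
    public string PostalCode { get; set; } = "";
}

// Hypothetical strict parser for one narrow US layout (an assumption, not a standard).
public static class StrictUsAddressParser
{
    // "<number> <street>" on line 1, "<city>, <ST>[,] <ZIP or ZIP+4>" on line 2.
    private static readonly Regex UsPattern = new Regex(
        @"^(?<street>\d+\s+[^\r\n,]+)\r?\n" +
        @"(?<city>[A-Za-z .'-]+),\s*(?<state>[A-Z]{2}),?\s+(?<zip>\d{5}(?:-\d{4})?)$",
        RegexOptions.CultureInvariant);

    public static bool TryParse(string text, out Address address)
    {
        address = null;
        if (string.IsNullOrWhiteSpace(text)) return false;

        Match m = UsPattern.Match(text.Trim());
        if (!m.Success) return false; // reject rather than guess

        address = new Address
        {
            Street = m.Groups["street"].Value.Trim(),
            Locality = m.Groups["city"].Value.Trim(),
            Region = m.Groups["state"].Value,
            PostalCode = m.Groups["zip"].Value
        };
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var inputs = new[]
        {
            "18698 E. Main Street\r\nBig Town, AZ, 86011",
            "Flat 2, 10 High Street\r\nLondon SW1A 1AA"
        };

        var leftovers = new List<string>();
        foreach (string input in inputs)
        {
            if (StrictUsAddressParser.TryParse(input, out Address a))
                Console.WriteLine($"{a.Street} | {a.Locality} | {a.Region} | {a.PostalCode}");
            else
                leftovers.Add(input); // keep for a second, differently targeted pass
        }

        Console.WriteLine($"{leftovers.Count} address(es) left for the next pass");
    }
}
```

The leftovers list is where a second pass with different criteria (another country, another layout, or manual review) would pick up.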
Answer by Tom Lehman
The Google Maps API works pretty well for this. For example, suppose you are given the string "120 w 45 st nyc". Pass it to the Google Maps API like so: http://maps.google.com/maps/geo?q=120+w+45+st+nyc
and you get this response:
{
"name": "120 w 45 st nyc",
"Status": {
"code": 200,
"request": "geocode"
},
"Placemark": [ {
"id": "p1",
"address": "120 W 45th St, New York, NY 10036, USA",
"AddressDetails": {"Country": {"CountryNameCode": "US","CountryName": "USA","AdministrativeArea": {"AdministrativeAreaName": "NY","Locality": {"LocalityName": "New York","Thoroughfare":{"ThoroughfareName": "120 W 45th St"},"PostalCode": {"PostalCodeNumber": "10036"}}}},"Accuracy": 8},
"ExtendedData": {
"LatLonBox": {
"north": 40.7603883,
"south": 40.7540931,
"east": -73.9807141,
"west": -73.9870093
}
},
"Point": {
"coordinates": [ -73.9838617, 40.7572407, 0 ]
}
} ]
}
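To get from a response like this to the structured fields the question asks for, something along these lines might work. It is a sketch that works only on the response text quoted above and makes no network call. GeocodeResponseReader and ReadFirstPlacemark are hypothetical names. The only library used is System.Text.Json. Responses for other addresses may nest their fields differently, and GetProperty throws when a level is missing, so real code would need more defensive handling.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper; it assumes the exact nesting shown in the sample response.
public static class GeocodeResponseReader
{
    public static (string Street, string Locality, string Region, string PostalCode)
        ReadFirstPlacemark(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);

        // Placemark[0] -> AddressDetails -> Country -> AdministrativeArea -> Locality
        JsonElement area = doc.RootElement
            .GetProperty("Placemark")[0]
            .GetProperty("AddressDetails")
            .GetProperty("Country")
            .GetProperty("AdministrativeArea");
        JsonElement locality = area.GetProperty("Locality");

        return (
            locality.GetProperty("Thoroughfare").GetProperty("ThoroughfareName").GetString(),
            locality.GetProperty("LocalityName").GetString(),
            area.GetProperty("AdministrativeAreaName").GetString(),
            locality.GetProperty("PostalCode").GetProperty("PostalCodeNumber").GetString());
    }
}

public static class Program
{
    // Trimmed copy of the response above, keeping only the fields that are read.
    public const string Sample = @"{
      ""Placemark"": [ {
        ""address"": ""120 W 45th St, New York, NY 10036, USA"",
        ""AddressDetails"": { ""Country"": {
          ""CountryNameCode"": ""US"",
          ""AdministrativeArea"": {
            ""AdministrativeAreaName"": ""NY"",
            ""Locality"": {
              ""LocalityName"": ""New York"",
              ""Thoroughfare"": { ""ThoroughfareName"": ""120 W 45th St"" },
              ""PostalCode"": { ""PostalCodeNumber"": ""10036"" }
            }
          }
        } }
      } ]
    }";

    public static void Main()
    {
        var a = GeocodeResponseReader.ReadFirstPlacemark(Sample);
        Console.WriteLine($"{a.Street} | {a.Locality} | {a.Region} | {a.PostalCode}");
    }
}
```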
Answer by Tom Lehman
I tried RecogniContact recently. It is a Windows COM component that parses US and European addresses. You can test it from the website.
Answer by Brian c
For Canadian addresses, I have used one called Street Perfect. We had to wrap the C++ code in some .NET to make it reusable for our purpose, but that was fairly easy.
Answer by Jonathan Oliver
As has been mentioned, this is not a trivial problem. One of the biggest issues, apart from international addresses, is that there is no standard format for addresses. Another is that an address can't tell you whether it's well-formed, i.e. it's not self-validating like a credit card number.
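To make the contrast concrete, here is a small sketch of what "self-validating" means for a card number: the Luhn checksum can be verified from the digits alone, with no external lookup. The class and method names are made up for this illustration. An address has no comparable built-in check, which is why an outside source of truth is needed.

```csharp
using System;

// Illustrative Luhn checksum; the names are hypothetical and not from any library.
public static class Luhn
{
    public static bool IsValid(string digits)
    {
        if (string.IsNullOrEmpty(digits)) return false;

        int sum = 0;
        bool doubleIt = false; // every second digit from the right is doubled

        for (int i = digits.Length - 1; i >= 0; i--)
        {
            char c = digits[i];
            if (c < '0' || c > '9') return false; // digits only

            int d = c - '0';
            if (doubleIt)
            {
                d *= 2;
                if (d > 9) d -= 9; // same as adding the two digits of the product
            }

            sum += d;
            doubleIt = !doubleIt;
        }

        return sum % 10 == 0;
    }
}

public static class Program
{
    public static void Main()
    {
        // A widely used test number, followed by the same number with its last digit changed.
        Console.WriteLine(Luhn.IsValid("4111111111111111"));
        Console.WriteLine(Luhn.IsValid("4111111111111112"));
    }
}
```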
Because of this, you have to rely on an external source of truth to ensure the address is real. This is where an address verification service comes into the mix. Depending upon your business needs and application requirements, you may be looking at a one-time "batch" scrub of your address list, or perhaps a realtime or live address validation service. There are a number of good providers (which vary in cost) that can easily solve this problem.
I should mention that I'm the founder of SmartyStreets. We do CASS-certified address verification. We'll take your unformatted/raw addresses and turn them into addresses which have been cleaned, standardized, and verified/confirmed. Depending on the size of your list, the cost is usually only a few dollars and the turnaround time is nearly instant--usually a few minutes.
Answer by liuhongbo
If you are looking for a simple address parser, try this:
http://usaddress.codeplex.com/
Good: 1. No database required 2. No internet lookup required 3. Pretty accurate
Bad: 1. Cannot confirm whether it is a real address 2. Only works for US addresses 3. Written in C#, requires .NET 3.5 or above