Original question: http://stackoverflow.com/questions/10146864/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Is there a Java parser that can parse addresses like this
Asked by Dave
I'm using Java 6. I'm looking for an automated way to parse addresses. I'm not concerned whether the addresses exist or not. The best thing I have found is JGeocoder (v 0.4.1), but JGeocoder is unable to parse addresses like this:
16th Street Theater, Berwyn Cultural Center, 6420 16th St.
Does anyone know of a free Java address parser that is up to the challenge? By "parse" I mean the ability to distinguish street, city, state, postal code, and potentially the venue name (the above venue name is "16th Street Theater, Berwyn Cultural Center").
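For illustration only, the kind of split described above (venue name versus street) can be guessed at with a simple heuristic. The sketch below assumes comma-separated segments and treats the first segment that begins with a house number followed by whitespace as the start of the street part. The class and method names are hypothetical, and this is a guess at the structure, not a real address parser.

```java
// Hypothetical sketch: guess where the venue name ends and the street begins.
// Assumption: segments are comma-separated and the street segment starts
// with digits followed by whitespace (e.g. "6420 16th St.").
class VenueSplitSketch {

    /** Returns {venue, street}; either element may be "" when nothing fits. */
    public static String[] split(String input) {
        String[] segments = input.trim().split("\\s*,\\s*");
        StringBuilder venue = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            // "6420 16th St." matches; "16th Street Theater" does not,
            // because "16" is followed by "th" rather than whitespace.
            if (segments[i].matches("\\d+\\s+.*")) {
                StringBuilder street = new StringBuilder(segments[i]);
                for (int j = i + 1; j < segments.length; j++) {
                    street.append(", ").append(segments[j]);
                }
                return new String[] { venue.toString(), street.toString() };
            }
            if (venue.length() > 0) {
                venue.append(", ");
            }
            venue.append(segments[i]);
        }
        // No segment looked like a street line.
        return new String[] { venue.toString(), "" };
    }

    public static void main(String[] args) {
        String[] parts = split(
            "16th Street Theater, Berwyn Cultural Center, 6420 16th St.");
        System.out.println("venue:  " + parts[0]);
        System.out.println("street: " + parts[1]);
    }
}
```

A heuristic like this breaks as soon as a venue name itself starts with a number and a space, which is one reason a dedicated parser is being asked for.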
Answered by Matt
Update: This topic is more exhaustively covered in this StackOverflow question.
I work for SmartyStreets, where we parse and process addresses, and we have an answer. This is what we call "SLAP", or Single-Line Address Parsing (or Processing). The formal term is Named Entity Recognition (NER).
I'm not an expert on Java libraries, but I do know that any in-house implementation will not live up to expectations. Here are some common reasons the people I've helped have had difficulty:
Google / Yahoo! / Bing Maps web services do not allow automated queries and do not verify accuracy of the parsed address.
In-house code can also only make a best guess without any knowledge of existing addresses (a database) or other sorts of official sources. I know you want a library that can do this in-house, but you can at best make a guess...
By the way, regular expressions are not the answer. The best regex I've seen to parse addresses was dynamically generated over hundreds of lines of code and several classes. It was a mess, and was only correct for the types of addresses you'd expect, not all the valid (US) formats there actually are.
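To make the regex limitation concrete, here is a minimal sketch. The pattern is a hypothetical one that only accepts the shape "number street, city, ST zip", and the first sample address is made up for illustration. It handles that one expected shape and simply fails on the venue-prefixed address from the question.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a naive regex-based parser; not a recommended approach.
class NaiveAddressRegex {

    // Assumed shape: "<number> <street>, <city>, <ST> <zip>[-<plus4>]"
    private static final Pattern SIMPLE_US = Pattern.compile(
        "^(\\d+)\\s+([^,]+),\\s*([^,]+),\\s*([A-Z]{2})\\s+(\\d{5})(?:-\\d{4})?$");

    /** Returns {number, street, city, state, zip}, or null if the input does not fit. */
    public static String[] parse(String input) {
        Matcher m = SIMPLE_US.matcher(input.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            m.group(1), m.group(2).trim(), m.group(3).trim(), m.group(4), m.group(5)
        };
    }

    public static void main(String[] args) {
        // Made-up address in the one shape the pattern expects.
        String simple = "123 Main St, Springfield, IL 62701";
        // The address from the question: venue names first, no city/state/zip.
        String venue = "16th Street Theater, Berwyn Cultural Center, 6420 16th St.";

        System.out.println(Arrays.toString(parse(simple)));
        System.out.println(parse(venue) == null ? "no match" : "matched");
    }
}
```

Every additional format (suite numbers, PO boxes, missing commas, venue prefixes) needs another pattern or another branch, which is how such code grows into the mess described above.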
This is an incredibly complex task... unless you have the right tools. One of our services is called LiveAddress API, and it's similar to Google Maps in that it parses addresses and geocodes them, but it goes a step further by being CASS-Certified and returning only valid addresses, almost regardless of the input format.
I encourage you to do some research of your own, but this is probably the most effective and reliable method.
Answered by Bertrand Liechtenstein
https://code.google.com/p/usaddressparser/ parses a US address string and splits it into fields (number, street, suite, city, ZIP, etc.). Java jar and sources.
Answered by JohanB
If web services are allowed, you could try Google Maps.