Java: the proper way to check for URL equality

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/3771081/

Time: 2020-10-30 03:18:02  Source: igfitidea

Proper way to check for URL equality

java, url

Asked by NG.

I have the following scenario:

URL u1 = new URL("http://www.yahoo.com/");
URL u2 = new URL("http://www.yahoo.com");

if (u1.equals(u2)) {
    System.out.println("yes");
}
if (u1.toURI().equals(u2.toURI())) {
    System.out.println("uri equality");
}
if (u1.toExternalForm().equals(u2.toExternalForm())) {
    System.out.println("external form equality");
}
if (u1.toURI().normalize().equals(u2.toURI().normalize())) {
    System.out.println("uri normalized equality");
}

None of these checks are succeeding. Only the path differs: u1 has a path of "/" while u2 has a path of "". Are these URLs pointing to the same resource and is there a way for me to check such a thing without opening a connection? Am I misunderstanding something fundamental about URLs?

EDIT: I should state that a non-hacky check is desired. Is it reasonable to say that an empty path == "/"? I was hoping not to need this kind of code.

Answered by Colin Hebert

From JavaOne 2007:

The second puzzle, aptly titled "More Joys of Sets", has the user create HashMap keys that consist of several URL objects. Again, most of the audience was unable to guess the correct answer.

The important thing the audience learned here is that the URL object's equals() method is, in effect, broken. In this case, two URL objects are equal if they resolve to the same IP address and port, not just if they have equal strings. However, Bloch and Pugh point out an even more severe Achilles' Heel: the equality behavior differs depending on if you're connected to the network, where virtual addresses can resolve to the same host, or if you're not on the net, where the resolve is a blocking operation. So, as far as lessons learned, they recommend:

Don't use URL; use URI instead. URI makes no attempt to compare addresses or ports. In addition, don't use URL as a Set element or a Map key.
For API designers, the equals() method should not depend on the environment. For example, in this case, equality should not change if a computer is connected to the Internet versus standalone.
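
The recommendation above can be illustrated with a minimal sketch that stores URI objects, rather than URL objects, in a Set. The class name UriSetDemo and the helper countDistinct are hypothetical, introduced only for this illustration; the addresses are the ones from the question.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the original answer.
final class UriSetDemo {

    // Counts the distinct URIs among the given addresses.
    // URI.equals compares URI components and performs no DNS lookup,
    // so the result does not depend on network connectivity.
    static int countDistinct(String... addresses) {
        Set<URI> seen = new HashSet<>();
        for (String address : addresses) {
            seen.add(URI.create(address));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // The first two addresses are identical; the third has an empty path.
        System.out.println(countDistinct(
                "http://www.yahoo.com/",
                "http://www.yahoo.com/",
                "http://www.yahoo.com")); // prints 2
    }
}
```

The two spellings from the question still count as two distinct elements here, which matches the behavior the asker observed.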

From the URI.equals documentation:

For two hierarchical URIs to be considered equal, their paths must be equal and their queries must either both be undefined or else be equal.

In your case, the two paths are different: one is "/" and the other is "".
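
To see the difference directly, here is a small sketch that prints the path component of each address from the question. The class name PathComponentDemo and the helper pathOf are assumptions made for this example.

```java
import java.net.URI;

// Hypothetical demo class, not part of the original answer.
final class PathComponentDemo {

    // Returns the decoded path component of a hierarchical URI.
    static String pathOf(String address) {
        return URI.create(address).getPath();
    }

    public static void main(String[] args) {
        String p1 = pathOf("http://www.yahoo.com/"); // path is "/"
        String p2 = pathOf("http://www.yahoo.com");  // path is the empty string
        System.out.println("[" + p1 + "] vs [" + p2 + "]");
        System.out.println(p1.equals(p2)); // the paths differ, so URI.equals fails too
    }
}
```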

According to the URI RFC (RFC 3986), §6.2.3:

Implementations may use scheme-specific rules, at further processing cost, to reduce the probability of false negatives. For example, because the "http" scheme makes use of an authority component, has a default port of "80", and defines an empty path to be equivalent to "/", the following four URIs are equivalent:

 http://example.com
 http://example.com/
 http://example.com:/
 http://example.com:80/

It seems that java.net.URI doesn't apply these scheme-specific rules.
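
If scheme-specific comparison is wanted, one option is to apply the two "http" rules from the quote by hand before comparing. The following is only a sketch under stated assumptions: the class HttpUriNormalizer and its methods normalizeHttp and sameHttpResource are hypothetical names, and only the empty-path and default-port rules for the "http" scheme are covered.

```java
import java.net.URI;

// Hypothetical helper, not part of java.net or of the original answer.
final class HttpUriNormalizer {

    // Rewrites an "http" URI so that an empty path becomes "/" and an
    // explicit default port (80) is dropped. Other URIs are only passed
    // through URI.normalize().
    static URI normalizeHttp(URI uri) {
        URI n = uri.normalize();
        String host = n.getHost();
        if (!"http".equalsIgnoreCase(n.getScheme()) || host == null) {
            return n; // not a server-based http URI: no scheme rules applied
        }
        StringBuilder sb = new StringBuilder("http://");
        if (n.getRawUserInfo() != null) {
            sb.append(n.getRawUserInfo()).append('@');
        }
        sb.append(host);
        if (n.getPort() != -1 && n.getPort() != 80) {
            sb.append(':').append(n.getPort()); // keep non-default ports
        }
        String path = n.getRawPath();
        sb.append(path == null || path.isEmpty() ? "/" : path);
        if (n.getRawQuery() != null) {
            sb.append('?').append(n.getRawQuery());
        }
        if (n.getRawFragment() != null) {
            sb.append('#').append(n.getRawFragment());
        }
        return URI.create(sb.toString());
    }

    // Compares two addresses after applying the rules above.
    static boolean sameHttpResource(String a, String b) {
        return normalizeHttp(URI.create(a)).equals(normalizeHttp(URI.create(b)));
    }

    public static void main(String[] args) {
        System.out.println(sameHttpResource("http://example.com", "http://example.com/"));
        System.out.println(sameHttpResource("http://example.com", "http://example.com:80/"));
    }
}
```

The raw (still percent-encoded) components are reused so that nothing is re-encoded along the way. The empty-port form http://example.com:/ is not handled explicitly; how it behaves depends on how java.net.URI parses it.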

Answered by Wernight

Strictly speaking, they are not equal. The optional trailing slash (/) is only a common convention, not a requirement. You could serve different pages for

http://www.yahoo.com/foo/

and for

http://www.yahoo.com/foo

I believe this is even possible for the URL you provided, since the HTTP header could skip that slash.

Answered by BTalker

You can always compare relative URLs with the Path.equals method.

For example,

Paths.get("/user/login").equals(Paths.get("/user/login/"))

produces true.

You can also use the startsWith and endsWith methods.
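
A self-contained version of this idea might look like the sketch below. The class name RelativeUrlPathDemo and the helper samePath are hypothetical. Note that java.nio.file.Path models file-system paths, so this treats the URL path as if it were a file path; the results shown assume a Unix-like default file system and may differ on other platforms.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class, not part of the original answer.
final class RelativeUrlPathDemo {

    // Compares two relative URLs as file-system paths, which ignores
    // a trailing slash on a Unix-like default file system.
    static boolean samePath(String a, String b) {
        return Paths.get(a).equals(Paths.get(b));
    }

    public static void main(String[] args) {
        Path login = Paths.get("/user/login/");
        System.out.println(samePath("/user/login", "/user/login/"));
        // Prefix and suffix checks operate on whole path elements.
        System.out.println(login.startsWith(Paths.get("/user")));
        System.out.println(login.endsWith(Paths.get("login")));
    }
}
```

Because startsWith and endsWith match whole name elements, "/user/login" does not start with "/us".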
