Java: Why should I use url.openStream instead of url.getContent?

Question

Asked by juwens

I would like to retrieve the content of a URL, similar to Python's:

html_content = urllib.urlopen("http://www.test.com/test.html").read()

In examples (java2s.com) you very often see the following code:

URL url = new URL("http://www.test.com/test.html");
String foo = (String) url.getContent();

The description of getContent is the following:

Gets the contents of this URL. This method is a shorthand for: openConnection().getContent()
Returns: the contents of this URL.

In my opinion that should work perfectly fine. But obviously this code doesn't work, because it raises an error:

Exception in thread "main" java.lang.ClassCastException: sun.net.www.protocol.http.HttpURLConnection$HttpInputStream cannot be cast to java.lang.String

Obviously it returns an InputStream.

So I ask myself: what is the purpose of this function, which isn't doing what it seems to do? Why is there no hint about this quirk in the documentation? And why did I see it in several examples?

Or am I getting this wrong?

The suggested solution (Stack Overflow) is to use url.openStream() and then read the stream.
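As a rough illustration of the openStream() approach, here is a minimal sketch of a helper that reads a URL into a String. The class name UrlText, the method name read, and the choice of UTF-8 are assumptions made for this example; a real page may declare a different charset in its Content-Type header.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UrlText {
    // Hypothetical helper: reads everything the URL serves and decodes it
    // with the given charset (the caller has to know or guess the charset).
    public static String read(URL url, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = url.openStream();
             Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // URL taken from the question; UTF-8 is an assumption
        URL url = new URL("http://www.test.com/test.html");
        System.out.println(read(url, StandardCharsets.UTF_8));
    }
}
```

This is roughly the Java counterpart of the Python one-liner in the question, with the charset made explicit.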

Answer 1

Accepted answer by Dave Webb

As you said, the documentation says that URL.getContent() is a shortcut for openConnection().getContent(), so we need to look at the documentation for URLConnection.getContent().

We can see that this returns an Object, the type of which is determined by the content-type header field of the response. This type determines the ContentHandler that will be used. So a ContentHandler converts data, based on its MIME type, to the appropriate class of Java Object.

In other words, the type of Object you get will depend on the content served. For example, it wouldn't make sense to return a String if the MIME type was image/png.

This is why, in the example code you link to at java2s.com, they check the class of the returned Object:

try {
  URL u = new URL("http://www.java2s.com");
  Object o = u.getContent();
  System.out.println("I got a " + o.getClass().getName());
} catch (Exception ex) {
  System.err.println(ex);
}

So you can say String foo = (String) url.getContent(); if you know your ContentHandler will return a String.
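If you do want to call getContent(), a safer pattern than a blind cast is to check the runtime type first. The following is only a sketch: the class ContentAsText and the method asText are hypothetical names, and the charset passed in is an assumption by the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.Charset;

public class ContentAsText {
    // Hypothetical helper: accepts whatever getContent() returns and turns
    // it into text when that is possible.
    public static String asText(URL url, Charset charset) throws IOException {
        Object content = url.getContent();
        if (content instanceof String) {
            // Only happens if the active ContentHandler produces a String
            return (String) content;
        }
        if (content instanceof InputStream) {
            // The case from the question: the handler hands back a stream
            try (InputStream in = (InputStream) content) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return new String(out.toByteArray(), charset);
            }
        }
        // Anything else (for example an image producer) is not text
        throw new IOException("Unexpected content class: " + content.getClass().getName());
    }
}
```

A call such as ContentAsText.asText(url, StandardCharsets.UTF_8) then avoids the ClassCastException from the question, at the cost of more code than simply using openStream().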

There are default content handlers defined in the sun.net.www.content package, but as you can see they are returning streams for you.

You could create your own ContentHandler that does return a String, but it will probably be easier just to read the stream as you suggest.
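For completeness, a custom handler could look roughly like the sketch below. The class name StringContentHandler and the install() method are hypothetical, the handler assumes UTF-8 instead of inspecting the response's charset, and URLConnection.setContentHandlerFactory can be called at most once per JVM, so this is a global setting rather than a per-request one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.ContentHandler;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class StringContentHandler extends ContentHandler {
    // Reads the whole response body and returns it as a String.
    // Assumption: the body is UTF-8 encoded text.
    @Override
    public Object getContent(URLConnection connection) throws IOException {
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Registers the handler for all text/* MIME types. Returning null for
    // other types lets the JDK fall back to its default handlers.
    // Note: the factory can only be set once per JVM.
    public static void install() {
        URLConnection.setContentHandlerFactory(mimeType ->
                mimeType != null && mimeType.startsWith("text/")
                        ? new StringContentHandler()
                        : null);
    }
}
```

After StringContentHandler.install(), the cast (String) url.getContent() should succeed for text responses, which shows why the cast in the question is only valid when such a handler is in place.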

Answer 2

Answer by Martijn Courteaux

You misunderstand what "Content" means. You expected it to return a String containing the HTML, but it returns an HttpInputStream. Why? Because the requested URL happens to be an HTML web page. Another valid URL might be http://www.google.com/logo.png. That URL doesn't contain String content. It is an image.

Answer 3

Answer by prunge

You can use Guava's Resources.toString(URL, Charset) method to more easily read a URL into a string.
