Java: Why should I use url.openStream instead of url.getContent?

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/9795331/

Time: 2020-08-16 07:43:27  Source: igfitidea

Why should I use url.openStream instead of url.getContent?

java

Asked by juwens

I would like to retrieve the content of a URL, similar to Python's:

html_content = urllib.urlopen("http://www.test.com/test.html").read()

In examples (java2s.com) you very often see the following code:

URL url = new URL("http://www.test.com/test.html");
String foo = (String) url.getContent();

The description of getContent is the following:

Gets the contents of this URL. This method is a shorthand for: openConnection().getContent()
Returns: the contents of this URL.

In my opinion that should work perfectly fine. But obviously this code doesn't work, because it raises an error:

Exception in thread "main" java.lang.ClassCastException: sun.net.www.protocol.http.HttpURLConnection$HttpInputStream cannot be cast to java.lang.String

Obviously it returns an InputStream.

So I ask myself: what's the purpose of this function, which isn't doing what it seems to do? And why is there no hint of this quirk in the documentation? And why did I see it in several examples?

Or am I getting this wrong?

The suggested solution (on StackOverflow) is to use url.openStream() and then read the stream.
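
For reference, a minimal sketch of that suggestion. It assumes the caller knows the charset of the page; the class name UrlText and the method name read are made up for illustration and are not part of any library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UrlText {
    /** Reads everything behind the URL and decodes it with the given charset. */
    public static String read(URL url, Charset charset) throws IOException {
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), charset);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java UrlText <url>");
            return;
        }
        // UTF-8 is an assumption; the real charset depends on the server
        System.out.println(read(new URL(args[0]), StandardCharsets.UTF_8));
    }
}
```

With this, the Python one-liner from above roughly becomes UrlText.read(new URL("http://www.test.com/test.html"), StandardCharsets.UTF_8).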

Accepted answer by Dave Webb

As you said, the documentation says that URL.getContent() is a shortcut for openConnection().getContent(), so we need to look at the documentation for URLConnection.getContent().

We can see that this returns an Object, the type of which is determined by the content-type header field of the response. This type determines the ContentHandler that will be used. So a ContentHandler converts data based on its MIME type to the appropriate class of Java Object.

In other words, the type of Object you get will depend on the content served. For example, it wouldn't make sense to return a String if the MIME type was image/png.

This is why in the example code you link to at java2s.com they check the class of the returned Object:

try {
  URL u = new URL("http://www.java2s.com");
  Object o = u.getContent();
  System.out.println("I got a " + o.getClass().getName());
} catch (Exception ex) {
  System.err.println(ex);
}

So you can say String foo = (String) url.getContent(); if you know your ContentHandler will return a String.
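
If you are not sure what the handler returns, a defensive variant is to check the type before casting. The following is only a sketch: ContentText and asText are hypothetical names, and the charset is something the caller has to supply.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ContentText {
    /** Turns whatever getContent() returned into text, or fails loudly. */
    public static String asText(Object content, Charset charset) throws IOException {
        if (content instanceof String) {
            // A handler that already decoded the body
            return (String) content;
        }
        if (content instanceof InputStream) {
            // What the question ran into: the handler hands back a stream
            try (InputStream in = (InputStream) content) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                return new String(buffer.toByteArray(), charset);
            }
        }
        throw new IllegalArgumentException("Unexpected content: "
                + (content == null ? "null" : content.getClass().getName()));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for url.getContent(): an in-memory stream, so no network is needed
        Object content = new ByteArrayInputStream(
                "<html></html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(asText(content, StandardCharsets.UTF_8));
    }
}
```

In real use the call would look like ContentText.asText(url.getContent(), StandardCharsets.UTF_8).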

There are default content handlers defined in the sun.net.www.content package, but as you can see they are returning streams for you.

You could create your own ContentHandler that does return a String, but it will probably be easier just to read the stream as you suggest.
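
For the curious, here is a rough sketch of what such a handler could look like. StringContentHandler and install are hypothetical names; the sketch assumes the body is UTF-8 and only claims text/* MIME types. Note that URLConnection.setContentHandlerFactory may be called at most once per JVM.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.ContentHandler;
import java.net.ContentHandlerFactory;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class StringContentHandler extends ContentHandler {
    @Override
    public Object getContent(URLConnection connection) throws IOException {
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // Assumption: UTF-8; a fuller version would read the charset
            // parameter from the content-type header
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Registers the handler for text/* types. Only one factory can be set per JVM. */
    public static void install() {
        URLConnection.setContentHandlerFactory(new ContentHandlerFactory() {
            @Override
            public ContentHandler createContentHandler(String mimeType) {
                // Returning null leaves other types to the built-in handlers
                return mimeType != null && mimeType.startsWith("text/")
                        ? new StringContentHandler()
                        : null;
            }
        });
    }
}
```

After StringContentHandler.install(), a cast such as (String) url.getContent() should succeed for responses served as text/html or text/plain. Because the factory is global to the JVM, it also changes what every other caller of getContent() receives, which is one more reason reading the stream yourself tends to be simpler.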

Answer by Martijn Courteaux

You misunderstand what "Content" means. You expected it to return a String containing the HTML, but it returns an HttpInputStream. Why? Because the requested URL is an HTML webpage. Another valid URL might be http://www.google.com/logo.png. This URL doesn't contain String content. It is an image.

Answer by prunge

You can use Guava's Resources.toString(URL, Charset) method to more easily read a URL into a string.