Java URLConnection does not return the charset

Question

Asked by Bart van Heukelom

I'm using URL.openConnection() to download something from a server. The server says

Content-Type: text/plain; charset=utf-8

But connection.getContentEncoding() returns null. Why?

Answer 1

Accepted answer by Waldheinz

This is documented behaviour: getContentEncoding() is specified to return the contents of the Content-Encoding HTTP header, which is not set in your example. You could use getContentType() and parse the resulting String yourself, or switch to a more advanced HTTP client library such as the one from Apache.
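
To make the distinction concrete, here is a minimal offline sketch. FakeConnection, HeaderDemo, and the example.invalid URL are hypothetical names introduced only for illustration; the stub serves a fixed Content-Type header and nothing else, mirroring the response in the question. It relies on the fact that both getters in java.net.URLConnection delegate to getHeaderField.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

public class HeaderDemo {
    /** Hypothetical stand-in for a real connection: fixed headers, no network access. */
    static class FakeConnection extends URLConnection {
        FakeConnection(URL url) {
            super(url);
        }

        @Override
        public void connect() {
            // Nothing to connect to in this sketch.
        }

        @Override
        public String getHeaderField(String name) {
            // Only Content-Type is present, as in the question.
            return "content-type".equalsIgnoreCase(name) ? "text/plain; charset=utf-8" : null;
        }
    }

    static URLConnection fake() {
        try {
            return new FakeConnection(URI.create("http://example.invalid/").toURL());
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection connection = fake();
        // Reads the Content-Encoding header, which is absent here.
        System.out.println(connection.getContentEncoding());
        // Reads the Content-Type header, which carries the charset parameter.
        System.out.println(connection.getContentType());
    }
}
```

Content-Encoding normally describes a transformation of the body such as gzip, so a null value here only means the body was sent without one.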

Answer 2

Answer by Buhake Sindi

URLConnection.getContentEncoding() returns the value of the Content-Encoding header, not the charset.

Code from URLConnection.getContentEncoding():

/**
 * Returns the value of the <code>content-encoding</code> header field.
 *
 * @return  the content encoding of the resource that the URL references,
 *          or <code>null</code> if not known.
 * @see     java.net.URLConnection#getHeaderField(java.lang.String)
 */
public String getContentEncoding() {
    return getHeaderField("content-encoding");
}

Instead, call connection.getContentType() to retrieve the Content-Type and extract the charset from it. Here is sample code showing how to do this:

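String contentType = connection.getContentType(); // null if the header is absent
String charset = "";

if (contentType != null) {
    // Parameters follow the media type, separated by ';'
    for (String value : contentType.split(";")) {
        value = value.trim();

        if (value.toLowerCase().startsWith("charset=")) {
            // The value may be quoted, e.g. charset="utf-8"
            charset = value.substring("charset=".length()).replace("\"", "").trim();
        }
    }
}

if (charset.isEmpty()) {
    charset = "UTF-8"; // Assumption: fall back to UTF-8 when no charset is given
}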
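
Once the charset name has been extracted, it still has to be applied when decoding the body. The following is a minimal sketch of that step, not part of the original answer: CharsetReader, resolve, and readAll are hypothetical names, and the UTF-8 fallback is the same assumption as above. In real use the stream would come from connection.getInputStream(); the demo uses an in-memory stream so it runs offline.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetReader {
    /** Resolves a charset name; falls back to UTF-8 (an assumption) for null, empty, or unknown names. */
    static Charset resolve(String name) {
        if (name == null || name.isEmpty()) {
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return StandardCharsets.UTF_8;
        }
    }

    /** Reads the whole stream as text, decoding with the resolved charset. */
    static String readAll(InputStream in, String charsetName) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, resolve(charsetName))) {
            char[] buffer = new char[4096];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Stand-in for connection.getInputStream().
        byte[] body = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(body), "utf-8"));
    }
}
```

Resolving the name to a java.nio.charset.Charset up front means a misspelled or unsupported charset in the header degrades to the fallback instead of failing while reading.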

Answer 3

Answer by Juan M. Rivero

As an addition to the answer from @Buhake Sindi: if you are using Guava, you can do the following instead of parsing manually:

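// MediaType and Optional are Guava types (com.google.common.net.MediaType, com.google.common.base.Optional)
MediaType mediaType = MediaType.parse(httpConnection.getContentType());
Optional<Charset> typeCharset = mediaType.charset();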
