Why are characters corrupted when using request.getParameter() in Java?

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1365806/

Time: 2020-08-12 11:13:24  Source: igfitidea

Why the character is corrupted when use request.getParameter() in java?

Tags: java, character-encoding, request

Asked by MemoryLeak

I have a link like this in a JSP page encoded in Big5: http://hello/world?name=婀ㄉ. When I enter it in the browser's URL bar, it is changed to something like http://hello/world?name=%23%24%23. When we then try to read this parameter in the JSP page, all the characters are corrupted.

We have also set request.setCharacterEncoding("UTF-8"), so all requests should be converted to UTF-8.

But why doesn't it work in this case? Thanks in advance!

Accepted answer by ZZ Coder

When you enter the URL in the browser's address bar, the browser may convert the character encoding before URL-encoding it. However, this behavior is not well defined; see my question:

Handling Character Encoding in URI on Tomcat

We mostly get UTF-8 and Latin-1 from newer browsers, but we get all kinds of encodings (including Big5) from old ones. So it's best to avoid non-ASCII characters in URLs that the user enters directly.
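
To illustrate why the browser's choice of encoding matters, here is a minimal sketch using only the standard java.net classes. The class name BrowserEncodingDemo and its helper methods are made up for this example; it uses the character "é" because both UTF-8 and Latin-1 can represent it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the original answer.
public class BrowserEncodingDemo {

    // Converts the value to bytes in the given charset, then percent-encodes them.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Percent-decodes the value and interprets the bytes in the given charset.
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = "\u00E9"; // the single character "é"

        // The same character yields different query strings per charset.
        System.out.println(encode(name, "UTF-8"));      // %C3%A9
        System.out.println(encode(name, "ISO-8859-1")); // %E9

        // A server that assumes Latin-1 for a UTF-8 request sees two
        // characters (U+00C3, U+00A9) instead of one.
        String misread = decode("%C3%A9", "ISO-8859-1");
        System.out.println(misread.length());           // 2
    }
}
```

The server cannot tell from the percent-encoded bytes alone which charset the browser used, which is why both sides need to agree on one.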

If the URL is embedded in a JSP, you can force it into UTF-8 by generating it like this:

String link = "http://hello/world?name=" + URLEncoder.encode(name, "UTF-8");

On Tomcat, the encoding needs to be specified on the Connector like this:

<Connector port="8080" URIEncoding="UTF-8"/>

You also need to call request.setCharacterEncoding("UTF-8") for the body encoding, but it's not safe to do this in a servlet: it only takes effect if the parameters have not been processed yet, and another filter or valve may already have triggered that processing. So you should do it in a filter. Tomcat comes with such a filter in its source distribution.

Answer by Martin v. Löwis

You cannot have non-ASCII characters in a URL; you always need to percent-encode them. When you do, browsers have difficulty rendering them. Rendering works best if you encode the URL in UTF-8 and then percent-encode it. For your specific URL, this would give http://hello/world?name=%E5%A9%80%E3%84%89 (check what your browser gives for this specific link). When you get the parameter in JSP, you need to explicitly unquote it and then decode it from UTF-8, as the browser will send it as-is.
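
The percent-encoded form quoted above can be reproduced with a short sketch. The class name Utf8LinkDemo and the method buildLink are made up for this example, and the two characters from the question are written as Unicode escapes so the source file's own encoding does not affect the result.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the original answer.
public class Utf8LinkDemo {

    // Builds the link from the question with a UTF-8 percent-encoded parameter.
    static String buildLink(String name) {
        try {
            return "http://hello/world?name=" + URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Reverses the percent-encoding, interpreting the bytes as UTF-8.
    static String decodeParameter(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = "\u5A40\u3109"; // the two characters from the question
        String link = buildLink(name);
        System.out.println(link); // http://hello/world?name=%E5%A9%80%E3%84%89

        // Decoding with the same charset restores the original value.
        String encodedValue = link.substring(link.indexOf('=') + 1);
        System.out.println(decodeParameter(encodedValue).equals(name)); // true
    }
}
```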

Answer by Mr_and_Mrs_D

To avoid fiddling with server.xml, use:

protected static final String CHARSET_FOR_URL_ENCODING = "UTF-8";

protected String encodeString(String baseLink, String parameter)
        throws UnsupportedEncodingException {
    return String.format(baseLink + "%s",
            URLEncoder.encode(parameter, CHARSET_FOR_URL_ENCODING));
}
// Used in the servlet code to generate GET requests
response.sendRedirect(encodeString("userlist?name=", name));

To actually get those parameters on Tomcat you need to do something like:

final String name =
        new String(request.getParameter("name").getBytes("iso-8859-1"), "UTF-8");

This is because, apparently (?), request.getParameter URL-decodes the string and interprets it as iso-8859-1, or whatever URIEncoding is set to in server.xml. For an example of how to get the URIEncoding charset from server.xml for Tomcat 7, see here
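
The re-decoding trick above can be tried without a servlet container. In this sketch the class ReDecodeDemo and its method names are made up, and containerDecode only simulates a container that decodes the query string as ISO-8859-1; it is not Tomcat's actual code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
public class ReDecodeDemo {

    // Simulates a container that percent-decodes the raw value as ISO-8859-1.
    static String containerDecode(String rawValue) {
        try {
            return URLDecoder.decode(rawValue, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recovers the original bytes from the misdecoded string, then reads
    // them as UTF-8. ISO-8859-1 maps every byte to exactly one character,
    // so no bytes are lost on the way back.
    static String repair(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "%E5%A9%80%E3%84%89"; // UTF-8 bytes sent by the browser

        String misdecoded = containerDecode(raw);
        System.out.println(misdecoded.length()); // 6: one character per byte

        String repaired = repair(misdecoded);
        System.out.println(repaired.equals("\u5A40\u3109")); // true
    }
}
```

Note that the repair step is only safe while the container really decodes as ISO-8859-1. If URIEncoding is already UTF-8, the parameter arrives correct, and converting it through ISO-8859-1 replaces every character outside Latin-1 with "?".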

Answer by ff9will

I had a problem with JBoss 7.0, and I think this filter solution also works with Tomcat:

public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {

    HttpServletRequest httpRequest = (HttpServletRequest) request;
    HttpServletResponse httpResponse = (HttpServletResponse) response;

    try {
        httpRequest.setCharacterEncoding(MyAppConfig.getAppSetting("System.Character.Encoding"));

        String appServer = MyAppConfig.getAppSetting("System.AppServer");
        if(appServer.equalsIgnoreCase("JBOSS7")) {
            Field requestField = httpRequest.getClass().getDeclaredField("request");
            requestField.setAccessible(true);
            Object requestValue = requestField.get(httpRequest);

            Field coyoteRequestField = requestValue.getClass().getDeclaredField("coyoteRequest");
            coyoteRequestField.setAccessible(true);
            Object coyoteRequestValue = coyoteRequestField.get(requestValue);

            Method getParameters = coyoteRequestValue.getClass().getMethod("getParameters");
            Object parameters = getParameters.invoke(coyoteRequestValue);

            Method setQueryStringEncoding = parameters.getClass().getMethod("setQueryStringEncoding", String.class);
            setQueryStringEncoding.invoke(parameters, MyAppConfig.getAppSetting("System.Character.Encoding"));

            Method setEncoding = parameters.getClass().getMethod("setEncoding", String.class);
            setEncoding.invoke(parameters, MyAppConfig.getAppSetting("System.Character.Encoding"));
        }

    } catch (NoSuchMethodException nsme) {
        System.err.println(nsme.getLocalizedMessage());
        nsme.printStackTrace();
        MyLogger.logException(nsme);
    } catch (InvocationTargetException ite) {
        System.err.println(ite.getLocalizedMessage());
        ite.printStackTrace();
        MyLogger.logException(ite);
    } catch (IllegalAccessException iae) {
        System.err.println(iae.getLocalizedMessage());
        iae.printStackTrace();
        MyLogger.logException(iae);

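    } catch (Exception e) {
        // Also covers NoSuchFieldException from the reflective field lookups above
        MyLogger.logException(e);
    }

    try {
        httpResponse.setCharacterEncoding(MyAppConfig.getAppSetting("System.Character.Encoding"));
    } catch (Exception e) {
        MyLogger.logException(e);
    }

    // Hand the request on; without this call processing stops at this filter
    chain.doFilter(request, response);
}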

Answer by Tuan

I did quite a bit of searching on this issue, so this might help others who are experiencing the same problem on Tomcat. This is taken from http://wiki.apache.org/tomcat/FAQ/CharacterEncoding.

(How to use UTF-8 everywhere).

  • Set URIEncoding="UTF-8" on your <Connector> in server.xml. References: HTTP Connector, AJP Connector.
  • Use a character encoding filter with the default encoding set to UTF-8
  • Change all your JSPs to include charset name in their contentType. For example, use <%@page contentType="text/html; charset=UTF-8" %> for the usual JSP pages and <jsp:directive.page contentType="text/html; charset=UTF-8" /> for the pages in XML syntax (aka JSP Documents).
  • Change all your servlets to set the content type for responses and to include charset name in the content type to be UTF-8. Use response.setContentType("text/html; charset=UTF-8") or response.setCharacterEncoding("UTF-8").
  • Change any content-generation libraries you use (Velocity, Freemarker, etc.) to use UTF-8 and to specify UTF-8 in the content type of the responses that they generate.
  • Disable any valves or filters that may read request parameters before your character encoding filter or jsp page has a chance to set the encoding to UTF-8.