Note: this page is a translated mirror of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/3198532/

Time: 2020-08-13 17:19:23  Source: igfitidea

jQuery AJAX call messes up character encoding

Tags: java, jquery, ajax, json, iso-8859-1

Asked by Vivin Paliath

I have a servlet that outputs JSON. The output encoding for the servlet is ISO-8859-1. Pages in our webapp are also set to ISO-8859-1. I would use UTF-8, but this is outside my control; we have to use ISO-8859-1.

When I hit the servlet by itself, I can see JSON data that has been outputted. The character encoding is correct, and none of the characters look strange.

However, when I call the servlet via AJAX and use the retrieved data to populate a select box, I get � in place of (it seems) all characters that have accents (for example, i with a grave or acute accent, dieresis, or circumflex). When I look at the response in the Net tab under Firebug, I can see that the text looks fine. However, when I use that data to populate the select box, I get the diamond with a question mark.

These characters are all valid ISO-8859-1 characters, and so I don't understand why they don't show up correctly.

EDIT

Some more information. I use GET in jQuery.ajax and I've set scriptCharset to ISO-8859-1. On the server side, I've explicitly set the encoding to ISO-8859-1 using request.setCharacterEncoding("ISO-8859-1");

EDIT

Code samples:

This is what I have currently. I added scriptCharset: "ISO-8859-1" to no effect.

        jQuery.ajax({
            url: "/countryAndProvinceCodeServlet",
            data: data,
            dataType: "json",
            type: "GET",
            success: function(data) {
               ...
            }
        });

My servlet uses org.json.JSONObject and simply outputs the string by doing response.getWriter().print(jsonObject.toString());

UPDATE

Per the comments about JSON and how it should be UTF-8, I tried to see if I could grab the data as text (so I set dataType to text in jQuery.ajax) and then evaluate it as JSON myself (in JavaScript). That doesn't seem to work either! When I do console.log, I still get the funky diamonds. However, when I look at it under the Net tab in Firebug, everything shows up fine:

Net tab:

{"error":false,
 "provinces":{"DZ-01":"Adrar",
              "DZ-16":"Alger",
              "DZ-23":"Annaba",
              "DZ-44":"Aïn Defla",
              "DZ-46":"Aïn Témouchent",
              "DZ-05":"Batna",
              "DZ-07":"Biskra",
              "DZ-09":"Blida",
              "DZ-34":"Bordj Bou Arréridj",
              "DZ-10":"Bouira",
              "DZ-35":"Boumerdès",
              "DZ-08":"Béchar",
              "DZ-06":"Béjaïa",
              "DZ-02":"Chlef",
              "DZ-25":"Constantine",
              "DZ-17":"Djelfa",
              "DZ-32":"El Bayadh",
              "DZ-39":"El Oued",
              "DZ-36":"El Tarf",
              "DZ-47":"Ghardaïa",
              "DZ-24":"Guelma",
              "DZ-33":"Illizi",
              "DZ-18":"Jijel",
              "DZ-40":"Khenchela",
              "DZ-03":"Laghouat",
              "DZ-29":"Mascara",
              "DZ-43":"Mila",
              "DZ-27":"Mostaganem",
              "DZ-28":"Msila",
              "DZ-26":"Médéa",
              "DZ-45":"Naama",
              "DZ-31":"Oran",
              "DZ-30":"Ouargla",
              "DZ-04":"Oum el Bouaghi",
              "DZ-48":"Relizane",
              "DZ-20":"Saïda",
              "DZ-22":"Sidi Bel Abbès",
              "DZ-21":"Skikda",
              "DZ-41":"Souk Ahras",
              "DZ-19":"Sétif",
              "DZ-11":"Tamanghasset",
              "DZ-14":"Tiaret",
              "DZ-37":"Tindouf",
              "DZ-42":"Tipaza",
              "DZ-38":"Tissemsilt",
              "DZ-15":"Tizi Ouzou",
              "DZ-13":"Tlemcen",
              "DZ-12":"Tébessa"}}

But when I do console.log(text) with what I get from jQuery.ajax, I get the following:

{"error":false,
 "provinces":{"DZ-01":"Adrar",
              "DZ-16":"Alger",
              "DZ-23":"Annaba",
              "DZ-44":"A?n Defla",
              "DZ-46":"A?n T?mouchent",
              "DZ-05":"Batna",
              "DZ-07":"Biskra",
              "DZ-09":"Blida",
              "DZ-34":"Bordj Bou Arr?ridj",
              "DZ-10":"Bouira",
              "DZ-35":"Boumerd?s",
              "DZ-08":"B?char",
              "DZ-06":"B?ja?a",
              "DZ-02":"Chlef",
              "DZ-25":"Constantine",
              "DZ-17":"Djelfa",
              "DZ-32":"El Bayadh",
              "DZ-39":"El Oued",
              "DZ-36":"El Tarf",
              "DZ-47":"Gharda?a",
              "DZ-24":"Guelma",
              "DZ-33":"Illizi",
              "DZ-18":"Jijel",
              "DZ-40":"Khenchela",
              "DZ-03":"Laghouat",
              "DZ-29":"Mascara",
              "DZ-43":"Mila",
              "DZ-27":"Mostaganem",
              "DZ-28":"Msila",
              "DZ-26":"M?d?a",
              "DZ-45":"Naama",
              "DZ-31":"Oran",
              "DZ-30":"Ouargla",
              "DZ-04":"Oum el Bouaghi",
              "DZ-48":"Relizane",
              "DZ-20":"Sa?da",
              "DZ-22":"Sidi Bel Abb?s",
              "DZ-21":"Skikda",
              "DZ-41":"Souk Ahras",
              "DZ-19":"S?tif",
              "DZ-11":"Tamanghasset",
              "DZ-14":"Tiaret",
              "DZ-37":"Tindouf",
              "DZ-42":"Tipaza",
              "DZ-38":"Tissemsilt",
              "DZ-15":"Tizi Ouzou",
              "DZ-13":"Tlemcen",
              "DZ-12":"T?bessa"}}

It seems to me that jQuery is doing something weird with the data.
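
For what it's worth, the symptom above is what you would expect if the ISO-8859-1 bytes were being decoded as UTF-8 somewhere on the client. That is only an assumption about the cause at this point, but the effect is easy to reproduce in plain Java. The class and method names below are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {
    // Encodes the text as ISO-8859-1 and then decodes those bytes as UTF-8,
    // mimicking a client that assumes UTF-8 (an assumption, not a confirmed cause).
    static String decodeLatin1BytesAsUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "B\u00E9char"; // B, e-acute, c, h, a, r
        String garbled = decodeLatin1BytesAsUtf8(original);
        // The lone byte 0xE9 is not a valid UTF-8 sequence, so the decoder
        // substitutes the replacement character U+FFFD (the diamond).
        System.out.println(garbled.equals("B\uFFFDchar"));
        // Pure ASCII names survive because both encodings agree on those bytes.
        System.out.println(decodeLatin1BytesAsUtf8("Adrar").equals("Adrar"));
    }
}
```

This would also explain why names such as "Adrar" come through untouched while anything with an accent does not.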

Accepted answer by Vivin Paliath

I finally figured it out. It's pretty weird!

response.setCharacterEncoding(String) does not work (I don't know whether it's related to my setup or something else). It looks like it sets the character encoding, but for some reason jQuery messes it all up. You have to explicitly set the header like so:

response.setHeader("Content-Type", "application/json; charset=ISO-8859-1");

Thanks for all the help, everyone!

EDIT

I did some research and checked out the JavaDocs and saw this:

Containers must communicate the character encoding used for the servlet response's writer to the client if the protocol provides a way for doing so. In the case of HTTP, the character encoding is communicated as part of the Content-Type header for text media types. Note that the character encoding cannot be communicated via HTTP headers if the servlet does not specify a content type; however, it is still used to encode text written via the servlet response's writer.

So the above still works, but you can also (and probably should) do this:

response.setContentType("application/json");
response.setCharacterEncoding("ISO-8859-1"); 
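
To illustrate why the charset in the Content-Type header matters, here is a rough sketch of what a client might do with it. This is not jQuery's or the browser's actual code; charsetOf and decodeBody are hypothetical helpers, and the UTF-8 fallback is an assumption about the client's default:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class ContentTypeCharsetDemo {
    // Hypothetical helper: extract the charset parameter from a Content-Type
    // value, falling back to UTF-8 when none is declared (assumed default).
    static Charset charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                return Charset.forName(name);
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Decodes the response body with whatever charset the header announces.
    static String decodeBody(byte[] body, String contentType) {
        return new String(body, charsetOf(contentType));
    }

    public static void main(String[] args) {
        // The servlet writer encodes the text as ISO-8859-1 in both cases.
        byte[] body = "S\u00E9tif".getBytes(StandardCharsets.ISO_8859_1);
        // With the charset declared, the text round-trips intact.
        System.out.println(decodeBody(body, "application/json; charset=ISO-8859-1"));
        // Without it, this sketch falls back to UTF-8 and the accent is lost.
        System.out.println(decodeBody(body, "application/json"));
    }
}
```

The bytes on the wire are identical in both calls; only the header decides how they are read.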

Answer by Dave Jarvis

Can you use UTF-8, instead?

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

In PHP, you can encode JSON data as UTF-8:

/**
 * Applies a UTF-8 encoding conversion for text.
 */
function utf8_enc( $rows ) {
  $encoded = array();

  foreach( $rows as $row ) {
    $temp = array();

    foreach( $row as $name => $value ) {
      // mb_convert_encoding takes the target encoding first, then the source.
      $temp[ $name ] = mb_convert_encoding( $value, 'UTF-8', 'auto' );
    }

    array_push( $encoded, $temp );
  }

  return $encoded;
}

function db_json( $query ) {
  echo json_encode( utf8_enc( db_fetch_all( db_query( $query ) ) ) );
}

I was seeing some strange results using the ISO-8859-1 accented character set. I switched to UTF-8 and the encoding problems disappeared.
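
If the data can be converted on the Java side instead, the equivalent transcoding step is short. This is a minimal sketch that assumes the incoming bytes really are ISO-8859-1; the class and method names are illustrative only:

```java
import java.nio.charset.StandardCharsets;

final class Latin1ToUtf8 {
    // Decodes the bytes as ISO-8859-1, then re-encodes the text as UTF-8.
    static byte[] transcode(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = "M\u00E9d\u00E9a".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = transcode(latin1);
        // Each accented letter takes one byte in ISO-8859-1 and two in UTF-8.
        System.out.println(latin1.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```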

For what it's worth, I have coded getJSON as follows:

  $.getJSON( HOST + 'cat.dhtml', function( data ) {
    var h = '';
    var len = data.length;

    for( var i = 0; i < len; i++ ) {
      h += '<option value="' + data[i].id + '">' + data[i].name + '</option>';
      categories[ data[i].id ] = data[i];
    }

    $('#category').html(h);
  });

Answer by phixr

The PHP function json_encode does not support ISO-8859-1 encoded data; it expects UTF-8 input.

This article might help you with your problem: http://www.pabloviquez.com/2009/07/json-iso-8859-1-and-utf-8-%E2%80%93-part2/

Answer by mogsie

RFC 4627 states that JSON text SHALL be encoded in Unicode, whatever that means, and json.org indicates that all characters be "unicode characters":

  • Encoding

    JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.

    Since the first two characters of a JSON text will always be ASCII characters [RFC0020], it is possible to determine whether an octet stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking at the pattern of nulls in the first four octets.

       00 00 00 xx  UTF-32BE
       00 xx 00 xx  UTF-16BE
       xx 00 00 00  UTF-32LE
       xx 00 xx 00  UTF-16LE
       xx xx xx xx  UTF-8
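
The table above translates fairly directly into code. The following is only a sketch of the sniffing rule as the RFC describes it (the sniff helper is hypothetical, not part of any JSON library):

```java
final class JsonEncodingSniffer {
    // Hypothetical helper: guess the Unicode encoding of a JSON text from the
    // pattern of null bytes in its first four octets, per the RFC 4627 table.
    static String sniff(byte[] b) {
        if (b.length >= 4) {
            boolean z0 = b[0] == 0, z1 = b[1] == 0, z2 = b[2] == 0, z3 = b[3] == 0;
            if (z0 && z1 && z2 && !z3) return "UTF-32BE";
            if (z0 && !z1 && z2 && !z3) return "UTF-16BE";
            if (!z0 && z1 && z2 && z3) return "UTF-32LE";
            if (!z0 && z1 && !z2 && z3) return "UTF-16LE";
        }
        return "UTF-8"; // no null pattern matched
    }

    public static void main(String[] args) {
        byte[] utf16le = {0x7B, 0x00, 0x22, 0x00}; // the two characters {" in UTF-16LE
        byte[] latin1 = {0x7B, 0x22, 0x61, 0x22};  // the four characters {"a" in a single-byte encoding
        System.out.println(sniff(utf16le));
        System.out.println(sniff(latin1));
    }
}
```

Note that the rule has no slot for ISO-8859-1: a single-byte encoding contains no nulls, so a sniffer like this one treats it as UTF-8.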
    
So if you're transferring JSON and saying that it's ISO-8859-1, then different JSON libraries may interpret the SHALL clause from the RFC that defines JSON in various ways, e.g. by emitting the replacement character or by sniffing the encoding. The best way is obviously to take this to whoever is outside your control and tell them to fix it :-)

Workarounds

One way to work around it is to create a servlet filter that finds all characters that are not encoded identically in UTF-8 and ISO-8859-1 (that is, everything outside 7-bit ASCII) and replaces them with JSON escapes:

In the following fragment, 'é' is replaced with '\u00E9' so that any offending ISO-8859-1 character is safely transported in the 7 bits that are identical in both encodings:

Before: { "a" : "éte" }

After: { "a" : "\u00E9te" }

It's not as legible, but semantically speaking, it's the same, and any good JSON library should treat them identically.
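
The escaping step itself could look something like the sketch below. It only shows the string transformation, not the servlet filter wiring, and escapeNonAscii is a made-up name:

```java
final class JsonAsciiEscaper {
    // Replaces every character outside 7-bit ASCII with a six-character JSON
    // escape (backslash, u, four hex digits). ASCII passes through unchanged.
    static String escapeNonAscii(String json) {
        StringBuilder out = new StringBuilder(json.length());
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input contains e-acute; output contains only ASCII characters.
        System.out.println(escapeNonAscii("{ \"a\" : \"\u00E9te\" }"));
    }
}
```

Because the output is pure ASCII, it reads the same whether the client decodes it as ISO-8859-1 or as UTF-8. Non-ASCII characters can only legally appear inside JSON strings, so escaping the whole document this way should be safe.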

Answer by Oleg

It seems to me you receive a parsing error because the response data is decoded incorrectly and so contains some wrong characters.

You could try adding an additional parameter to the jQuery.ajax call:

dataFilter : function ( data, type ) {
    alert(data);
    return data;
}

If you see wrong but different characters for each of the non-ASCII characters ('ï', 'é' and so on), you can try to replace the wrongly encoded characters with the correct ones in dataFilter and return the correctly encoded data.

Answer by Jamal Azizbeigi

If you retrieve the data from a database, you should put the statements below in the page that receives the AJAX request. For example, if you write the HTML and AJAX code in page "A" and send variables from that code to page "B", put these statements in page "B".
Don't forget that your database should use a Unicode collation such as "utf8_general_ci".

mysqli_query ($conn,"set character_set_client='utf8'");
mysqli_query ($conn,"set character_set_results='utf8'");
mysqli_query ($conn,"set collation_connection='utf8_general_ci'");
mysqli_query($conn,"set collation_connection='utf8_persian_ci'");
mysqli_set_charset($conn, "utf8"); // takes a charset name, not a SET statement

I wrote these statements for the Persian language; you can modify them. $conn is the variable that holds the connection to the MySQL database.
