Original question: http://stackoverflow.com/questions/22861828/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
java string.getBytes("UTF-8") javascript equivalent
Asked by Wesley
I have this string in java:
String plaintext = "test.message";
byte[] bytes = plaintext.getBytes("UTF-8");
//result: [116, 101, 115, 116, 46, 109, 101, 115, 115, 97, 103, 101]
If I do the same thing in javascript:
stringToByteArray: function (str) {
    str = unescape(encodeURIComponent(str));
    var bytes = new Array(str.length);
    for (var i = 0; i < str.length; ++i)
        bytes[i] = str.charCodeAt(i);
    return bytes;
},
I get:
[7,163,140,72,178,72,244,241,149,43,67,124]
I was under the impression that the unescape(encodeURIComponent()) would correctly translate the string to UTF-8. Is this not the case?
Reference:
http://ecmanaut.blogspot.be/2006/07/encoding-decoding-utf8-in-javascript.html
Accepted answer by Paul S.
JavaScript has no concept of character encoding for String; everything is in UTF-16. Most of the time the value of a char in UTF-16 matches UTF-8 (this holds for ASCII characters such as the ones in your string), so you can forget it's any different.
There are more optimal ways to do this, but:
function s(x) {return x.charCodeAt(0);}
"test.message".split('').map(s);
// [116, 101, 115, 116, 46, 109, 101, 115, 115, 97, 103, 101]
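To illustrate where "most of the time" stops applying, here is a minimal sketch. The helper name codeUnits is made up for this example; it only wraps the charCodeAt approach above. For ASCII input the code units equal the UTF-8 bytes, but for a non-ASCII character such as é (U+00E9) they differ.

```javascript
// Hypothetical helper (not from the original answer): returns one number
// per UTF-16 code unit, using the same charCodeAt approach as above.
function codeUnits(str) {
  return str.split('').map(function (ch) { return ch.charCodeAt(0); });
}

// ASCII only: code units match what Java's getBytes("UTF-8") returns.
console.log(codeUnits("test.message"));

// Non-ASCII: a single code unit, 233, whereas UTF-8 encodes é as two
// bytes, 195 and 169.
console.log(codeUnits("\u00e9")); // [233]
```

So this shortcut is only safe when the string is known to contain ASCII characters.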
So what is unescape(encodeURIComponent(str)) doing? Let's look at each individually:
encodeURIComponent converts every character in str which is illegal or has a meaning in URI syntax into a URI-escaped version, so that there is no problem using it as a key or value in the search component of a URI, for example
encodeURIComponent('&='); // "%26%3D"
Notice how this is now a 6 character long String.
unescape is actually deprecated, but it does a similar job to decodeURI or decodeURIComponent (the reverse of encodeURIComponent). If we look in the ES5 spec we can see
11. Let c be the character whose code unit value is the integer represented by the four hexadecimal digits at positions k+2, k+3, k+4, and k+5 within Result(1).
So, 4 digits is 2 bytes is "UTF-8"; however, as I mentioned, all Strings are UTF-16, so it's really a UTF-16 string limiting itself to UTF-8.
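As a rough sketch of the two steps combined (the function name utf8Bytes is hypothetical, and this assumes an environment where the deprecated global unescape is still available): encodeURIComponent percent-encodes the UTF-8 bytes of non-ASCII characters, and unescape turns each %XX back into one character whose code is that byte value.

```javascript
// Hypothetical helper: UTF-8 byte values of a string via the
// unescape(encodeURIComponent(...)) trick discussed above.
function utf8Bytes(str) {
  // Each character of `binary` has a code in the range 0..255.
  var binary = unescape(encodeURIComponent(str));
  var bytes = [];
  for (var i = 0; i < binary.length; ++i) {
    bytes.push(binary.charCodeAt(i));
  }
  return bytes;
}

console.log(encodeURIComponent("\u00e9")); // "%C3%A9"
console.log(utf8Bytes("\u00e9"));          // [195, 169]
console.log(utf8Bytes("test.message"));    // ASCII passes through unchanged
```

For "test.message" this should give the same twelve values as the Java code, so the unexpected numbers in the question may have come from a different input string rather than from the conversion itself.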
Answer by Kevin Hakanson
You can use TextEncoder, which is part of the Encoding Living Standard. According to the Encoding API entry from the Chromium Dashboard, it shipped in Firefox and will ship in Chrome 38. There is also a text-encoding polyfill available.
The JavaScript code sample below returns a Uint8Array filled with the values you expect.
var s = "test.message";
var encoder = new TextEncoder();
encoder.encode(s);
// [116, 101, 115, 116, 46, 109, 101, 115, 115, 97, 103, 101]
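If a plain Array is wanted rather than a Uint8Array, a small wrapper along these lines might help. The name toUtf8Array is made up for this sketch, and it assumes an environment that provides both TextEncoder and Array.from.

```javascript
// Hypothetical wrapper: UTF-8 encode a string and copy the resulting
// Uint8Array into an ordinary Array of numbers.
function toUtf8Array(str) {
  return Array.from(new TextEncoder().encode(str));
}

console.log(toUtf8Array("test.message"));
console.log(toUtf8Array("\u00e9")); // [195, 169]
```

Unlike the charCodeAt approach, this also produces the multi-byte sequences for non-ASCII characters, which is what Java's getBytes("UTF-8") does.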