Original question: http://stackoverflow.com/questions/8936984/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Uint8Array to string in Javascript
Asked by Hyman Wester
I have some UTF-8 encoded data living in a range of Uint8Array elements in JavaScript. Is there an efficient way to decode these into a regular JavaScript string (I believe JavaScript uses 16-bit Unicode)? I don't want to add one character at a time, as the string concatenation would become too CPU intensive.
Answered by Vincent Scheib
TextEncoder and TextDecoder from the Encoding standard, which is polyfilled by the stringencoding library, convert between strings and ArrayBuffers:
var uint8array = new TextEncoder("utf-8").encode("¢");
var string = new TextDecoder("utf-8").decode(uint8array);
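A slightly fuller sketch of the same round trip, assuming an environment where TextEncoder and TextDecoder exist as globals (modern browsers, recent Node.js). The `fatal` option, the sample text, and the variable names are illustrative choices and not part of the original answer.

```javascript
// TextEncoder always produces UTF-8, so no encoding label is needed.
const encoder = new TextEncoder();

// With { fatal: true } the decoder throws a TypeError on malformed input
// instead of silently inserting U+FFFD replacement characters.
const decoder = new TextDecoder("utf-8", { fatal: true });

// Sample text mixing 2-byte, 3-byte and 4-byte UTF-8 sequences.
const bytes = encoder.encode("¢ € 😀"); // Uint8Array
const text = decoder.decode(bytes);

// decode() accepts any view, so a sub-range can be decoded without copying.
// Bytes 3..5 hold the three-byte sequence for the euro sign.
const euroOnly = decoder.decode(bytes.subarray(3, 6));
```

Reusing one decoder instance for many calls avoids constructing a new object each time, which may matter if the conversion runs in a hot loop.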
Answered by Albert
This should work:
// http://www.onicos.com/staff/iz/amuse/javascript/expert/utf.txt
/* utf.js - UTF-8 <=> UTF-16 convertion
*
* Copyright (C) 1999 Masanao Izumo <[email protected]>
* Version: 1.0
* LastModified: Dec 25 1999
* This library is free. You can redistribute it and/or modify it.
*/
function Utf8ArrayToStr(array) {
    var out, i, len, c;
    var char2, char3;
    out = "";
    len = array.length;
    i = 0;
    while (i < len) {
        c = array[i++];
        switch (c >> 4) {
            case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
                // 0xxxxxxx
                out += String.fromCharCode(c);
                break;
            case 12: case 13:
                // 110x xxxx 10xx xxxx
                char2 = array[i++];
                out += String.fromCharCode(((c & 0x1F) << 6) | (char2 & 0x3F));
                break;
            case 14:
                // 1110 xxxx 10xx xxxx 10xx xxxx
                char2 = array[i++];
                char3 = array[i++];
                out += String.fromCharCode(((c & 0x0F) << 12) |
                                           ((char2 & 0x3F) << 6) |
                                           ((char3 & 0x3F) << 0));
                break;
        }
    }
    return out;
}
It's somewhat cleaner than the other solutions because it doesn't use any hacks and doesn't depend on browser JS functions, so it also works in other JS environments.
Check out the JSFiddle demo.
Answered by dlchambers
Here's what I use:
var str = String.fromCharCode.apply(null, uint8Arr);
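Two caveats apply to this one-liner. It maps every byte to one UTF-16 code unit, so multi-byte UTF-8 sequences are not decoded, and `apply` with a very large array can exceed the engine's argument limit. A minimal sketch of a chunked variant follows; the name `bytesToBinaryString` and the chunk size of 0x8000 are assumptions made for illustration, not part of the original answer.

```javascript
// Hypothetical helper: converts bytes to a string, one code unit per byte.
// Works in chunks so that apply() never receives a huge argument list.
function bytesToBinaryString(bytes, chunkSize = 0x8000) {
  const parts = [];
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // subarray() creates a view over the same memory, not a copy.
    parts.push(String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize)));
  }
  return parts.join("");
}

// Pure ASCII input comes out as expected.
const ascii = bytesToBinaryString(new Uint8Array([72, 105]));

// A large input is processed chunk by chunk.
const big = bytesToBinaryString(new Uint8Array(200000).fill(65));

// The two UTF-8 bytes of "¢" become two separate characters, not one.
const notDecoded = bytesToBinaryString(new Uint8Array([0xC2, 0xA2]));
```

For real UTF-8 text this is therefore only safe when the data is known to be ASCII.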
Answered by Will Scott
Found in one of the Chrome sample applications, although this is meant for larger blocks of data where you're okay with an asynchronous conversion.
/**
 * Converts an array buffer to a string
 *
 * @private
 * @param {ArrayBuffer} buf The buffer to convert
 * @param {Function} callback The function to call when conversion is complete
 */
function _arrayBufferToString(buf, callback) {
    var bb = new Blob([new Uint8Array(buf)]);
    var f = new FileReader();
    f.onload = function(e) {
        callback(e.target.result);
    };
    f.readAsText(bb);
}
Answered by kpowz
In Node "Buffer instances are also Uint8Array instances", so buf.toString() works in this case.
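This only helps when the value really is a Buffer. On a plain Uint8Array, toString() joins the numbers with commas. A minimal sketch, assuming Node.js (where Buffer is a global) and using illustrative variable names, of wrapping an existing Uint8Array without copying it:

```javascript
// Node.js only: the UTF-8 bytes for "€!".
const euroBytes = new Uint8Array([0xE2, 0x82, 0xAC, 0x21]);

// A plain Uint8Array stringifies as a comma-separated list of numbers.
const joined = euroBytes.toString();

// Buffer.from(arrayBuffer, byteOffset, length) creates a Buffer that shares
// the underlying memory; passing offset and length keeps this correct even
// when the Uint8Array is a view into a larger ArrayBuffer.
const euroText = Buffer
  .from(euroBytes.buffer, euroBytes.byteOffset, euroBytes.byteLength)
  .toString("utf8");
```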
Answered by Bob Arlof
The solution given by Albert works well as long as the provided function is invoked infrequently and is only used for arrays of modest size; otherwise it is egregiously inefficient. Here is an enhanced vanilla JavaScript solution that works for both Node and browsers and has the following advantages:
- Works efficiently for all octet array sizes
- Generates no intermediate throw-away strings
- Supports 4-byte characters on modern JS engines (otherwise "?" is substituted)
var utf8ArrayToStr = (function () {
    var charCache = new Array(128); // Preallocate the cache for the common single byte chars
    var charFromCodePt = String.fromCodePoint || String.fromCharCode;
    var result = [];
    return function (array) {
        var codePt, byte1;
        var buffLen = array.length;
        result.length = 0;
        for (var i = 0; i < buffLen;) {
            byte1 = array[i++];
            if (byte1 <= 0x7F) {
                codePt = byte1;
            } else if (byte1 <= 0xDF) {
                codePt = ((byte1 & 0x1F) << 6) | (array[i++] & 0x3F);
            } else if (byte1 <= 0xEF) {
                codePt = ((byte1 & 0x0F) << 12) | ((array[i++] & 0x3F) << 6) | (array[i++] & 0x3F);
            } else if (String.fromCodePoint) {
                codePt = ((byte1 & 0x07) << 18) | ((array[i++] & 0x3F) << 12) | ((array[i++] & 0x3F) << 6) | (array[i++] & 0x3F);
            } else {
                codePt = 63; // Cannot convert four byte code points, so use "?" instead
                i += 3;
            }
            result.push(charCache[codePt] || (charCache[codePt] = charFromCodePt(codePt)));
        }
        return result.join('');
    };
})();
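The fallback above substitutes "?" for four-byte sequences when String.fromCodePoint is missing. One possible alternative, not part of the original answer, is to build the UTF-16 surrogate pair by hand. The helper name `codePointToString` is hypothetical.

```javascript
// Hypothetical helper: turns a Unicode code point into a JS string
// using only String.fromCharCode.
function codePointToString(codePt) {
  if (codePt <= 0xFFFF) {
    // Code points in the Basic Multilingual Plane fit in one code unit.
    return String.fromCharCode(codePt);
  }
  // Supplementary code points are split into a 20-bit offset:
  // the top 10 bits go into the high surrogate, the low 10 bits
  // into the low surrogate.
  const offset = codePt - 0x10000;
  const high = 0xD800 + (offset >> 10);
  const low = 0xDC00 + (offset & 0x3FF);
  return String.fromCharCode(high, low);
}

// U+1F600 is encoded as four bytes in UTF-8 and two code units in UTF-16.
const grin = codePointToString(0x1F600);
```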
Answered by shuki
Do what @Sudhir said, and then to get a string out of the comma-separated list of numbers use:
var myString = ""; // start from an empty string before appending
for (var i = 0; i < unitArr.byteLength; i++) {
    myString += String.fromCharCode(unitArr[i]);
}
This will give you the string you want, if it's still relevant.
Answered by serdarsenay
Try these functions:
var JsonToArray = function(json)
{
    var str = JSON.stringify(json, null, 0);
    var ret = new Uint8Array(str.length);
    for (var i = 0; i < str.length; i++) {
        ret[i] = str.charCodeAt(i);
    }
    return ret
};
var binArrayToJson = function(binArray)
{
    var str = "";
    for (var i = 0; i < binArray.length; i++) {
        str += String.fromCharCode(parseInt(binArray[i]));
    }
    return JSON.parse(str)
}
source: https://gist.github.com/tomfa/706d10fed78c497731ac, kudos to Tomfa
Answered by simbo1905
I was frustrated to see that people were not showing how to go both ways or showing that things work on non-trivial UTF8 strings. I found a post on codereview.stackexchange.com that has some code that works well. I used it to turn ancient runes into bytes, to test some crypto on the bytes, then convert things back into a string. The working code is on GitHub here. I renamed the methods for clarity:
// https://codereview.stackexchange.com/a/3589/75693
function bytesToString(bytes) {
    var chars = [];
    for (var i = 0, n = bytes.length; i < n;) {
        // Combine each pair of bytes (high byte first) into one 16-bit code unit
        chars.push(((bytes[i++] & 0xff) << 8) | (bytes[i++] & 0xff));
    }
    return String.fromCharCode.apply(null, chars);
}
// https://codereview.stackexchange.com/a/3589/75693
function stringToBytes(str) {
    var bytes = [];
    for (var i = 0, n = str.length; i < n; i++) {
        var char = str.charCodeAt(i);
        // Split each 16-bit code unit into a high byte and a low byte
        bytes.push(char >>> 8, char & 0xFF);
    }
    return bytes;
}
The unit test uses this UTF-8 string:
// http://kermitproject.org/utf8.html
// From the Anglo-Saxon Rune Poem (Rune version)
const secretUtf8 = `ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ
ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ
ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬`;
Note that the string length is only 117 characters but the byte length, when encoded, is 234.
If I uncomment the console.log lines I can see that the string that is decoded is the same string that was encoded (with the bytes passed through Shamir's secret sharing algorithm!).
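The two-bytes-per-character ratio follows from how the functions work: they store each UTF-16 code unit as a big-endian byte pair, so the bytes are not UTF-8. A minimal round-trip check under that reading is sketched below. The two functions are restated so the snippet is self-contained (with the decoder spelled `bytesToString`), and the three-rune sample and variable names are assumptions chosen for illustration.

```javascript
// Restated from the answer: one 16-bit code unit becomes two bytes, high byte first.
function stringToBytes(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    bytes.push(unit >>> 8, unit & 0xFF);
  }
  return bytes;
}

// Restated from the answer: two bytes become one 16-bit code unit.
function bytesToString(bytes) {
  const units = [];
  for (let i = 0; i < bytes.length; i += 2) {
    units.push(((bytes[i] & 0xFF) << 8) | (bytes[i + 1] & 0xFF));
  }
  return String.fromCharCode.apply(null, units);
}

// Three runic letters (U+16A0, U+16C7, U+16BB), written as escapes.
const sample = "\u16A0\u16C7\u16BB";

const utf16Bytes = stringToBytes(sample);     // 2 bytes per character
const roundTripped = bytesToString(utf16Bytes);

// For comparison: real UTF-8 needs 3 bytes for each of these characters.
const utf8Bytes = new TextEncoder().encode(sample);
```

The round trip is lossless either way. The difference only matters when the bytes are handed to something that expects actual UTF-8.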
Answered by Rosberg Linhares
If you can't use the TextDecoder API because it is not supported on IE:
- You can use the FastestSmallestTextEncoderDecoder polyfill recommended by the Mozilla Developer Network website;
- You can also use this function, provided on the MDN website:
function utf8ArrayToString(aBytes) {
    var sView = "";
    for (var nPart, nLen = aBytes.length, nIdx = 0; nIdx < nLen; nIdx++) {
        nPart = aBytes[nIdx];
        sView += String.fromCharCode(
            nPart > 251 && nPart < 254 && nIdx + 5 < nLen ? /* six bytes */
                /* (nPart - 252 << 30) may be not so safe in ECMAScript! So...: */
                (nPart - 252) * 1073741824 + (aBytes[++nIdx] - 128 << 24) + (aBytes[++nIdx] - 128 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 247 && nPart < 252 && nIdx + 4 < nLen ? /* five bytes */
                (nPart - 248 << 24) + (aBytes[++nIdx] - 128 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 239 && nPart < 248 && nIdx + 3 < nLen ? /* four bytes */
                (nPart - 240 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 223 && nPart < 240 && nIdx + 2 < nLen ? /* three bytes */
                (nPart - 224 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 191 && nPart < 224 && nIdx + 1 < nLen ? /* two bytes */
                (nPart - 192 << 6) + aBytes[++nIdx] - 128
            : /* nPart < 127 ? */ /* one byte */
                nPart
        );
    }
    return sView;
}
let str = utf8ArrayToString([50,72,226,130,130,32,43,32,79,226,130,130,32,226,135,140,32,50,72,226,130,130,79]);
// Must show 2H₂ + O₂ ⇌ 2H₂O
console.log(str);