Uint8Array to string in JavaScript

Question

Asked by Hyman Wester

I have some UTF-8 encoded data living in a range of Uint8Array elements in JavaScript. Is there an efficient way to decode these into a regular JavaScript string (I believe JavaScript uses 16-bit Unicode)? I don't want to add one character at a time, as the string concatenation would become too CPU intensive.

Answer 1

Answered by Vincent Scheib

TextEncoder and TextDecoder from the Encoding standard, which is polyfilled by the stringencoding library, convert between strings and ArrayBuffers:

var uint8array = new TextEncoder("utf-8").encode("￠");
var string = new TextDecoder("utf-8").decode(uint8array);
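Since the question mentions data living in a range of elements, one possible approach is to decode a view onto just that range. This is a minimal sketch, assuming the offsets fall on character boundaries; the names decodeRange, start, and end are illustrative and not part of the answer:

```javascript
// Hypothetical helper: decode bytes[start..end) as UTF-8 without copying.
function decodeRange(bytes, start, end) {
    // subarray() returns a view on the same underlying buffer; no bytes are copied.
    return new TextDecoder("utf-8").decode(bytes.subarray(start, end));
}

// "h\u00e9llo w\u00f6rld" is "héllo wörld"; each accented letter takes two bytes in UTF-8.
var encoded = new TextEncoder().encode("h\u00e9llo w\u00f6rld");
var firstWord = decodeRange(encoded, 0, 6);               // the 6 bytes of "héllo"
var secondWord = decodeRange(encoded, 7, encoded.length); // the bytes after the space
```

If a range might split a multi-byte character, the boundaries would need to be adjusted first; that case is not handled here.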

Answer 2

Answered by Albert

This should work:

// http://www.onicos.com/staff/iz/amuse/javascript/expert/utf.txt

/* utf.js - UTF-8 <=> UTF-16 convertion
 *
 * Copyright (C) 1999 Masanao Izumo <[email protected]>
 * Version: 1.0
 * LastModified: Dec 25 1999
 * This library is free.  You can redistribute it and/or modify it.
 */

function Utf8ArrayToStr(array) {
    var out, i, len, c;
    var char2, char3;

    out = "";
    len = array.length;
    i = 0;
    while (i < len) {
        c = array[i++];
        switch (c >> 4) {
            case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
                // 0xxxxxxx
                out += String.fromCharCode(c);
                break;
            case 12: case 13:
                // 110x xxxx   10xx xxxx
                char2 = array[i++];
                out += String.fromCharCode(((c & 0x1F) << 6) | (char2 & 0x3F));
                break;
            case 14:
                // 1110 xxxx  10xx xxxx  10xx xxxx
                char2 = array[i++];
                char3 = array[i++];
                out += String.fromCharCode(((c & 0x0F) << 12) |
                                           ((char2 & 0x3F) << 6) |
                                           ((char3 & 0x3F) << 0));
                break;
        }
    }

    return out;
}

It's somewhat cleaner than the other solutions because it doesn't use any hacks and doesn't depend on browser JS functions, so it also works in other JS environments.

Check out the JSFiddle demo.

Also see the related questions: here and here.

Answer 3

Answered by dlchambers

Here's what I use:

var str = String.fromCharCode.apply(null, uint8Arr);
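Two caveats may be worth keeping in mind with this one-liner: it maps each byte to one UTF-16 code unit, so multi-byte UTF-8 sequences are not decoded, and passing a very large array through apply can exceed an engine's argument limit. A hedged sketch of a chunked variant follows; the helper name bytesToBinaryString and the chunk size are assumptions made for illustration:

```javascript
// Hypothetical helper: converts bytes to a string one chunk at a time.
// Each byte becomes one UTF-16 code unit, so this is only correct for
// ASCII/Latin-1 data; multi-byte UTF-8 sequences are NOT decoded.
function bytesToBinaryString(bytes, chunkSize) {
    chunkSize = chunkSize || 0x8000; // assumed size, chosen to stay under typical argument limits
    var parts = [];
    for (var i = 0; i < bytes.length; i += chunkSize) {
        // subarray() clamps the end index, so the last chunk may be shorter.
        parts.push(String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize)));
    }
    return parts.join("");
}

var greeting = bytesToBinaryString(new Uint8Array([72, 105, 33])); // bytes for "Hi!"
var large = bytesToBinaryString(new Uint8Array(200000).fill(97));  // 200000 copies of "a"
```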

Answer 4

Answered by Will Scott

Found in one of the Chrome sample applications, although this is meant for larger blocks of data where you're okay with an asynchronous conversion.

/**
 * Converts an array buffer to a string
 *
 * @private
 * @param {ArrayBuffer} buf The buffer to convert
 * @param {Function} callback The function to call when conversion is complete
 */
function _arrayBufferToString(buf, callback) {
  var bb = new Blob([new Uint8Array(buf)]);
  var f = new FileReader();
  f.onload = function(e) {
    callback(e.target.result);
  };
  f.readAsText(bb);
}

Answer 5

Answered by kpowz

In Node, "Buffer instances are also Uint8Array instances", so buf.toString() works in this case.
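Note that calling toString() on a plain Uint8Array yields comma-separated numbers, so a Uint8Array that is not already a Buffer needs wrapping first. A minimal sketch, assuming a Node.js environment where Buffer is a global; the variable names are illustrative:

```javascript
// Assumes Node.js, where Buffer is available as a global.
var euroBytes = new Uint8Array([0xE2, 0x82, 0xAC, 0x31, 0x30]); // UTF-8 bytes for "€10"

// Wrap the same memory in a Buffer (no copy), honoring the view's offset and length.
var buf = Buffer.from(euroBytes.buffer, euroBytes.byteOffset, euroBytes.byteLength);
var text = buf.toString("utf8");

// For comparison: a plain Uint8Array stringifies to its numeric elements.
var plain = euroBytes.toString();
```

Passing byteOffset and byteLength matters when the Uint8Array is itself a view onto a larger ArrayBuffer.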

Answer 6

Answered by Bob Arlof

The solution given by Albert works well as long as the provided function is invoked infrequently and is only used for arrays of modest size; otherwise it is egregiously inefficient. Here is an enhanced vanilla JavaScript solution that works for both Node and browsers and has the following advantages:

- Works efficiently for all octet array sizes
- Generates no intermediate throw-away strings
- Supports 4-byte characters on modern JS engines (otherwise "?" is substituted)

var utf8ArrayToStr = (function () {
    var charCache = new Array(128);  // Preallocate the cache for the common single byte chars
    var charFromCodePt = String.fromCodePoint || String.fromCharCode;
    var result = [];

    return function (array) {
        var codePt, byte1;
        var buffLen = array.length;

        result.length = 0;

        for (var i = 0; i < buffLen;) {
            byte1 = array[i++];

            if (byte1 <= 0x7F) {
                codePt = byte1;
            } else if (byte1 <= 0xDF) {
                codePt = ((byte1 & 0x1F) << 6) | (array[i++] & 0x3F);
            } else if (byte1 <= 0xEF) {
                codePt = ((byte1 & 0x0F) << 12) | ((array[i++] & 0x3F) << 6) | (array[i++] & 0x3F);
            } else if (String.fromCodePoint) {
                codePt = ((byte1 & 0x07) << 18) | ((array[i++] & 0x3F) << 12) | ((array[i++] & 0x3F) << 6) | (array[i++] & 0x3F);
            } else {
                codePt = 63;    // Cannot convert four byte code points, so use "?" instead
                i += 3;
            }

            result.push(charCache[codePt] || (charCache[codePt] = charFromCodePt(codePt)));
        }

        return result.join('');
    };
})();

Answer 7

Answered by shuki

Do what @Sudhir said, and then, to get a string out of the comma-separated list of numbers, use:

var myString = "";
for (var i = 0; i < unitArr.byteLength; i++) {
    myString += String.fromCharCode(unitArr[i]);
}

This will give you the string you want, if it's still relevant.

Answer 8

Answered by serdarsenay

Try these functions:

var JsonToArray = function(json)
{
    var str = JSON.stringify(json, null, 0);
    var ret = new Uint8Array(str.length);
    for (var i = 0; i < str.length; i++) {
        ret[i] = str.charCodeAt(i);
    }
    return ret
};

var binArrayToJson = function(binArray)
{
    var str = "";
    for (var i = 0; i < binArray.length; i++) {
        str += String.fromCharCode(parseInt(binArray[i]));
    }
    return JSON.parse(str)
}

source: https://gist.github.com/tomfa/706d10fed78c497731ac, kudos to Tomfa


Answer 9

Answered by simbo1905

I was frustrated to see that people were not showing how to go both ways, or showing that things work on non-trivial UTF-8 strings. I found a post on codereview.stackexchange.com that has some code that works well. I used it to turn ancient runes into bytes, to test some crypto on the bytes, then convert things back into a string. The working code is on GitHub here. I renamed the methods for clarity:

// https://codereview.stackexchange.com/a/3589/75693
function bytesToString(bytes) {
    var chars = [];
    for(var i = 0, n = bytes.length; i < n;) {
        chars.push(((bytes[i++] & 0xff) << 8) | (bytes[i++] & 0xff));
    }
    return String.fromCharCode.apply(null, chars);
}

// https://codereview.stackexchange.com/a/3589/75693
function stringToBytes(str) {
    var bytes = [];
    for(var i = 0, n = str.length; i < n; i++) {
        var char = str.charCodeAt(i);
        bytes.push(char >>> 8, char & 0xFF);
    }
    return bytes;
}

The unit test uses this UTF-8 string:


    // http://kermitproject.org/utf8.html
    // From the Anglo-Saxon Rune Poem (Rune version) 
    const secretUtf8 = `?????????????????????????????
?????????????????????????????????????????
?????????????????????????????????????`;

Note that the string length is only 117 characters but the byte length, when encoded, is 234.
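The two-to-one ratio appears to follow from stringToBytes writing two bytes per UTF-16 code unit, which is a big-endian UTF-16 layout rather than UTF-8. The sketch below compares the two encodings on a short runic sample; the sample string and the name utf16Bytes are chosen for illustration and are not taken from the unit test:

```javascript
// Same logic as stringToBytes above: two bytes (high, low) per UTF-16 code unit.
function utf16Bytes(str) {
    var bytes = [];
    for (var i = 0; i < str.length; i++) {
        var unit = str.charCodeAt(i);
        bytes.push(unit >>> 8, unit & 0xFF);
    }
    return bytes;
}

var sample = "\u16A0\u16C7\u16BB"; // three letters from the Runic block, for illustration
var utf16Length = utf16Bytes(sample).length;              // two bytes per code unit
var utf8Length = new TextEncoder().encode(sample).length; // three bytes per rune in UTF-8
```

So the round trip shown in this answer is lossless, but the intermediate bytes would not be readable by a UTF-8 decoder such as TextDecoder("utf-8").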

If I uncomment the console.log lines I can see that the decoded string is the same string that was encoded (with the bytes passed through Shamir's secret sharing algorithm!).

Answer 10

Answered by Rosberg Linhares

If you can't use the TextDecoder API because it is not supported on IE:

1. You can use the FastestSmallestTextEncoderDecoder polyfill recommended by the Mozilla Developer Network website;
2. You can use this function, also provided on the MDN website:

function utf8ArrayToString(aBytes) {
    var sView = "";
    
    for (var nPart, nLen = aBytes.length, nIdx = 0; nIdx < nLen; nIdx++) {
        nPart = aBytes[nIdx];
        
        sView += String.fromCharCode(
            nPart > 251 && nPart < 254 && nIdx + 5 < nLen ? /* six bytes */
                /* (nPart - 252 << 30) may be not so safe in ECMAScript! So...: */
                (nPart - 252) * 1073741824 + (aBytes[++nIdx] - 128 << 24) + (aBytes[++nIdx] - 128 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 247 && nPart < 252 && nIdx + 4 < nLen ? /* five bytes */
                (nPart - 248 << 24) + (aBytes[++nIdx] - 128 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 239 && nPart < 248 && nIdx + 3 < nLen ? /* four bytes */
                (nPart - 240 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 223 && nPart < 240 && nIdx + 2 < nLen ? /* three bytes */
                (nPart - 224 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 191 && nPart < 224 && nIdx + 1 < nLen ? /* two bytes */
                (nPart - 192 << 6) + aBytes[++nIdx] - 128
            : /* nPart < 127 ? */ /* one byte */
                nPart
        );
    }
    
    return sView;
}

let str = utf8ArrayToString([50,72,226,130,130,32,43,32,79,226,130,130,32,226,135,140,32,50,72,226,130,130,79]);

// Must show 2H₂ + O₂ ⇌ 2H₂O
console.log(str);
