Using JavaScript's atob to decode base64 doesn't properly decode UTF-8 strings

Question

Asked by brandonscript

I'm using the JavaScript window.atob() function to decode a base64-encoded string (specifically the base64-encoded content from the GitHub API). The problem is that I'm getting ASCII-encoded characters back (like â¢ instead of ™). How can I properly handle the incoming base64-encoded stream so that it's decoded as UTF-8?

Answer 1

Answered by brandonscript

There's a great article on Mozilla's MDN docs that describes exactly this issue:

The "Unicode Problem": Since DOMStrings are 16-bit-encoded strings, in most browsers calling window.btoa on a Unicode string will cause a Character Out Of Range exception if a character exceeds the range of an 8-bit byte (0x00~0xFF). There are two possible methods to solve this problem:
the first one is to escape the whole string (with UTF-8, see encodeURIComponent) and then encode it;
the second one is to convert the UTF-16 DOMString to a UTF-8 array of characters and then encode it.
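
To see the problem concretely, here is a minimal sketch (not taken from the MDN article; the variable names are illustrative). It shows btoa rejecting a character outside 0x00~0xFF, and atob handing back one character per byte rather than decoded UTF-8 text.

```javascript
// btoa only accepts "binary strings": every character must fit in one byte.
let threw = false;
try {
    btoa('\u2713'); // the check mark, U+2713, is outside 0x00-0xFF
} catch (e) {
    threw = true; // browsers report an InvalidCharacterError here
}

// atob has the mirror-image limitation: it returns one character per byte,
// so the three UTF-8 bytes of the check mark come back as three characters.
const binary = atob('4pyT');
const byteValues = Array.from(binary, c => c.charCodeAt(0).toString(16));

console.log(threw);      // true
console.log(byteValues); // [ 'e2', '9c', '93' ]
```

Those three characters are what shows up as mojibake when the result of atob is displayed directly.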

A note on previous solutions: the MDN article originally suggested using unescape and escape to solve the Character Out Of Range exception problem, but they have since been deprecated. Some other answers here have suggested working around this with decodeURIComponent and encodeURIComponent; this has proven to be unreliable and unpredictable. The most recent update to this answer uses modern JavaScript functions to improve speed and modernize the code.

If you're trying to save yourself some time, you could also consider using a library:

js-base64 (NPM, great for Node.js)
base64-js

Encoding UTF8 → base64

function b64EncodeUnicode(str) {
    // first we use encodeURIComponent to get percent-encoded UTF-8,
    // then we convert the percent encodings into raw bytes which
    // can be fed into btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function toSolidBytes(match, p1) {
            return String.fromCharCode('0x' + p1);
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="

Decoding base64 → UTF8

function b64DecodeUnicode(str) {
    // Going backwards: from bytestream, to percent-encoding, to original string.
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"
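
To make the comments in the encoder concrete, the sketch below walks through the same three steps one at a time and prints the intermediate values. The variable names are illustrative and not part of the answer's code.

```javascript
const input = '\u2713 \u00E0 la mode'; // "✓ à la mode"

// Step 1: percent-encoded UTF-8.
const percentEncoded = encodeURIComponent(input);

// Step 2: turn each %XX escape into one character whose code is that byte.
const binaryString = percentEncoded.replace(/%([0-9A-F]{2})/g,
    (match, hex) => String.fromCharCode(parseInt(hex, 16)));

// Step 3: base64-encode the resulting byte string.
const encoded = btoa(binaryString);

console.log(percentEncoded);      // "%E2%9C%93%20%C3%A0%20la%20mode"
console.log(binaryString.length); // 14 (bytes, not characters)
console.log(encoded);             // "4pyTIMOgIGxhIG1vZGU="
```

The input has 11 characters but 14 UTF-8 bytes, which is why the intermediate string is longer than the original.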

The pre-2018 solution (functional, and likely better supported in older browsers, but not up to date)

Here is the previous recommendation, direct from MDN, with some additional TypeScript compatibility via @MA-Maddin:

// Encoding UTF8 → base64

function b64EncodeUnicode(str) {
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
        return String.fromCharCode(parseInt(p1, 16))
    }))
}

b64EncodeUnicode('✓ à la mode') // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n') // "Cg=="

// Decoding base64 → UTF8

function b64DecodeUnicode(str) {
    return decodeURIComponent(Array.prototype.map.call(atob(str), function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2)
    }).join(''))
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU=') // "✓ à la mode"
b64DecodeUnicode('Cg==') // "\n"

The original solution (deprecated)

This used escape and unescape (which are now deprecated, though this still works in all modern browsers):

function utf8_to_b64( str ) {
    return window.btoa(unescape(encodeURIComponent( str )));
}

function b64_to_utf8( str ) {
    return decodeURIComponent(escape(window.atob( str )));
}

// Usage:
utf8_to_b64('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64_to_utf8('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"

And one last thing: I first encountered this problem when calling the GitHub API. To get this to work on (Mobile) Safari properly, I actually had to strip all white space from the base64 source before I could even decode the source. Whether or not this is still relevant in 2017, I don't know:

function b64_to_utf8( str ) {
    str = str.replace(/\s/g, '');    
    return decodeURIComponent(escape(window.atob( str )));
}
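
The same whitespace stripping can be combined with a decoder that avoids escape. The sketch below is one way to do that, assuming the payload arrives with line breaks inside the base64 as described above; the sample content string and the decodeWrappedBase64 name are made up for illustration, and TextDecoder is swapped in for decodeURIComponent(escape(...)).

```javascript
// Hypothetical payload: the base64 for "✓ à la mode", broken across lines.
const content = '4pyTIMOg\nIGxhIG1v\nZGU=\n';

function decodeWrappedBase64(b64) {
    const cleaned = b64.replace(/\s/g, '');         // drop line breaks and spaces
    const binary = atob(cleaned);                   // one character per byte
    const bytes = Uint8Array.from(binary, c => c.charCodeAt(0));
    return new TextDecoder('utf-8').decode(bytes);  // interpret the bytes as UTF-8
}

console.log(decodeWrappedBase64(content)); // "✓ à la mode"
```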

Answer 2

Answered by Tedd Hansen

Things change. The escape/unescape methods have been deprecated.

You can URI-encode the string before you Base64-encode it. Note that this doesn't produce Base64-encoded UTF-8, but rather Base64-encoded URL-encoded data. Both sides must agree on the same encoding.

See a working example here: http://codepen.io/anon/pen/PZgbPW

// encode string
var base64 = window.btoa(encodeURIComponent(' 你好 ??????'));
// decode string
var str = decodeURIComponent(window.atob(base64));
// str is now === ' 你好 ??????'
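
The difference between the two encodings is easy to see side by side. This is a small sketch with illustrative variable names: the first value is what the approach above produces, the second is base64 of the UTF-8 bytes themselves, which is what a source such as the GitHub API would send.

```javascript
const text = '\u2713'; // "✓"

// URI-encode, then base64: the payload is the ASCII text "%E2%9C%93".
const uriThenBase64 = btoa(encodeURIComponent(text));

// Base64 of the UTF-8 bytes themselves (E2 9C 93).
const utf8Bytes = new TextEncoder().encode(text);
const utf8Base64 = btoa(String.fromCharCode(...utf8Bytes));

console.log(uriThenBase64);       // "JUUyJTlDJTkz"
console.log(utf8Base64);          // "4pyT"
console.log(atob(uriThenBase64)); // "%E2%9C%93"
```

A decoder written for one form will not decode the other, which is why both sides have to agree.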

For OP's problem, a third-party library such as js-base64 should solve it.

Answer 3

Answered by Riccardo Galli

If treating strings as bytes is more your thing, you can use the following functions:

function u_atob(ascii) {
    return Uint8Array.from(atob(ascii), c => c.charCodeAt(0));
}

function u_btoa(buffer) {
    var binary = [];
    var bytes = new Uint8Array(buffer);
    for (var i = 0, il = bytes.byteLength; i < il; i++) {
        binary.push(String.fromCharCode(bytes[i]));
    }
    return btoa(binary.join(''));
}


// example, it works also with astral plane characters such as '𝒞'
var encodedString = new TextEncoder().encode('𝒞');
var base64String = u_btoa(encodedString);
console.log('𝒞' === new TextDecoder().decode(u_atob(base64String)))
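
If string-in, string-out helpers are more convenient, the same byte-based idea can be wrapped up as below. This is a self-contained sketch; encodeText and decodeText are hypothetical names, not functions from this answer.

```javascript
function encodeText(text) {
    const bytes = new TextEncoder().encode(text);   // string -> UTF-8 bytes
    let binary = '';
    for (const byte of bytes) binary += String.fromCharCode(byte);
    return btoa(binary);                            // bytes -> base64
}

function decodeText(base64) {
    const bytes = Uint8Array.from(atob(base64), c => c.charCodeAt(0));
    return new TextDecoder().decode(bytes);         // UTF-8 bytes -> string
}

const sample = '\u{1D49E}'; // an astral-plane character (mathematical script capital C)
console.log(encodeText(sample));                     // "8J2Sng=="
console.log(decodeText('8J2Sng==') === sample);      // true
```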

Answer 4

Answered by Manuel G

Here is the 2018 updated solution, as described in the Mozilla Development Resources:

TO ENCODE FROM UNICODE TO B64

function b64EncodeUnicode(str) {
    // first we use encodeURIComponent to get percent-encoded UTF-8,
    // then we convert the percent encodings into raw bytes which
    // can be fed into btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function toSolidBytes(match, p1) {
            return String.fromCharCode('0x' + p1);
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="

TO DECODE FROM B64 TO UNICODE

function b64DecodeUnicode(str) {
    // Going backwards: from bytestream, to percent-encoding, to original string.
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"

Answer 5

Answered by Hyman Giffin

I would assume that one might want a solution that produces a widely usable base64 URI. Please visit data:text/plain;charset=utf-8;base64,4pi44pi54pi64pi74pi84pi+4pi/ to see a demonstration (copy the data URI, open a new tab, paste the data URI into the address bar, then press Enter to go to the page). Despite the fact that this URI is base64-encoded, the browser is still able to recognize the high code points and decode them properly. The minified encoder+decoder is 1058 bytes (+Gzip→589 bytes).
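
For reference, the data URI above is the prefix plus plain base64 of the UTF-8 bytes. The sketch below rebuilds it with TextEncoder rather than the encoder from this answer; toDataUri is a hypothetical helper name.

```javascript
function toDataUri(text) {
    const bytes = new TextEncoder().encode(text);   // string -> UTF-8 bytes
    let binary = '';
    for (const byte of bytes) binary += String.fromCharCode(byte);
    return 'data:text/plain;charset=utf-8;base64,' + btoa(binary);
}

// The seven symbols U+2638 to U+263C, U+263E and U+263F.
const symbols = '\u2638\u2639\u263A\u263B\u263C\u263E\u263F';
console.log(toDataUri(symbols));
// "data:text/plain;charset=utf-8;base64,4pi44pi54pi64pi74pi84pi+4pi/"
```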

!function(e){"use strict";function h(b){var a=b.charCodeAt(0);if(55296<=a&&56319>=a)if(b=b.charCodeAt(1),b===b&&56320<=b&&57343>=b){if(a=1024*(a-55296)+b-56320+65536,65535<a)return d(240|a>>>18,128|a>>>12&63,128|a>>>6&63,128|a&63)}else return d(239,191,189);return 127>=a?inputString:2047>=a?d(192|a>>>6,128|a&63):d(224|a>>>12,128|a>>>6&63,128|a&63)}function k(b){var a=b.charCodeAt(0)<<24,f=l(~a),c=0,e=b.length,g="";if(5>f&&e>=f){a=a<<f>>>24+f;for(c=1;c<f;++c)a=a<<6|b.charCodeAt(c)&63;65535>=a?g+=d(a):1114111>=a?(a-=65536,g+=d((a>>10)+55296,(a&1023)+56320)):c=0}for(;c<e;++c)g+="\ufffd";return g}var m=Math.log,n=Math.LN2,l=Math.clz32||function(b){return 31-m(b>>>0)/n|0},d=String.fromCharCode,p=atob,q=btoa;e.btoaUTF8=function(b,a){return q((a?"\u00ef\u00bb\u00bf":"")+b.replace(/[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g,h))};e.atobUTF8=function(b,a){a||"\u00ef\u00bb\u00bf"!==b.substring(0,3)||(b=b.substring(3));return p(b).replace(/[\xc0-\xff][\x80-\xbf]*/g,k)}}(""+void 0==typeof global?""+void 0==typeof self?this:self:global)

Below is the source code used to generate it.

var fromCharCode = String.fromCharCode;
var btoaUTF8 = (function(btoa, replacer){"use strict";
    return function(inputString, BOMit){
        return btoa((BOMit ? "\xEF\xBB\xBF" : "") + inputString.replace(
            /[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g, replacer
        ));
    }
})(btoa, function(nonAsciiChars){"use strict";
    // make the UTF string into a binary UTF-8 encoded string
    var point = nonAsciiChars.charCodeAt(0);
    if (point >= 0xD800 && point <= 0xDBFF) {
        var nextcode = nonAsciiChars.charCodeAt(1);
        if (nextcode !== nextcode) // NaN because string is 1 code point long
            return fromCharCode(0xef/*11101111*/, 0xbf/*10111111*/, 0xbd/*10111101*/);
        // https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
        if (nextcode >= 0xDC00 && nextcode <= 0xDFFF) {
            point = (point - 0xD800) * 0x400 + nextcode - 0xDC00 + 0x10000;
            if (point > 0xffff)
                return fromCharCode(
                    (0x1e/*0b11110*/<<3) | (point>>>18),
                    (0x2/*0b10*/<<6) | ((point>>>12)&0x3f/*0b00111111*/),
                    (0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
                    (0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
                );
        } else return fromCharCode(0xef, 0xbf, 0xbd);
    }
    if (point <= 0x007f) return nonAsciiChars;
    else if (point <= 0x07ff) {
        return fromCharCode((0x6<<5)|(point>>>6), (0x2<<6)|(point&0x3f));
    } else return fromCharCode(
        (0xe/*0b1110*/<<4) | (point>>>12),
        (0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
        (0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
    );
});
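
As a worked example of the arithmetic in the replacer, the sketch below applies the surrogate-pair formula and the four-byte UTF-8 layout to a single astral character. The character U+1F600 and the variable names are chosen for illustration only.

```javascript
const ch = '\u{1F600}';          // one astral code point, two UTF-16 code units
const high = ch.charCodeAt(0);   // 0xD83D
const low = ch.charCodeAt(1);    // 0xDE00

// Surrogate pair -> code point, the same formula the replacer uses.
const point = (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;

// Code point -> four UTF-8 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
const bytes = [
    0xF0 | (point >>> 18),
    0x80 | ((point >>> 12) & 0x3F),
    0x80 | ((point >>> 6) & 0x3F),
    0x80 | (point & 0x3F)
];

console.log(point.toString(16));              // "1f600"
console.log(bytes.map(b => b.toString(16)));  // [ 'f0', '9f', '98', '80' ]
```

The result matches what TextEncoder produces for the same character.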

Then, to decode the base64 data, either fetch the data over HTTP as a data URI or use the function below.

var clz32 = Math.clz32 || (function(log, LN2){"use strict";
    return function(x) {return 31 - log(x >>> 0) / LN2 | 0};
})(Math.log, Math.LN2);
var fromCharCode = String.fromCharCode;
var atobUTF8 = (function(atob, replacer){"use strict";
    return function(inputString, keepBOM){
        inputString = atob(inputString);
        if (!keepBOM && inputString.substring(0,3) === "\xEF\xBB\xBF")
            inputString = inputString.substring(3); // eradicate UTF-8 BOM
        // 0xc0 => 0b11000000; 0xff => 0b11111111; 0xc0-0xff => 0b11xxxxxx
        // 0x80 => 0b10000000; 0xbf => 0b10111111; 0x80-0xbf => 0b10xxxxxx
        return inputString.replace(/[\xc0-\xff][\x80-\xbf]*/g, replacer);
    }
})(atob, function(encoded){"use strict";
    var codePoint = encoded.charCodeAt(0) << 24;
    var leadingOnes = clz32(~codePoint);
    var endPos = 0, stringLen = encoded.length;
    var result = "";
    if (leadingOnes < 5 && stringLen >= leadingOnes) {
        codePoint = (codePoint<<leadingOnes)>>>(24+leadingOnes);
        for (endPos = 1; endPos < leadingOnes; ++endPos)
            codePoint = (codePoint<<6) | (encoded.charCodeAt(endPos)&0x3f/*0b00111111*/);
        if (codePoint <= 0xFFFF) { // BMP code point
          result += fromCharCode(codePoint);
        } else if (codePoint <= 0x10FFFF) {
          // https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
          codePoint -= 0x10000;
          result += fromCharCode(
            (codePoint >> 10) + 0xD800,  // highSurrogate
            (codePoint & 0x3ff) + 0xDC00 // lowSurrogate
          );
        } else endPos = 0; // to fill it in with INVALIDs
    }
    for (; endPos < stringLen; ++endPos) result += "\ufffd"; // replacement character
    return result;
});
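
The decoder above substitutes U+FFFD for malformed input. For comparison, the built-in TextDecoder behaves similarly by default and can be made strict instead. This is a separate sketch, not part of the answer's code, and the truncated byte sequence is made up for the demonstration.

```javascript
// 'a' followed by E2 9C: the start of a three-byte sequence with its last byte missing.
const truncated = Uint8Array.of(0x61, 0xE2, 0x9C);

// Default mode: malformed input is replaced with U+FFFD.
const lenient = new TextDecoder('utf-8').decode(truncated);
console.log(lenient.startsWith('a'), lenient.includes('\uFFFD')); // true true

// fatal: true turns malformed input into a TypeError instead.
let rejected = false;
try {
    new TextDecoder('utf-8', { fatal: true }).decode(truncated);
} catch (e) {
    rejected = e instanceof TypeError;
}
console.log(rejected); // true
```

Strict mode is useful when silently substituting characters would hide a corrupted payload.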

The advantage of being more standard is that this encoder and this decoder are more widely applicable, because their output can be used in a valid URL that displays correctly. Observe.

(function(window){
    "use strict";
    var sourceEle = document.getElementById("source");
    var urlBarEle = document.getElementById("urlBar");
    var mainFrameEle = document.getElementById("mainframe");
    var gotoButton = document.getElementById("gotoButton");
    var parseInt = window.parseInt;
    var fromCodePoint = String.fromCodePoint;
    var parse = JSON.parse;
    
    function unescape(str){
        return str.replace(/\\u\{[^}]*\}|\\u[\da-f]{0,4}|\\x[\da-f]{0,2}|\\[bfnrtv"'\\]|\\0[0-7]{1,3}|\\\d{1,3}/g, function(match){
          try{
            if (match.startsWith("\\u{"))
              return fromCodePoint(parseInt(match.slice(3,-1),16));
            if (match.startsWith("\\u") || match.startsWith("\\x"))
              return fromCodePoint(parseInt(match.substring(2),16));
            if (match.startsWith("\\0") && match.length > 2)
              return fromCodePoint(parseInt(match.substring(2),8));
            if (/^\\\d/.test(match)) return fromCodePoint(+match.slice(1));
          }catch(e){return "\ufffd".repeat(match.length)}
          return parse('"' + match + '"');
        });
    }
    
    function whenChange(){
      try{ urlBarEle.value = "data:text/plain;charset=UTF-8;base64," + btoaUTF8(unescape(sourceEle.value), true);
      } finally{ gotoURL(); }
    }
    sourceEle.addEventListener("change",whenChange,{passive:1});
    sourceEle.addEventListener("input",whenChange,{passive:1});
    
    // IFrame Setup:
    function gotoURL(){mainFrameEle.src = urlBarEle.value}
    gotoButton.addEventListener("click", gotoURL, {passive: 1});
    function urlChanged(){urlBarEle.value = mainFrameEle.src}
    mainFrameEle.addEventListener("load", urlChanged, {passive: 1});
    urlBarEle.addEventListener("keypress", function(evt){
      if (evt.key === "Enter") evt.preventDefault(), urlChanged();
    }, {passive: 1});
    
        
    var fromCharCode = String.fromCharCode;
    var btoaUTF8 = (function(btoa, replacer){
      "use strict";
        return function(inputString, BOMit){
         return btoa((BOMit?"\xEF\xBB\xBF":"") + inputString.replace(
          /[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g, replacer
      ));
     }
    })(btoa, function(nonAsciiChars){
  "use strict";
     // make the UTF string into a binary UTF-8 encoded string
     var point = nonAsciiChars.charCodeAt(0);
     if (point >= 0xD800 && point <= 0xDBFF) {
      var nextcode = nonAsciiChars.charCodeAt(1);
      if (nextcode !== nextcode) { // NaN because string is 1code point long
       return fromCharCode(0xef/*11101111*/, 0xbf/*10111111*/, 0xbd/*10111101*/);
      }
      // https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
      if (nextcode >= 0xDC00 && nextcode <= 0xDFFF) {
       point = (point - 0xD800) * 0x400 + nextcode - 0xDC00 + 0x10000;
       if (point > 0xffff) {
        return fromCharCode(
         (0x1e/*0b11110*/<<3) | (point>>>18),
         (0x2/*0b10*/<<6) | ((point>>>12)&0x3f/*0b00111111*/),
         (0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
         (0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
        );
       }
      } else {
       return fromCharCode(0xef, 0xbf, 0xbd);
      }
     }
     if (point <= 0x007f) { return nonAsciiChars; }
     else if (point <= 0x07ff) {
      return fromCharCode((0x6<<5)|(point>>>6), (0x2<<6)|(point&0x3f/*00111111*/));
     } else {
      return fromCharCode(
       (0xe/*0b1110*/<<4) | (point>>>12),
       (0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
       (0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
      );
     }
    });
    setTimeout(whenChange, 0);
})(window);

img:active{opacity:0.8}

<center>
<textarea id="source" style="width:66.7vw">Hello \u1234 W656ld!
Enter text into the top box. Then the URL will update automatically.
</textarea><br />
<div style="width:66.7vw;display:inline-block;height:calc(25vw + 1em + 6px);border:2px solid;text-align:left;line-height:1em">
<input id="urlBar" style="width:calc(100% - 1em - 13px)" /><img id="gotoButton" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABsAAAAeCAMAAADqx5XUAAAAclBMVEX///9NczZ8e32ko6fDxsU/fBoSQgdFtwA5pAHVxt+7vLzq5ex23y4SXABLiiTm0+/c2N6DhoQ6WSxSyweVlZVvdG/Uz9aF5kYlbwElkwAggACxs7Jl3hX07/cQbQCar5SU9lRntEWGum+C9zIDHwCGnH5IvZAOAAABmUlEQVQoz7WS25acIBBFkRLkIgKKtOCttbv//xdDmTGZzHv2S63ltuBQQP4rdRiRUP8UK4wh6nVddQwj/NtDQTvac8577zTQb72zj65/876qqt7wykU6/1U6vFEgjE1mt/5LRqrpu7oVsn0sjZejMfxR3W/yLikqAFcUx93YxLmZGOtElmEu6Ufd9xV3ZDTGcEvGLbMk0mHHlUSvS5svCwS+hVL8loQQyfpI1Ay8RF/xlNxcsTchGjGDIuBG3Ik7TMyNxn8m0TSnBAK6Z8UZfp3IbAonmJvmsEACum6aNv7B0CnvpezDcNhw9XWsuAr7qnRg6dABmeM4dTgn/DZdXWs3LMspZ1KDMt1kcPJ6S1icWNp2qaEmjq6myx7jbQK3VKItLJaW5FR+cuYlRhYNKzGa9vF4vM5roLW3OSVjkmiGJrPhUq301/16pVKZRGFYWjTP50spTxBN5Z4EKnSonruk+n4tUokv1aJSEl/MLZU90S3L6/U6o0J142iQVp3HcZxKSo8LfkNRCtJaKYFSRX7iaoAAUDty8wvWYR6HJEepdwAAAABJRU5ErkJggg==" style="width:calc(1em + 4px);line-height:1em;vertical-align:-40%;cursor:pointer" />
<iframe id="mainframe" style="width:66.7vw;height:25vw" frameBorder="0"></iframe>
</div>
</center>

In addition to being very standardized, the above code snippets are also very fast. Instead of an indirect chain of succession where the data has to be converted several times between various forms (such as in Riccardo Galli's response), the above code snippet is as direct as possible. It uses only one simple, fast String.prototype.replace call to process the data when encoding, and only one when decoding. Another plus is that (especially for big strings) String.prototype.replace allows the browser to automatically handle the underlying memory management of resizing the string, leading to a significant performance boost, especially in evergreen browsers like Chrome and Firefox that heavily optimize String.prototype.replace. Finally, the icing on the cake is that for you Latin script exclūsīvō users, strings which don't contain any code points above 0x7f are extra fast to process because the string remains unmodified by the replacement algorithm.

I have created a GitHub repository for this solution at https://github.com/anonyco/BestBase64EncoderDecoder/

Answer 6

Answered by Beejor

Here's some future-proof code for browsers that may lack escape/unescape(). Note that IE 9 and older don't support atob/btoa(), so you'd need to use custom base64 functions for them.

// Polyfill for escape/unescape
if( !window.unescape ){
    window.unescape = function( s ){
        return s.replace( /%([0-9A-F]{2})/g, function( m, p ) {
            return String.fromCharCode( '0x' + p );
        } );
    };
}
if( !window.escape ){
    window.escape = function( s ){
        var chr, hex, i = 0, l = s.length, out = '';
        for( ; i < l; i ++ ){
            chr = s.charAt( i );
            if( chr.search( /[A-Za-z0-9\@\*\_\+\-\.\/]/ ) > -1 ){
                out += chr; continue; }
            hex = s.charCodeAt( i ).toString( 16 );
            out += '%' + ( hex.length % 2 != 0 ? '0' : '' ) + hex;
        }
        return out;
    };
}

// Base64 encoding of UTF-8 strings
var utf8ToB64 = function( s ){
    return btoa( unescape( encodeURIComponent( s ) ) );
};
var b64ToUtf8 = function( s ){
    return decodeURIComponent( escape( atob( s ) ) );
};

A more comprehensive example for UTF-8 encoding and decoding can be found here: http://jsfiddle.net/47zwb41o/

Answer 7

Answered by Darkves

Small correction: unescape and escape are deprecated, so:

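function utf8_to_b64( str ) {
    // encodeURIComponent gives percent-encoded UTF-8; turn each %XX into one raw byte for btoa
    return window.btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
        return String.fromCharCode(parseInt(p1, 16));
    }));
}

function b64_to_utf8( str ) {
    str = str.replace(/\s/g, '');
    // percent-encode every byte from atob so decodeURIComponent can read it as UTF-8
    return decodeURIComponent(window.atob(str).replace(/[\s\S]/g, function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }));
}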

Answer 8

Answered by Diwakar

In addition to the solutions above, if you are still facing issues, try the following. It covers the case where escape is not supported in TypeScript.

const blob = new Blob(["\ufeff", csv_content]); // the leading BOM makes the symbols display correctly in Excel

For csv_content, you can try something like the following:

function b64DecodeUnicode(str: string): string {
    return decodeURIComponent(atob(str).split('').map((c: string) => {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}
