How do you convert a string to a character array in JavaScript?

Question

Asked by DarkLightA

How do you convert a string to a character array in JavaScript?

I'm thinking of turning a string like "Hello world!" into the array
['H','e','l','l','o',' ','w','o','r','l','d','!']

Answer 1

Answered by meder omuraliev

Note: This is not Unicode compliant. "I💖U".split('') results in the 4-element array ["I", "�", "�", "U"], which can lead to dangerous bugs. See the answers below for safe alternatives.

Just split it by an empty string.

var output = "Hello world!".split('');
console.log(output);

See the String.prototype.split() MDN docs.

Answer 2

Answered by hakatashi

As hippietrail suggests, meder's answer can break surrogate pairs and misinterpret "characters". For example:

// DO NOT USE THIS!
> '𝟘𝟙𝟚𝟛'.split('')
[ '�', '�', '�', '�', '�', '�', '�', '�' ]

I suggest using one of the following ES2015 features to correctly handle these character sequences.

Spread syntax (already answered by insertusernamehere)

> [...'𝟘𝟙𝟚𝟛']
[ '𝟘', '𝟙', '𝟚', '𝟛' ]

Array.from

> Array.from('𝟘𝟙𝟚𝟛')
[ '𝟘', '𝟙', '𝟚', '𝟛' ]

RegExp `u` flag

> '𝟘𝟙𝟚𝟛'.split(/(?=[\s\S])/u)
[ '𝟘', '𝟙', '𝟚', '𝟛' ]

Use /(?=[\s\S])/u instead of /(?=.)/u because . does not match newlines.
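
To see the difference in practice, here is a minimal sketch. The sample string is made up for illustration: two astral characters with a newline between them.

```javascript
// Hypothetical sample: two astral characters with a newline between them.
const sample = '\u{1D7D8}\n\u{1D7D9}';

// `.` does not match '\n', so no split happens in front of the newline.
const withDot = sample.split(/(?=.)/u);

// `[\s\S]` matches every code point, so the newline gets its own element.
const withClass = sample.split(/(?=[\s\S])/u);

console.log(withDot);   // [ '𝟘\n', '𝟙' ]
console.log(withClass); // [ '𝟘', '\n', '𝟙' ]
```

With the dot, the newline stays glued to the character before it, which is easy to miss in a quick test.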

If you are still in the ES5.1 era (or if your browser doesn't handle this regex correctly, like Edge), you can use this alternative (transpiled by Babel):

> '𝟘𝟙𝟚𝟛'.split(/(?=(?:[\0-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]))/);
[ '𝟘', '𝟙', '𝟚', '𝟛' ]

Note that Babel also tries to handle unmatched surrogates correctly. However, this does not seem to work for unmatched low surrogates.
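
For comparison, here is a small sketch of how the built-in string iterator (used by Array.from and spread) treats unmatched surrogates: each lone surrogate comes out as its own element. The input string and the isLoneSurrogate helper are made up for illustration.

```javascript
// Hypothetical input: a lone high surrogate, a lone low surrogate,
// and finally one well-formed surrogate pair (U+1D7D8).
const mixed = 'a\uD835b\uDFD8c\uD835\uDFD8';

// The string iterator yields one element per code point;
// unmatched surrogates are yielded on their own.
const parts = Array.from(mixed);

// Illustrative helper: a single code unit in the surrogate range is "lone".
const isLoneSurrogate = (s) => s.length === 1 && s >= '\uD800' && s <= '\uDFFF';

const flags = parts.map(isLoneSurrogate);
console.log(parts.length); // 6
console.log(flags);        // [ false, true, false, true, false, false ]
```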

Test all in your browser:

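<style>
th, td {
    border: 1px solid black;
    padding: 4px;
}
</style>

<div><input type="checkbox" id="nonBMP" checked /><label for="nonBMP">Codepoints above U+FFFF</label></div>
<div><input type="checkbox" id="nl"     checked /><label for="nl"    >Newline</label></div>
<div><input type="checkbox" id="high"           /><label for="high"  >Unmatched high surrogate</label></div>
<div><input type="checkbox" id="low"            /><label for="low"   >Unmatched low surrogate</label></div>
<button type="button" id="runTest">Run Test!</button>

<table>
  <tr><td>str=</td>                     <td><div id="testString"></div></td></tr>
  <tr><th colspan="2">Wrong:</th></tr>
  <tr><td>str.split("")</td>            <td><div id="splitEmpty"></div></td></tr>
  <tr><td>str.split(/(?=.)/u)</td>      <td><div id="splitRegexDot"></div></td></tr>
  <tr><th colspan="2">Better:</th></tr>
  <tr><td>[...str]</td>                 <td><div id="spread"></div></td></tr>
  <tr><td>Array.from(str)</td>          <td><div id="arrayFrom"></div></td></tr>
  <tr><td>str.split(/(?=[\s\S])/u)</td> <td><div id="splitRegex"></div></td></tr>
  <tr><td>str.split(/(?=(?:[\0-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]))/)</td><td><div id="splitBabel"></div></td></tr>
</table>

<script>
function run_test() {
  var str = document.getElementById('nonBMP').checked ? '𝟘_NL_𝟙_HIGH_𝟚_LOW_𝟛' : '0_NL_1_HIGH_2_LOW_3';
  str = str.replace('_NL_',   document.getElementById('nl').checked   ? '\n'          : '');
  str = str.replace('_HIGH_', document.getElementById('high').checked ? '𝟘'.charAt(0) : '');
  str = str.replace('_LOW_',  document.getElementById('low').checked  ? '𝟘'.charAt(1) : '');

  // Each example is wrapped in try { eval(...) } catch {} so the script keeps running
  // when some syntax is not supported (for example, in Internet Explorer).
  // Backslashes are doubled so the regex escapes survive the string literal.
        document.getElementById("testString"   ).innerText = JSON.stringify(str);
  try { document.getElementById("splitEmpty"   ).innerText = JSON.stringify(eval('str.split("")'));              } catch (err) { }
  try { document.getElementById("splitRegexDot").innerText = JSON.stringify(eval('str.split(/(?=.)/u)'));        } catch (err) { }
  try { document.getElementById("spread"       ).innerText = JSON.stringify(eval('[...str]'));                   } catch (err) { }
  try { document.getElementById("arrayFrom"    ).innerText = JSON.stringify(eval('Array.from(str)'));            } catch (err) { }
  try { document.getElementById("splitRegex"   ).innerText = JSON.stringify(eval('str.split(/(?=[\\s\\S])/u)')); } catch (err) { }
  try { document.getElementById("splitBabel"   ).innerText = JSON.stringify(eval('str.split(/(?=(?:[\\0-\\uD7FF\\uE000-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF](?![\\uDC00-\\uDFFF])|(?:[^\\uD800-\\uDBFF]|^)[\\uDC00-\\uDFFF]))/)')); } catch (err) { }
}

document.getElementById('runTest').onclick = run_test;
</script>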

Answer 3

Answered by insertusernamehere

The spread syntax

You can use the spread syntax, an array initializer introduced in the ECMAScript 2015 (ES6) standard:

var arr = [...str];

Examples

function a() {
    return arguments;
}

var str = 'Hello World';

var arr1 = [...str],
    arr2 = [...'Hello World'],
    arr3 = new Array(...str),
    arr4 = a(...str);

console.log(arr1, arr2, arr3, arr4);

The first three result in:

["H", "e", "l", "l", "o", " ", "W", "o", "r", "l", "d"]

The last one results in:

{0: "H", 1: "e", 2: "l", 3: "l", 4: "o", 5: " ", 6: "W", 7: "o", 8: "r", 9: "l", 10: "d"}

Browser Support

Check the ECMAScript ES6 compatibility table.

Further reading

spread is also referred to as "splat" (e.g. in PHP or Ruby) or as "scatter" (e.g. in Python).

Demo

Try before buy

Answer 4

Answered by Rajesh

You can also use Array.from.

var m = "Hello world!";
console.log(Array.from(m));

This method was introduced in ES6.
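
Array.from also accepts an optional mapping function as its second argument, which can be handy when you want something derived from each character rather than the character itself. A minimal sketch, with a made-up sample string:

```javascript
// Hypothetical sample: three ASCII letters followed by one astral character (U+1D7D8).
const label = 'foo\u{1D7D8}';

// Plain conversion: one element per code point.
const chars = Array.from(label);

// With a mapping function: the numeric code point of each character.
const codePoints = Array.from(label, (ch) => ch.codePointAt(0));

console.log(chars.length); // 4
console.log(codePoints);   // [ 102, 111, 111, 120792 ]
```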

Reference

Array.from

Answer 5

Answered by David Thomas

This is an old question, but I came across another solution not yet listed.

You can use the Object.assign function to get the desired output:

var output = Object.assign([], "Hello, world!");
console.log(output);
    // [ 'H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd', '!' ]

Not necessarily right or wrong, just another option.

Object.assign is described well at the MDN site.
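
One caveat worth knowing: Object.assign copies the string's indexed properties, which are UTF-16 code units, so it splits surrogate pairs just like split(''). A small sketch with a made-up sample string:

```javascript
// Hypothetical sample: 'a' followed by one astral character (U+1F600).
const viaAssign = Object.assign([], 'a\u{1F600}');
const viaFrom = Array.from('a\u{1F600}');

console.log(viaAssign.length); // 3: 'a' plus two separate surrogate halves
console.log(viaFrom.length);   // 2: 'a' plus the whole character
```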

Answer 6

Answered by dansimau

It already is:

var mystring = 'foobar';
console.log(mystring[0]); // Outputs 'f'
console.log(mystring[3]); // Outputs 'b'

Or, for a version that is friendlier to older browsers, use:

var mystring = 'foobar';
console.log(mystring.charAt(3)); // Outputs 'b'
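
A string can be indexed like an array, but it is not one. The sketch below (variable names are made up) shows a few places where the difference matters:

```javascript
const word = 'foobar';

// A string is indexable but is not an Array and has no array methods.
const isArray = Array.isArray(word);   // false
const hasMap = typeof word.map;        // 'undefined'

// Out-of-range access differs between the two forms.
const byIndex = word[10];              // undefined
const byCharAt = word.charAt(10);      // '' (empty string)

// Array methods can still be borrowed, because strings are array-like.
const shouted = Array.prototype.map.call(word, (ch) => ch.toUpperCase()).join('');

console.log(isArray, hasMap, byIndex, JSON.stringify(byCharAt), shouted);
```

If you need real array methods such as map or filter, converting with one of the approaches from the other answers is the more direct route.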

Answer 7

Answered by Mark Amery

There are (at least) three different things you might conceive of as a "character", and consequently, three different categories of approach you might want to use.

Splitting into UTF-16 code units

JavaScript strings were originally invented as sequences of UTF-16 code units, back at a point in history when there was a one-to-one relationship between UTF-16 code units and Unicode code points. The .length property of a string measures its length in UTF-16 code units, and when you do someString[i] you get the i-th UTF-16 code unit of someString.

Consequently, you can get an array of UTF-16 code units from a string by using a C-style for-loop with an index variable...

const yourString = 'Hello, World!';
const charArray = [];
for (let i = 0; i < yourString.length; i++) {
    charArray.push(yourString[i]);
}
console.log(charArray);

There are also various shorter ways to achieve the same thing, like using .split() with the empty string as a separator:

const charArray = 'Hello, World!'.split('');
console.log(charArray);

However, if your string contains code points that are made up of multiple UTF-16 code units, this will split them into individual code units, which may not be what you want. For instance, the string '𝟘𝟙𝟚𝟛' is made up of four Unicode code points (code points 0x1D7D8 through 0x1D7DB) which, in UTF-16, are each made up of two UTF-16 code units. If we split that string using the methods above, we'll get an array of eight code units:

const yourString = '𝟘𝟙𝟚𝟛';
console.log('First code unit:', yourString[0]);
const charArray = yourString.split('');
console.log('charArray:', charArray);
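
To make the code units visible, this sketch prints them in hexadecimal next to the code points. The string is written with escapes so it does not depend on font support; the variable names are made up.

```javascript
// The four code points 0x1D7D8 through 0x1D7DB, written as escapes.
const digits = '\u{1D7D8}\u{1D7D9}\u{1D7DA}\u{1D7DB}';

// Eight UTF-16 code units: four high/low surrogate pairs.
const units = digits.split('').map((u) => u.charCodeAt(0).toString(16));

// Four code points, obtained by iterating the string instead.
const points = Array.from(digits, (c) => c.codePointAt(0).toString(16));

console.log(digits.length); // 8
console.log(units);  // [ 'd835', 'dfd8', 'd835', 'dfd9', 'd835', 'dfda', 'd835', 'dfdb' ]
console.log(points); // [ '1d7d8', '1d7d9', '1d7da', '1d7db' ]
```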

Splitting into Unicode code points

So, perhaps we want to instead split our string into Unicode code points! That's been possible since ECMAScript 2015 added the concept of an iterable to the language. Strings are now iterables, and when you iterate over them (e.g. with a for...of loop), you get Unicode code points, not UTF-16 code units:

const yourString = '𝟘𝟙𝟚𝟛';
const charArray = [];
for (const char of yourString) {
  charArray.push(char);
}
console.log(charArray);

We can shorten this using Array.from, which implicitly iterates over the iterable it is passed:

const yourString = '𝟘𝟙𝟚𝟛';
const charArray = Array.from(yourString);
console.log(charArray);

However, Unicode code points are not the largest possible thing that could be considered a "character" either. Some examples of things that could reasonably be considered a single "character" but are made up of multiple code points include:

Accented characters, if the accent is applied with a combining code point
Flags
Some emojis

We can see below that if we try to convert a string with such characters into an array via the iteration mechanism above, the characters end up broken up in the resulting array. (In case any of the characters don't render on your system, yourString below consists of a capital A with an acute accent, followed by the flag of the United Kingdom, followed by the emoji of a woman with a dark skin tone.)

// 'A\u0301' is a capital A followed by a combining acute accent.
const yourString = 'A\u0301🇬🇧👩🏿';
const charArray = Array.from(yourString);
console.log(charArray);

If we want to keep each of these as a single item in our final array, then we need an array of graphemes, not code points.

Splitting into graphemes

When this answer was written, JavaScript had no built-in support for this. So we need a library that understands and implements the Unicode rules for which combinations of code points constitute a grapheme. Fortunately, one exists: orling's grapheme-splitter. You'll want to install it with npm or, if you're not using npm, download the index.js file and serve it with a <script> tag. For this demo, I'll load it from jsDelivr.

grapheme-splitter gives us a GraphemeSplitter class with three methods: splitGraphemes, iterateGraphemes, and countGraphemes. Naturally, we want splitGraphemes:

<script src="https://cdn.jsdelivr.net/npm/grapheme-splitter/index.js"></script>

const splitter = new GraphemeSplitter();
// 'A\u0301' is a capital A followed by a combining acute accent.
const yourString = 'A\u0301🇬🇧👩🏿';
const charArray = splitter.splitGraphemes(yourString);
console.log(charArray);

And there we are: an array of three graphemes, which is probably what you wanted.
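
Newer runtimes (recent browsers and Node.js versions) also ship Intl.Segmenter, which can do grapheme segmentation without a library. The following is only a sketch and assumes your runtime provides Intl.Segmenter; the splitGraphemes helper name and the 'en' default locale are made up for illustration.

```javascript
// Illustrative helper; assumes the runtime provides Intl.Segmenter.
function splitGraphemes(text, locale = 'en') {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'grapheme' });
  // segment() returns an iterable of objects; `segment` holds the text of each one.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

// Capital A + combining acute accent, the UK flag, and a woman with dark skin tone.
const sampleText = 'A\u0301\u{1F1EC}\u{1F1E7}\u{1F469}\u{1F3FF}';
const graphemes = splitGraphemes(sampleText);

console.log(graphemes.length); // 3
```

Whether to use the library or the built-in mostly depends on which runtimes you need to support.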

Answer 8

Answered by Mohit Rathore

You can iterate over the length of the string and push the character at each position:

const str = 'Hello World';

const stringToArray = (text) => {
  const chars = [];
  for (let i = 0; i < text.length; i++) {
    chars.push(text[i]);
  }
  return chars;
};

console.log(stringToArray(str));

Answer 9

Answered by ajit kumar

Simple answer:

let str = 'this is string, length is >26';

console.log([...str]);

Answer 10

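Answered by user2301515

One possibility is the following:

console.log([1, 2, 3].map(e => Math.random().toString(36).slice(2)).join('').split('').map(e => Math.random() > 0.5 ? e.toUpperCase() : e).join(''));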
