Python: convert a string to a NumPy array

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/28207743/

Time: 2020-08-19 02:55:01  Source: igfitidea

Convert string to numpy array

Tags: python, arrays, string, numpy

Asked by Am1rr3zA

I have a string like mystr = "100110" (the real one is much longer). I want to convert it to a NumPy array like mynumpy = [1, 0, 0, 1, 1, 0], with mynumpy.shape = (6,). I know NumPy has np.fromstring(mystr, dtype=int, sep=''), but the problem is that I can't split my string into its individual digits, so NumPy reads it as a single number. Any idea how to convert my string to a NumPy array?

Accepted answer by dragon2fly

list may help you do that.

import numpy as np

mystr = "100110"
print(np.array(list(mystr)))
# ['1' '0' '0' '1' '1' '0']

If you want numbers instead of strings:

print(np.array(list(mystr), dtype=int))
# [1 0 0 1 1 0]
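
As a quick check of the result, here is a minimal sketch that inspects the array produced by the list-based conversion. The variable name digits is introduced only for illustration. A one-dimensional array of six elements reports its shape as (6,).

```python
import numpy as np

mystr = "100110"

# Split the string into single characters, then let NumPy parse each one as an integer.
digits = np.array(list(mystr), dtype=int)

print(digits.shape)  # (6,)
print(digits.ndim)   # 1
print(digits.sum())  # 3
```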

Answer by grc

You could read them as ASCII characters and then subtract 48 (the ASCII value of the character '0'). This should be the fastest way for large strings.

>>> np.fromstring("100110", np.int8) - 48
array([1, 0, 0, 1, 1, 0], dtype=int8)

Alternatively, you could convert the string to a list of integers first:

>>> np.array(list(map(int, "100110")))
array([1, 0, 0, 1, 1, 0])

Edit: In a quick timing test, the first method was over 100x faster than converting the string to a list first.
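
The same subtract-48 idea can also be written with np.frombuffer, which reads raw bytes directly. The sketch below assumes the input contains only ASCII digits; the helper name digits_from_ascii is hypothetical and not part of NumPy or of the answer above.

```python
import numpy as np


def digits_from_ascii(s):
    """Hypothetical helper: map the characters '0'-'9' to the integers 0-9."""
    # Encoding to ASCII gives one byte per character; non-ASCII input raises UnicodeEncodeError.
    raw = np.frombuffer(s.encode("ascii"), dtype=np.uint8)
    # ord("0") is 48, so '0' becomes 0 and '1' becomes 1.
    return raw - ord("0")


print(digits_from_ascii("100110"))  # [1 0 0 1 1 0]
```

The subtraction allocates a new array, so the result is writable even though the view returned by np.frombuffer over a bytes object is read-only.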

Answer by Hrushikesh Dhumal

Adding to the above answers: NumPy now emits a deprecation warning when you use fromstring in binary mode:
DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead.
A better option is fromiter. In my test it ran about twice as fast. This is what I got in a Jupyter notebook:

import numpy as np
mystr = "100110"

np.fromiter(mystr, dtype=int)
>> array([1, 0, 0, 1, 1, 0])

# Time comparison
%timeit np.array(list(mystr), dtype=int)
>> 3.5 μs ± 627 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.fromstring(mystr, np.int8) - 48
>> 3.52 μs ± 508 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.fromiter(mystr, dtype=int)
>> 1.75 μs ± 133 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
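
The timings above use a six-character string and the IPython-only %timeit magic, while the question says the real input is much bigger. Here is a rough sketch for repeating the comparison on a longer string with the standard timeit module. Several things here are assumptions: the function names and the 100,000-character input are arbitrary, np.frombuffer stands in for the deprecated binary mode of np.fromstring, and fromiter is fed map(int, s) (a small variation on the call above) so that the character-to-integer conversion is explicit. No results are shown because they depend on the machine and the NumPy version.

```python
import timeit

import numpy as np

# Hypothetical test input: 100,000 binary digits (the size is an arbitrary choice).
big = "10" * 50_000


def via_list(s):
    # Split into characters, then let NumPy parse each one.
    return np.array(list(s), dtype=int)


def via_frombuffer(s):
    # Read the ASCII bytes and subtract the code of '0' (48).
    return np.frombuffer(s.encode("ascii"), dtype=np.uint8) - ord("0")


def via_fromiter(s):
    # Passing count lets NumPy allocate the output array once.
    return np.fromiter(map(int, s), dtype=int, count=len(s))


def time_all(s, number=5):
    """Return the average seconds per call for each conversion function."""
    results = {}
    for func in (via_list, via_frombuffer, via_fromiter):
        # Sanity check: every method must agree with the list-based one.
        assert np.array_equal(func(s), via_list(s))
        results[func.__name__] = timeit.timeit(lambda: func(s), number=number) / number
    return results


if __name__ == "__main__":
    for name, seconds in time_all(big).items():
        print(f"{name}: {seconds * 1e3:.3f} ms per call")
```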