How to save and load numpy.array() data properly?
Original question: http://stackoverflow.com/questions/28439701/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by bluevoxel
I wonder how to save and load numpy.array data properly. Currently I'm using the numpy.savetxt() method. For example, if I have an array markers, which looks like this:
I try to save it using:
numpy.savetxt('markers.txt', markers)
In another script I try to open the previously saved file:
markers = np.fromfile("markers.txt")
And that's what I get...
The saved data initially looks like this:
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
But when I save the freshly loaded data using the same method, i.e. numpy.savetxt(), it looks like this:
1.398043286095131769e-76
1.398043286095288860e-76
1.396426376485745879e-76
1.398043286055061908e-76
1.398043286095288860e-76
1.182950697433698368e-76
1.398043275797188953e-76
1.398043286095288860e-76
1.210894289234927752e-99
1.398040649781712473e-76
What am I doing wrong? PS: I perform no other "backstage" operations. I just save and load, and that's what I get. Thank you in advance.
Accepted answer by xnx
The most reliable way I have found to do this is to use np.savetxt with np.loadtxt, and not np.fromfile, which is better suited to binary files written with tofile. np.fromfile and ndarray.tofile read and write binary files, whereas np.savetxt writes a text file.
So, for example:
In [1]: a = np.array([1, 2, 3, 4])
In [2]: np.savetxt('test1.txt', a, fmt='%d')
In [3]: b = np.loadtxt('test1.txt', dtype=int)
In [4]: a == b
Out[4]: array([ True, True, True, True], dtype=bool)
Or:
In [5]: a.tofile('test2.dat')
In [6]: c = np.fromfile('test2.dat', dtype=int)
In [7]: c == a
Out[7]: array([ True, True, True, True], dtype=bool)
I use the former method even if it is slower and creates bigger files (sometimes): the binary format can be platform dependent (for example, the file format depends on the endianness of your system).
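The platform dependence mentioned above comes from the fact that tofile writes only the raw bytes of the array, with no record of dtype, shape, or byte order. The following is a minimal sketch of that behaviour; the file name and the example array are hypothetical, chosen only for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical example array with an explicit dtype and a 2-D shape.
a = np.arange(6, dtype=np.int32).reshape(2, 3)

path = os.path.join(tempfile.mkdtemp(), "raw.dat")  # hypothetical file name
a.tofile(path)  # raw bytes only: no dtype, shape, or byte-order header

# The reader has to supply the dtype and restore the shape itself.
flat = np.fromfile(path, dtype=np.int32)
restored = flat.reshape(2, 3)

# A wrong dtype raises no error; the same bytes are silently reinterpreted.
wrong = np.fromfile(path, dtype=np.int16)

print(flat.shape, wrong.shape)
```

Because nothing in the file describes its contents, the reading side must already know the dtype, shape, and byte order used by the writing side.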
There is a platform-independent format for NumPy arrays, which can be saved and read with np.save and np.load:
In [8]: np.save('test3.npy', a) # .npy extension is added if not given
In [9]: d = np.load('test3.npy')
In [10]: a == d
Out[10]: array([ True, True, True, True], dtype=bool)
Answer by ali_m
np.fromfile() has a sep= keyword argument:
Separator between items if file is a text file. Empty (“”) separator means the file should be treated as binary. Spaces (” ”) in the separator match zero or more whitespace characters. A separator consisting only of spaces must match at least one whitespace.
The default value of sep="" means that np.fromfile() tries to read it as a binary file rather than a space-separated text file, so you get nonsense values back. If you use np.fromfile('markers.txt', sep=" ") you will get the result you are looking for.
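The difference between the two sep settings can be sketched as follows. The all-zero array is an assumption standing in for the question's markers data, and the file is written to a temporary directory rather than the working directory.

```python
import os
import tempfile

import numpy as np

# Assumed stand-in for the question's array: ten zeros.
markers = np.zeros(10)
path = os.path.join(tempfile.mkdtemp(), "markers.txt")
np.savetxt(path, markers)

# Default sep="": the characters of the text file are read as raw float64 bytes.
as_binary = np.fromfile(path)

# sep=" ": the file is parsed as whitespace-separated text.
as_text = np.fromfile(path, sep=" ")

# np.loadtxt is the usual counterpart to np.savetxt.
loaded = np.loadtxt(path)

print(np.array_equal(as_text, markers), np.array_equal(loaded, markers))
```

Here as_binary holds meaningless values of the kind shown in the question, while as_text and loaded both reproduce the saved array.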
However, as others have pointed out, np.loadtxt() is the preferred way to convert text files to numpy arrays, and unless the file needs to be human-readable it is usually better to use binary formats instead (e.g. np.load() / np.save()).
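One practical reason to prefer the .npy format is that it stores the dtype and shape alongside the data. A minimal sketch, assuming a hypothetical 3x4 float32 array and temporary file names:

```python
import os
import tempfile

import numpy as np

# Hypothetical 2-D float32 array used only for illustration.
markers = np.linspace(0.0, 1.0, 12).reshape(3, 4).astype(np.float32)
tmpdir = tempfile.mkdtemp()

# Binary .npy round trip: dtype and shape are stored in the file header.
npy_path = os.path.join(tmpdir, "markers.npy")
np.save(npy_path, markers)
restored = np.load(npy_path)

# Text round trip: np.loadtxt returns float64 unless a dtype is passed.
txt_path = os.path.join(tmpdir, "markers.txt")
np.savetxt(txt_path, markers)
from_text = np.loadtxt(txt_path)

print(restored.dtype, restored.shape, from_text.dtype)
```

With np.load the array comes back exactly as it was saved, whereas the text route needs the dtype to be specified again on loading.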
Answer by Sherzod
np.save('data.npy', num_arr) # save
new_num_arr = np.load('data.npy') # load