MemoryError when running Numpy Meshgrid in Python

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/2460627/

Date: 2020-11-04 00:42:37 · Source: igfitidea

MemoryError when running Numpy Meshgrid

Tags: python, arrays, numpy

Asked by greye

I have 8823 data points with x,y coordinates. I'm trying to follow the answer on how to get a scatter dataset represented as a heatmap, but when I go through the

X, Y = np.meshgrid(x, y)

instruction with my data arrays, I get a MemoryError. I am new to numpy and matplotlib and am essentially trying to run this by adapting the examples I can find.

Here's how I built my arrays from a file that has them stored:

from numpy import array

XY_File = open('XY_Output.txt', 'r')
XY = XY_File.readlines()
XY_File.close()

Xf = []
Yf = []
for line in XY:
    # each line holds a tab-separated x value and y value
    Xf.append(float(line.split('\t')[0]))
    Yf.append(float(line.split('\t')[1]))
x = array(Xf)
y = array(Yf)

Is there a problem with my arrays? This same code worked when put into this example, but I'm not too sure.

Why am I getting this MemoryError and how can I fix this?

Answered by Andrew Jaffe

Your call to meshgrid requires a lot of memory: it produces two 8823*8823 floating-point arrays. Each of them is about 0.6 GB.
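
As a quick sanity check (my addition, not part of the original answer), the 0.6 GB figure follows directly from the array size and the 8-byte width of float64:

```python
# Back-of-the-envelope check of the memory claim above: each full
# 8823 x 8823 float64 meshgrid output costs 8 bytes per element.
n = 8823
bytes_per_array = n * n * 8              # one float64 array
gb_per_array = bytes_per_array / 1e9
print(gb_per_array)                      # ~0.62 GB per array, ~1.25 GB for both
```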

But your screen can't show (and your eye can't really process) that much information anyway, so you should probably think of a way to smooth your data to something more reasonable like 1024*1024 before you do this step.
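
One concrete way to do that reduction (a sketch of my own, not from the original answer; the bin count and random data are placeholders for the asker's arrays) is np.histogram2d, which bins the scatter points onto a modest grid without ever materializing the full N*N meshgrids:

```python
import numpy as np

# Stand-in for the ~8823 scatter points from the question.
rng = np.random.default_rng(0)
x = rng.normal(size=8823)
y = rng.normal(size=8823)

# Bin the points onto a 256x256 grid; heatmap[i, j] counts the points
# falling in each cell, which is exactly what a heatmap plot needs.
heatmap, xedges, yedges = np.histogram2d(x, y, bins=256)
print(heatmap.shape)        # (256, 256)
print(int(heatmap.sum()))   # 8823 -- every point lands in exactly one bin
```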

Answered by jtaylor

In numpy 1.7.0 and newer, meshgrid has the sparse keyword argument. A sparse meshgrid is set up so that it broadcasts to a full meshgrid when used. This can save large amounts of memory, e.g. when using the meshgrid to index arrays.

In [2]: np.meshgrid(np.arange(10), np.arange(10), sparse=True)
Out[2]: 
[array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]), array([[0],
    [1],
    [2],
    [3],
    [4],
    [5],
    [6],
    [7],
    [8],
    [9]])]
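
To illustrate the broadcasting the answer describes (my own sketch): the two sparse grids have shapes (1, N) and (N, 1), so any elementwise operation between them produces the full (N, N) result on demand, without storing two dense grids up front:

```python
import numpy as np

# Sparse grids: a row vector of x-values and a column vector of y-values.
xs, ys = np.meshgrid(np.arange(10), np.arange(10), sparse=True)
print(xs.shape, ys.shape)   # (1, 10) (10, 1)

# Combining them broadcasts to the full grid only when needed.
z = xs + ys
print(z.shape)              # (10, 10)
```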

Another option is to use smaller integers that are still able to represent the range:

np.meshgrid(np.arange(10).astype(np.int8), np.arange(10).astype(np.int8),
            sparse=True, copy=False)
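
The saving from the smaller dtype can be checked with nbytes (my addition; note that numpy's default integer size is platform dependent, typically 8 bytes on 64-bit Linux):

```python
import numpy as np

# int8 sparse grids vs. default-dtype sparse grids: 1 byte per element
# instead of the platform default (usually 8 on 64-bit Linux).
a8, b8 = np.meshgrid(np.arange(10).astype(np.int8),
                     np.arange(10).astype(np.int8),
                     sparse=True, copy=False)
a, b = np.meshgrid(np.arange(10), np.arange(10), sparse=True)
print(a8.nbytes)   # 10 bytes for the int8 grid
print(a.nbytes)    # e.g. 80 bytes with a default int64
```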

Though as of numpy 1.9, using these smaller integers for indexing will be slower, as they will internally be converted back to larger integers in small (np.setbufsize-sized) chunks.

Answered by Charlie Lee

When you call np.meshgrid for a scatter figure, you may need to normalize your data if it is too large to process. Try this module:

# Feature Scaling
from sklearn.preprocessing import StandardScaler
st = StandardScaler()
X = st.fit_transform(X)