64 位 Windows 上 32 位 Python 的内存限制
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/18282867/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must attribute them to the original authors (not me): StackOverFlow
Python 32-bit memory limits on 64bit windows
提问 by Erotemic
I'm getting a memory issue I can't seem to understand.
我遇到了一个似乎无法理解的内存问题。
I'm on a windows 7 64 bit machine with 8GB of memory and running a 32bit python program.
我用的是一台装有 8GB 内存的 64 位 Windows 7 机器,运行的是一个 32 位 Python 程序。
The program reads 5,118 zipped numpy files (npz). Windows reports that the files take up 1.98 GB on disk.
程序会读取 5,118 个压缩的 numpy 文件(npz)。Windows 报告这些文件在磁盘上占用 1.98 GB。
Each npz file contains two pieces of data: 'arr_0' is of type np.float32 and 'arr_1' is of type np.uint8
每个 npz 文件包含两条数据:'arr_0' 的类型为 np.float32,'arr_1' 的类型为 np.uint8
The Python script reads each file, appends its data to two lists, and then closes the file.
Python 脚本读取每个文件,把其中的数据追加到两个列表里,然后关闭该文件。
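For reference, a minimal sketch of the kind of loading loop described here (the directory, the file pattern and the list names are assumptions for illustration, not taken from the original program):
作为参考,下面是这里所描述的那种读取循环的一个最小示意(目录、文件通配符和列表名都是为说明而假设的,并非原程序的内容):

```python
import glob
import numpy as np

float_chunks = []   # the 'arr_0' arrays (np.float32)
byte_chunks = []    # the 'arr_1' arrays (np.uint8)

for path in sorted(glob.glob('data/*.npz')):   # assumed location of the 5,118 files
    with np.load(path) as npz:                 # NpzFile supports the with-statement
        float_chunks.append(npz['arr_0'])
        byte_chunks.append(npz['arr_1'])
    # the npz file is closed here; the arrays stay in the two lists
```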
Around file 4284/5118 the program throws a MemoryException
在读到大约第 4284/5118 个文件时,程序抛出了 MemoryException。
However, the task manager says that the memory usage of python.exe *32 when the error occurs is 1,854,848K ~= 1.8GB. Much less than my 8 GB limit, or the supposed 4GB limit of a 32bit program.
但任务管理器显示,出错时 python.exe *32 的内存占用是 1,854,848K ~= 1.8GB,远低于我那 8GB 的上限,也低于 32 位程序理论上的 4GB 上限。
In the program I catch the memory error and it reports: Each list has length 4285. The first list contains a total of 1,928,588,480 float32's ~= 229.9 MB of data. The second list contains 12,342,966,272 uint8's ~= 1,471.3MB of data.
在程序里我捕获了这个内存错误,它报告:两个列表的长度都是 4285;第一个列表总共包含 1,928,588,480 个 float32,约合 229.9 MB 数据;第二个列表包含 12,342,966,272 个 uint8,约合 1,471.3 MB 数据。
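The kind of report quoted above can be produced by simply summing over the chunks; a sketch, reusing the (assumed) list names from the snippet above:
上面引用的这种统计信息,只需对各个数组块求和即可得到;下面是一个沿用前面示意中(假设的)列表名的示意:

```python
def report(float_chunks, byte_chunks):
    # Summarise how much data was read before the MemoryError was raised.
    n_float = sum(a.size for a in float_chunks)   # total float32 elements
    n_uint8 = sum(a.size for a in byte_chunks)    # total uint8 elements
    print('each list has length %d: %d float32 values, %d uint8 values'
          % (len(float_chunks), n_float, n_uint8))
```

Wrapping the loading loop in try/except MemoryError and calling report() from the handler is enough to produce those numbers.
只要把读取循环包在 try/except MemoryError 里,并在异常处理中调用 report(),就能得到上面那些数字。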
So, everything seems to be checking out, except for the part where I get a memory error. I absolutely have more memory, and the file it crashes on is ~800KB, so it's not failing on reading a huge file.
所以,这些数字看起来都对得上,唯独不该出现内存错误。我明明还有更多内存可用,而且出错时正在读的那个文件只有约 800KB,所以并不是因为读取某个巨大的文件才失败的。
Also, the file isn't corrupted. I can read it just fine, if I don't use up all that memory beforehand.
另外,这个文件也没有损坏。只要我没有事先把内存用光,就能正常读取它。
To make things more confusing, all of this seems to work fine on my Linux machine (although it does have 16GB of memory as opposed to 8GB on my Windows machine), but still, it doesn't seem to be the machine's RAM that is causing this issue.
更让人困惑的是,这一切在我的 Linux 机器上似乎都运行得很好(虽然那台机器有 16GB 内存,而我的 Windows 机器只有 8GB),但问题似乎仍然不是出在机器的物理内存上。
Why is Python throwing a memory error, when I expect that it should be able to allocate another 2GB of data?
既然我认为 Python 应该还能再分配 2GB 的数据,为什么它会抛出内存错误?
回答 by abarnert
I don't know why you think your process should be able to access 4GB. According to Memory Limits for Windows Releases at MSDN, on 64-bit Windows 7, a default 32-bit process gets 2GB.* Which is exactly where it's running out.
我不明白你为什么认为你的进程应该能用到 4GB。根据 MSDN 上的 Memory Limits for Windows Releases(Windows 版本的内存限制)一文,在 64 位 Windows 7 上,默认的 32 位进程只能得到 2GB。* 而这正好就是它耗尽内存的地方。
So, is there a way around this?
那么,有没有办法解决这个问题?
Well, you could make a custom build of 32-bit Python that uses the IMAGE_FILE_LARGE_ADDRESS_AWARE flag, and rebuild numpy and all of your other extension modules. I can't promise that all of the relevant code really is safe to run with the large-address-aware flag; there's a good chance it is, but unless someone's already done it and tested it, "a good chance" is the best anyone is likely to know.
嗯,你可以自己构建一个启用了 IMAGE_FILE_LARGE_ADDRESS_AWARE 标志的 32 位 Python,并重新构建 numpy 以及你用到的所有其他扩展模块。我不能保证所有相关代码在开启大地址感知(large-address-aware)标志后都确实能安全运行;很有可能没问题,但除非有人已经这么做并测试过,否则"很有可能"就是大家所能知道的极限了。
Or, more obviously, just use 64-bit Python instead.
或者,更显而易见的做法是,直接改用 64 位 Python。
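If you are not sure which build you are actually running, the standard library can tell you (this is a general check, not something from the original answer):
如果你不确定自己实际运行的是哪种构建,用标准库就能查到(这是一个通用的检查方法,并非原回答中的内容):

```python
import struct
import sys

print(struct.calcsize('P') * 8, 'bit Python')    # pointer size in bits: 32 or 64
print('64-bit build:', sys.maxsize > 2 ** 32)    # True only for a 64-bit interpreter
```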
The amount of physical RAM is completely irrelevant. You seem to think that you have an "8GB limit" with 8GB of RAM, but that's not how it works. Your system takes all of your RAM plus whatever swap space it needs and divides it up between apps; an app may be able to get 20GB of virtual memory without getting a memory error even on an 8GB machine. And meanwhile, a 32-bit app has no way of accessing more than 4GB, and the OS will use up some of that address space (half of it by default, on Windows), so you can only get 2GB even on an 8GB machine that's not running anything else. (Not that it's possible to ever be "not running anything else" on a modern OS, but you know what I mean.)
物理内存的多少完全无关。你似乎认为有 8GB 内存就意味着有一个"8GB 的限制",但事情并不是这样运作的。系统会把你全部的内存加上它需要的交换空间一起,在各个应用之间分配;即使在一台 8GB 的机器上,一个应用也可能拿到 20GB 的虚拟内存而不报内存错误。与此同时,32 位应用无论如何也访问不了超过 4GB 的地址空间,而且操作系统还会占用其中一部分(在 Windows 上默认是一半),所以哪怕是在一台没有运行其他任何程序的 8GB 机器上,你也只能得到 2GB。(当然,在现代操作系统上不可能真的"不运行其他任何程序",但你明白我的意思。)
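If you want to watch this from inside the process, the virtual size (not the physical RAM use) is the number that hits the ceiling; a sketch using the third-party psutil package (an assumption on my part, not something the answer mentions):
如果你想在进程内部观察这件事,撞到上限的是虚拟地址空间的占用量,而不是物理内存占用;下面是一个使用第三方 psutil 包的示意(这是我的假设,原回答并没有提到它):

```python
import psutil   # third-party: pip install psutil (assumed available)

info = psutil.Process().memory_info()
# vms is the virtual address space in use; this is what runs into the
# ~2GB ceiling of a default 32-bit process on 64-bit Windows.
print('virtual: %.0f MB, resident: %.0f MB' % (info.vms / 2 ** 20, info.rss / 2 ** 20))
```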
So, why does this work on your linux box?
那么,为什么这在你的 Linux 机器上就能正常运行?
Because your linux box is configured to give 32-bit processes 3.5GB of virtual address space, or 3.99GB, or… Well, I can't tell you the exact number, but every distro I've seen for many years has been configured for at least 3.25GB.
因为你的 Linux 机器被配置为给 32 位进程 3.5GB 的虚拟地址空间,或者 3.99GB,或者……具体数字我没法告诉你,但我这些年见过的每一个发行版配置的都至少是 3.25GB。
* Also note that you don't even really get that full 2GB for your data; your program itself takes up part of it. Most of what the OS and its drivers make accessible to your code sits in the other half, but some bits sit in your half, along with every DLL you load and any space they need, and various other things. It doesn't add up to too much, but it's not zero.
* 另外要注意,这 2GB 也并不能全部留给你的数据;你的程序本身也要占掉一部分。操作系统及其驱动程序暴露给你代码的大部分东西位于另一半地址空间,但也有一些落在你这一半里,再加上你加载的每个 DLL 以及它们各自需要的空间,还有其他各种零碎。这些加起来不算太多,但也不是零。