Ruby Errno::ENOMEM: Cannot allocate memory - cat
Disclaimer: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/15086133/
Errno::ENOMEM: Cannot allocate memory - cat
Asked by Atith
I have a job running in production that processes XML files. There are around 4,000 XML files, totaling 8 to 9 GB.
After processing, we get CSV files as output. I have a cat command that merges all the CSV files into a single file, and it fails with:
Errno::ENOMEM: Cannot allocate memory
on the cat (backtick) command.
Below are a few details:
- System Memory - 4 GB
- Swap - 2 GB
- Ruby : 1.9.3p286
Files are processed using nokogiri and saxbuilder-0.0.8.
There is a block of code that processes the 4,000 XML files and saves the output as CSV (one per XML file). Sorry, I'm not supposed to share it because of company policy.
Below is the code that merges the output files into a single file:
Dir["#{processing_directory}/*.csv"].sort_by {|file| [file.count("/"), file]}.each {|file|
  `cat #{file} >> #{final_output_file}`
}
I took memory consumption snapshots during processing. It consumes almost all of the memory, but it does not fail.
It always fails on the cat command.
My guess is that the backtick call tries to fork a new process, which cannot get enough memory, so it fails.
Please let me know your opinion and any alternatives to this.
Accepted answer by Intrepidd
It seems that your system is running pretty low on memory, and spawning a shell plus calling cat is too much for the little memory left.
If you don't mind losing some speed, you can merge the files in Ruby, with small buffers. This avoids spawning a shell, and you can control the buffer size.
This is untested, but you get the idea:
buffer_size = 4096
# The block form closes the output file even if an error is raised.
File.open(final_output_file, 'w') do |output_file|
  Dir["#{processing_directory}/*.csv"].sort_by {|file| [file.count("/"), file]}.each do |file|
    File.open(file) do |f|
      # Copy in small chunks so memory use stays bounded.
      while buffer = f.read(buffer_size)
        output_file.write(buffer)
      end
    end
  end
end
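As a variation on the loop above, Ruby's core library also provides IO.copy_stream, which streams data between files without loading a whole file into memory and without spawning a shell. This is only a minimal sketch, assuming Ruby 1.9.3 or later; the helper name merge_files and the sample file names are hypothetical, chosen for illustration:

```ruby
require 'tmpdir'

# Hypothetical helper: write each source file to dest_path, in order.
# IO.copy_stream streams the data, so no whole file is held in memory
# and no child process is forked.
def merge_files(source_paths, dest_path)
  File.open(dest_path, 'wb') do |out|
    source_paths.each do |path|
      IO.copy_stream(path, out)
    end
  end
end

# Small demonstration with throwaway files in a temporary directory.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.csv'), "1,2\n")
  File.write(File.join(dir, 'b.csv'), "3,4\n")
  merged = File.join(dir, 'merged.out')
  merge_files(Dir[File.join(dir, '*.csv')].sort, merged)
  puts File.read(merged)
end
```

In the demonstration the merged file is given a name that the *.csv glob does not match, so it cannot be picked up as one of its own inputs.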
Answer by kenorb
You are probably out of physical memory, so double-check that and verify your swap (free -m). If you don't have swap space, create some.
Otherwise, if your memory is fine, the error is most likely caused by shell resource limits. You can check them with ulimit -a.
They can be changed with ulimit, which modifies shell resource limits (see: help ulimit), e.g.
ulimit -Sn unlimited && ulimit -Sl unlimited
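The same limits can also be inspected from inside the Ruby process that fails, which shows the values the job actually runs with. This is a minimal sketch using Process.getrlimit from Ruby's core library; the helper name describe_limit is hypothetical, and the availability of RLIMIT_AS and RLIMIT_NPROC is an assumption (they exist on Linux, but not on every platform), so they are guarded:

```ruby
# Hypothetical helper: format the [soft, hard] pair from Process.getrlimit.
def describe_limit(name, resource)
  soft, hard = Process.getrlimit(resource)
  fmt = lambda { |v| v == Process::RLIM_INFINITY ? 'unlimited' : v.to_s }
  "#{name}: soft=#{fmt.call(soft)} hard=#{fmt.call(hard)}"
end

puts describe_limit('open files', Process::RLIMIT_NOFILE)
# Guard platform-specific constants before using them.
puts describe_limit('address space', Process::RLIMIT_AS) if defined?(Process::RLIMIT_AS)
puts describe_limit('processes', Process::RLIMIT_NPROC) if defined?(Process::RLIMIT_NPROC)
```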
To make these limits persistent, you can create a ulimit settings file with the following shell command:
sudo tee /etc/security/limits.d/01-${USER}.conf <<EOF
${USER} soft core unlimited
${USER} soft fsize unlimited
${USER} soft nofile 4096
${USER} soft nproc 30654
EOF
Or use /etc/sysctl.conf to change the limits globally (man sysctl.conf), e.g.
kern.maxprocperuid=1000
kern.maxproc=2000
kern.maxfilesperproc=20000
kern.maxfiles=50000
Answer by unixs
I had the same problem, but instead of cat it was sendmail (the mail gem).
I found the problem and the solution here, by installing the posix-spawn gem, e.g.
gem install posix-spawn
Here is an example:
a = (1..500_000_000).to_a
require 'posix/spawn'
POSIX::Spawn::spawn('ls')
This time, creating the child process should succeed.
See also: Minimizing Memory Usage for Creating Application Subprocesses at Oracle.

