Linux java.io.IOException: error=11

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/8384000/

Time: 2020-08-06 03:07:51  Source: igfitidea

java.io.IOException: error=11

java linux process processbuilder

Asked by mbatchkarov

I am experiencing a weird problem with the Java ProcessBuilder. The code is shown below (in a slightly simplified form):

public class Whatever implements Runnable
{

    public void run() {
        // someIdentifier is a randomly generated string
        String in = someIdentifier + "input.txt";
        String out = someIdentifier + "output.txt";
        ProcessBuilder builder = new ProcessBuilder("./whatever.sh", in, out);
        try {
            Process process = builder.start();
            process.waitFor();
        } catch (IOException e) {
            log.error("Could not launch process. Command: " + builder.command(), e);
        } catch (InterruptedException ex) {
            log.error(ex);
        }
    }

}

whatever.sh reads:

R --slave --args   <whatever1.R >> r.log    

Loads of instances of Whatever are submitted to an ExecutorService of fixed size (35). The rest of the application waits for all of them to finish, which is implemented with a CountDownLatch. Everything runs fine for several hours (Scientific Linux 5.0, java version "1.6.0_24") before throwing the following exception:

java.io.IOException: Cannot run program "./whatever.sh": java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.ProcessBuilder.start(Unknown Source)
... rest of stack trace omitted...
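
For reference, the setup described in the question (many tasks submitted to a fixed pool of 35 threads, with the caller blocking on a CountDownLatch) might look roughly like the sketch below. The class name LatchedPool and the method runAll are hypothetical, and the task body is a placeholder counter rather than the real ProcessBuilder call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration; not the asker's actual code. */
final class LatchedPool {

    /** Runs taskCount placeholder tasks on a fixed pool and returns how many completed. */
    static int runAll(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch done = new CountDownLatch(taskCount);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    try {
                        completed.incrementAndGet(); // placeholder for the real work
                    } finally {
                        done.countDown(); // count down even if the task fails
                    }
                });
            }
            done.await(); // block until every task has counted down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runAll(35, 200));
    }
}
```

With this shape, at most poolSize tasks run at once, so at most that many child processes should be alive at any moment if each task waits for its own child.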

Does anyone have an idea what this means? Based on the google/bing search results for java.io.IOException: error=11, it is not the most common of exceptions and I am completely baffled.

My wild and not so educated guess is that I have too many threads trying to launch the same file at the same time. However, it takes hours of CPU time to reproduce the problem, so I have not tried with a smaller number.

Any suggestions are greatly appreciated.

Accepted answer by sarnold

The error=11 is almost certainly the EAGAIN error code:

$ grep EAGAIN asm-generic/errno-base.h 
#define EAGAIN      11  /* Try again */

The clone(2) system call documents an EAGAIN error return:

   EAGAIN Too many processes are already running.

The fork(2) system call documents two EAGAIN error returns:

   EAGAIN fork() cannot allocate sufficient memory to copy the
          parent's page tables and allocate a task structure for
          the child.

   EAGAIN It was not possible to create a new process because
          the caller's RLIMIT_NPROC resource limit was
          encountered.  To exceed this limit, the process must
          have either the CAP_SYS_ADMIN or the CAP_SYS_RESOURCE
          capability.

If you were really that low on memory, it would almost certainly show in the system logs. Check dmesg(1) output or /var/log/syslog for any potential messages about low system memory. (Other things would break too, so this doesn't seem too plausible.)

It is much more likely that you are running into either the per-user limit on processes or the system-wide maximum number of processes. Perhaps one of your processes isn't properly reaping zombies? This would be very easy to spot by checking ps(1) output over time:

while true ; do ps auxw >> ~/processes ; sleep 10 ; done

(Maybe check every minute or ten minutes if it really does take hours before you're in trouble.)

If you're not reaping zombies, then read up on whatever you must do with ProcessBuilder to have waitpid(2) called to reap your dead children.
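
On the Java side, a minimal sketch of "always wait for the child" could look like the following. It assumes the goal is simply that every launched child is waited on and that its pipes are drained and closed. ChildRunner and run are hypothetical names, and the sketch relies on Java 8 or later, which is newer than the Java 1.6 mentioned in the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not part of the original question's code. */
final class ChildRunner {

    /** Starts the command, drains its output, and returns its exit code. */
    static int run(List<String> command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so one reader suffices
        Process process = null;
        try {
            process = builder.start();
            // Close the child's stdin so it sees EOF instead of blocking on input.
            process.getOutputStream().close();
            // Drain the output so the child cannot block on a full pipe.
            try (InputStream out = process.getInputStream()) {
                byte[] buffer = new byte[8192];
                while (out.read(buffer) != -1) {
                    // discarded here; a real caller might log it instead
                }
            }
            return process.waitFor(); // blocks until the child has exited
        } catch (IOException e) {
            throw new UncheckedIOException("Could not run " + command, e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("Interrupted while waiting for " + command, e);
        } finally {
            if (process != null) {
                process.destroy(); // harmless after a normal exit; stops the child on error paths
            }
        }
    }

    public static void main(String[] args) {
        int exitCode = run(Arrays.asList("sh", "-c", "exit 3"));
        System.out.println("exit code: " + exitCode);
    }
}
```

Note that this only covers the direct child. Anything the shell script itself starts in the background is outside what waitFor() observes.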

If you're legitimately running more processes than your rlimits allow, you'll need to use ulimit in your bash(1) scripts (if running as root) or set higher limits in /etc/security/limits.conf for the nproc property.

If you are instead running into the system-wide process limits, you might need to write a larger value into /proc/sys/kernel/pid_max. See proc(5) for some (short) details.
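
To see which limits the JVM is actually running under, one option is to read them from /proc inside the application. The sketch below is an assumption-laden illustration: ProcessLimits and softMaxProcesses are hypothetical names, and /proc/self/limits only exists on reasonably recent Linux kernels, so it may well be missing on an older system such as the one in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper for inspecting process limits on Linux. */
final class ProcessLimits {

    /** Returns the soft "Max processes" value from /proc/<pid>/limits lines, or null if absent. */
    static String softMaxProcesses(List<String> limitsLines) {
        String label = "Max processes";
        for (String line : limitsLines) {
            if (line.startsWith(label)) {
                // Remaining columns are: soft limit, hard limit, units.
                String[] fields = line.substring(label.length()).trim().split("\\s+");
                return fields[0];
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Path limits = Paths.get("/proc/self/limits");
        Path pidMax = Paths.get("/proc/sys/kernel/pid_max");
        if (Files.isReadable(limits)) {
            List<String> lines = Files.readAllLines(limits, StandardCharsets.UTF_8);
            System.out.println("nproc soft limit: " + softMaxProcesses(lines));
        }
        if (Files.isReadable(pidMax)) {
            String value = new String(Files.readAllBytes(pidMax), StandardCharsets.UTF_8).trim();
            System.out.println("pid_max: " + value);
        }
    }
}
```

Logging these values at startup makes it easier to tell whether a later error=11 lines up with the per-user limit or with the system-wide one.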

Answer by Peter Lawrey

errno 11 means "Resource temporarily unavailable". This is usually a memory problem and can prevent a thread or socket from being created.

errno 12 means "Can't allocate memory". This is a failure to obtain memory in a direct request for memory (rather than for a resource which in turn needs memory).

I would try increasing the swap space of your system which should avoid this issue.