Java: Too many open file handles

Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/1661322/


Too many open file handles

Tags: java, file-io, solaris

Asked by dlinsin

I'm working on a huge legacy Java application, with a lot of handwritten stuff, which nowadays you'd let a framework handle.

The problem I'm facing right now is that we are running out of file handles on our Solaris server. I'd like to know: what's the best way to track open file handles? Where should I look, and what can cause open file handles to run out?

I cannot debug the application under Solaris, only in my Windows development environment. Is it even reasonable to analyze the open file handles under Windows?

Accepted answer by Benj

One good thing I've found for tracking down unclosed file handles is FindBugs:

http://findbugs.sourceforge.net/

It checks many things, but one of the most useful is its checks on resource open/close operations. It's a static analysis program that runs on your source code, and it's also available as an Eclipse plugin.

Answered by C. Ross

It could certainly give you an idea. Since it's Java, the file open/close mechanics should be implemented similarly (unless one of the JVMs is implemented incorrectly). I would recommend using File Monitor on Windows.

Answered by kdgregory

I would start by asking my sysadmin to get a listing of all open file descriptors for the process. Different systems do this in different ways: Linux, for example, has the /proc/PID/fd directory. I recall that Solaris has a command (maybe pfiles?) that will do the same thing -- your sysadmin should know it.

However, unless you see a lot of references to the same file, an fd list isn't going to help you. If it's a server process, it probably has lots of files (and sockets) open for a reason. The only way to resolve the problem is to adjust the system limit on open files -- you can also check the per-user limit with ulimit, but in most current installations that equals the system limit.

Answered by Benj

On Windows you can look at open file handles using Process Explorer:

http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx

On Solaris you can use "lsof" to monitor the open file handles.

Answered by NickDK

Not a direct answer to your question, but these problems could be the result of releasing file resources incorrectly in your legacy code. For example, if you're working with FileOutputStream classes, make sure the close methods are called in a finally block, as in this example:

// Requires java.io.FileOutputStream and java.io.IOException.
FileOutputStream out = null;
try {
  // Your file-handling code
} catch (IOException e) {
  // Handle the exception
} finally {
  if (out != null) {
    try { out.close(); } catch (IOException e) { }
  }
}
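
On Java 7 and later (which post-dates this answer), try-with-resources performs the same close automatically and avoids the nested try/finally entirely. A minimal sketch, using a hypothetical file name:

import java.io.FileOutputStream;
import java.io.IOException;

public class WriteExample {
    public static void main(String[] args) {
        // try-with-resources: out.close() is invoked automatically,
        // even if the write throws. "data.bin" is a placeholder name.
        try (FileOutputStream out = new FileOutputStream("data.bin")) {
            out.write(42);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}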

Answered by gustafc

To answer the second part of the question:

what can cause open file handles to run out?

Opening a lot of files, obviously, and then not closing them.

The simplest scenario is that the references to whatever objects hold the native handles (e.g., FileInputStream) are thrown away before being closed, which means the files remain open until the objects are finalized.

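For illustration, a minimal sketch of that pattern (readFirstByte is a hypothetical helper, not from the question's application):

import java.io.FileInputStream;
import java.io.IOException;

public class LeakExample {
    // The FileInputStream reference is discarded without close(), so the
    // native descriptor stays open until the object is garbage-collected
    // and finalized, which may happen much later, or never if heap pressure is low.
    static int readFirstByte(String path) throws IOException {
        return new FileInputStream(path).read(); // leaked handle
    }
}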

The other option is that the objects are stored somewhere and not closed. A heap dump might be able to tell you what lingers where (jmap and jhat are included in the JDK, or you can use jvisualvm if you want a GUI). You're probably interested in looking for objects owning FileDescriptors.

Answered by St.Shadow

This little script helped me keep an eye on the count of open files when I needed to test that count. It was used on Linux, so for Solaris you may have to patch it (maybe :) )

#!/bin/bash
# Usage: ./fdwatch.sh <pid>
# Note: the pid argument was lost in the original page's formatting;
# it is restored here as $1.
PID=$1
COUNTER=0
HOW_MANY=0
MAX=0
# do not worry about COUNTER - it is just a flag telling us whether to continue or not
while [ $COUNTER -lt 10 ]; do
    # run while the process with the passed pid is alive
    if [ -r "/proc/$PID" ]; then
        # count how many files it has open
        HOW_MANY=`/usr/sbin/lsof -p $PID | wc -l`
        # output for live monitoring
        echo `date +%H:%M:%S` $HOW_MANY
        # uncomment if you want to save statistics
        #/usr/sbin/lsof -p $PID > ~/autocount/config_lsof_`echo $HOW_MANY`_`date +%H_%M_%S`.txt

        # look for the max value
        if [ $MAX -lt $HOW_MANY ]; then
            let MAX=$HOW_MANY
            echo new max is $MAX
        fi
        # test every second; if you don't need such frequent testing, increase this value
        sleep 1
    else
        echo max count is $MAX
        echo Process was finished
        let COUNTER=11
    fi
done

Also, you can try playing with the JVM option -Xverify:none - it should disable jar verification (if most of the opened files are jars...). For leaks through unclosed FileOutputStreams you can use FindBugs (mentioned above), or try to find the article on how to patch the standard Java FileOutputStream/FileInputStream so you can see who opens files and forgets to close them. Unfortunately, I can't find that article right now, but it does exist :) Also think about increasing the file limit - for up-to-date *nix kernels it is not a problem to handle more than 1024 fds.

Answered by Jay

This may not be practical in your case, but what I did once when I had a similar problem with open database connections was override the "open" function with my own. (Conveniently I already had this function because we had written our own connection pooling.) In my function I then added an entry to a table recording the open. I did a stack trace call and saved the identity of the caller, along with the time of the call and I forget what else. When the connection was released, I deleted the table entry. Then I had a screen where we could dump the list of open entries. You could then look at the time stamps and easily see which connections had been open for unlikely amounts of time, and which functions had done these opens.

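As a rough illustration of the same idea applied to file streams, here is a minimal sketch. TrackedFiles, trackedOpen, and trackedClose are hypothetical names invented for this example, not part of any real API:

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TrackedFiles {
    // Maps each live stream to a record of who opened it and when.
    private static final Map<FileInputStream, String> OPEN =
            new ConcurrentHashMap<FileInputStream, String>();

    public static FileInputStream trackedOpen(String path)
            throws FileNotFoundException {
        FileInputStream in = new FileInputStream(path);
        // Record the open time and the caller's stack trace.
        OPEN.put(in, System.currentTimeMillis() + " "
                + Arrays.toString(new Throwable().getStackTrace()));
        return in;
    }

    public static void trackedClose(FileInputStream in) throws IOException {
        OPEN.remove(in);
        in.close();
    }

    // Dump every entry that is still open, so long-lived handles stand out.
    public static void dumpOpen() {
        for (String entry : OPEN.values()) {
            System.out.println(entry);
        }
    }
}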

From this we were able to quickly track down the couple of functions that were opening connections and failing to close them.

If you have lots of open file handles, the odds are that you're failing to close them when you're done somewhere. You say you've checked for proper try/finally blocks, but I'd suspect that somewhere in the code you either missed a bad one, or you have a function that hangs and never makes it to the finally. I suppose it's also possible that you really are doing proper closes every time you open a file, but you are opening hundreds of files simultaneously. If that's the case, I'm not sure what you can do other than a serious program redesign to manipulate fewer files, or a serious program redesign to queue your file accesses. (At this point I add the usual "without knowing the details of your application", etc.)

Answered by Andrzej Doyle

I would double-check the environment settings on your Solaris box. I believe that by default Solaris only allows 256 file handles per process. For a server application, especially if it's running on a dedicated server, this is very low. Figure on 50 or more descriptors for opening the JRE and library JARs, and then at least one descriptor for each incoming request and database query, probably more, and you can see how this just won't cut the mustard for a serious server.

Have a look at the /etc/system file, for the values of rlim_fd_cur and rlim_fd_max, to see what your system has set. Then consider whether this is reasonable (you can see how many file descriptors are open while the server is running with the lsof command, ideally with the -p [process ID] parameter).

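If you would rather read the numbers from inside the JVM than from lsof, here is a small sketch. It assumes a Sun/Oracle-style JVM that exposes com.sun.management.UnixOperatingSystemMXBean; other VMs may not provide it:

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

public class FdStats {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            // Current open descriptors versus the per-process ceiling.
            System.out.println("open fds: " + unix.getOpenFileDescriptorCount()
                    + " / max: " + unix.getMaxFileDescriptorCount());
        } else {
            System.out.println("fd counts not exposed by this JVM");
        }
    }
}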

Answered by alasdairg

It's worth bearing in mind that open sockets also consume file handles on Unix systems. So it could very well be something like a database connection pool leak (e.g. open database connections not being closed and returned to the pool) that is leading to this issue - certainly I have seen this error before caused by a connection pool leak.
