Using ServletOutputStream to write very large files in a Java servlet without memory issues
Disclaimer: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same license, link to the original, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/685271/
Asked by Martin
I am using IBM WebSphere Application Server v6 and Java 1.4 and am trying to write large CSV files to the ServletOutputStream for a user to download. Files range from 50-750MB at the moment.
The smaller files aren't causing too much of a problem, but with the larger files it appears that the data is being written into the heap, which then causes an OutOfMemory error and brings down the entire server.
These files can only be served out to authenticated users over HTTPS, which is why I am serving them through a servlet instead of just sticking them in Apache.
The code I am using is (some fluff removed around this):
resp.setHeader("Content-length", "" + fileLength);
resp.setContentType("application/vnd.ms-excel");
resp.setHeader("Content-Disposition","attachment; filename=\"export.csv\"");
FileInputStream inputStream = null;
try
{
inputStream = new FileInputStream(path);
byte[] buffer = new byte[1024];
int bytesRead = 0;
do
{
bytesRead = inputStream.read(buffer, offset, buffer.length);
resp.getOutputStream().write(buffer, 0, bytesRead);
}
while (bytesRead == buffer.length);
resp.getOutputStream().flush();
}
finally
{
if(inputStream != null)
inputStream.close();
}
The FileInputStream doesn't seem to be causing the problem: if I write to another file, or just remove the write completely, memory usage doesn't appear to be a problem.
What I am thinking is that the data passed to resp.getOutputStream().write is being stored in memory until it can be sent through to the client. So the entire file might be read and stored in the resp.getOutputStream(), causing my memory issues and crashing!
I have tried buffering these streams and also tried using Channels from java.nio, none of which seems to make any difference to my memory issues. I have also flushed the OutputStream once per iteration of the loop and after the loop, which didn't help.
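For reference, a channel-based copy along the lines described above might have looked something like the following. This is only a sketch of what was presumably tried; the variable names are assumptions rather than code from the question.

// Sketch of a java.nio channel copy (Java 1.4 compatible).
// Assumes `path` points at the CSV on disk and `resp` is the HttpServletResponse;
// requires java.io.FileInputStream and the java.nio.channels classes.
FileChannel source = new FileInputStream(path).getChannel();
WritableByteChannel target = Channels.newChannel(resp.getOutputStream());
try {
    long position = 0;
    long size = source.size();
    while (position < size) {
        // transferTo may copy fewer bytes than requested, so loop until the file is done
        position += source.transferTo(position, size - position, target);
    }
} finally {
    source.close();
}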
Accepted answer by BalusC
The average decent servletcontainer itself flushes the stream by default every ~2KB. You should really not have the need to explicitly call flush() on the OutputStream of the HttpServletResponse at intervals when sequentially streaming data from one and the same source. In, for example, Tomcat (and WebSphere!) this is configurable as the bufferSize attribute of the HTTP connector.
The average decent servletcontainer also just streams the data in chunks if the content length is unknown beforehand (as per the Servlet API specification!) and if the client supports HTTP 1.1.
The problem symptoms at least indicate that the servletcontainer is buffering the entire stream in memory before flushing. This can mean that the content length header is not set and/or the servletcontainer does not support chunked encoding and/or the client side does not support chunked encoding (i.e. it is using HTTP 1.0).
To fix the one or the other, just set the content length beforehand:
response.setHeader("Content-Length", String.valueOf(new File(path).length()));
Answered by Tom Hawtin - tackline
Does flush work on the output stream?
Really I wanted to comment that you should use the three-arg form of write, as the buffer is not necessarily fully read (particularly at the end of the file(!)). Also a try/finally would be in order, unless you want your server to die unexpectedly.
Answered by james
Unrelated to your memory problems, the while loop should be:
while(bytesRead > 0);
Answered by Kevin Hakanson
I have used a class that wraps the output stream to make it reusable in other contexts. It has worked well for me in getting data to the browser faster, but I haven't looked at the memory implications. (Please pardon my antiquated m_ variable naming.)
import java.io.IOException;
import java.io.OutputStream;

public class AutoFlushOutputStream extends OutputStream {

    protected long m_count = 0;
    protected long m_limit = 4096;
    protected OutputStream m_out;

    public AutoFlushOutputStream(OutputStream out) {
        m_out = out;
    }

    public AutoFlushOutputStream(OutputStream out, long limit) {
        m_out = out;
        m_limit = limit;
    }

    public void write(int b) throws IOException {
        if (m_out != null) {
            m_out.write(b);
            m_count++;
            if (m_limit > 0 && m_count >= m_limit) {
                m_out.flush();
                m_count = 0;
            }
        }
    }
}
Answered by david a.
I'm also not sure if flush() on ServletOutputStream works in this case, but ServletResponse.flushBuffer() should send the response to the client (at least per the 2.3 servlet spec).
ServletResponse.setBufferSize() sounds promising, too.
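As a rough illustration only (not from the answer): setBufferSize() has to be called before any content is written to the response, so it would go right at the start of the download code, for example:

// Sketch only: ask the container for a smaller response buffer so data is
// pushed to the client sooner. The 8KB value is an arbitrary example.
resp.setBufferSize(8 * 1024);  // must happen before any content is written
ServletOutputStream out = resp.getOutputStream();
// ... then stream the file in chunks as shown in the other answers ...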
Answered by Kostas
So, following your scenario, shouldn't you be flush(ing) inside that while loop (on every iteration), instead of outside of it? I would try that, with a bit larger buffer though.
Answered by SteveL
Kevin's class should close the m_out field if it's not null in the close() operator; we don't want to leak things, do we? (See the sketch after this answer.)

As well as the ServletOutputStream.flush() operator, the HttpServletResponse.flushBuffer() operation may also flush the buffers. However, it appears to be an implementation-specific detail as to whether or not these operations have any effect, or whether HTTP content length support is interfering. Remember, specifying content-length is an option on HTTP 1.0, so things should just stream out if you flush things. But I don't see that
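A minimal sketch of the close() override SteveL is suggesting for Kevin's AutoFlushOutputStream (an illustration, not code from either answer):

// Flush any buffered bytes and close the wrapped stream so it is not leaked.
public void close() throws IOException {
    if (m_out != null) {
        m_out.flush();
        m_out.close();
        m_out = null;
    }
}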
Answered by eckes
The while condition does not work; you need to check for -1 before using the result. And please use a temporary variable for the output stream, it's nicer to read and it saves calling getOutputStream() repeatedly.
OutputStream outStream = resp.getOutputStream();
while (true) {
    int bytesRead = inputStream.read(buffer);
    if (bytesRead < 0)
        break;
    outStream.write(buffer, 0, bytesRead);
}
inputStream.close();
outStream.close();
Answered by rooparam
Your code has an infinite loop.
do
{
    bytesRead = inputStream.read(buffer, offset, buffer.length);
    resp.getOutputStream().write(buffer, 0, bytesRead);
}
while (bytesRead == buffer.length);
offset has the same value throughout the loop, so if initially offset = 0, it will remain so in every iteration, which will cause an infinite loop and lead to the OOM error.
Answered by zoki
IBM WebSphere Application Server uses asynchronous data transfer for servlets by default. That means that it buffers the response. If you have problems with large data and OutOfMemory exceptions, try changing the settings on WAS to use synchronous mode.
Setting the WebSphere Application Server WebContainer to synchronous mode
You must also take care to load chunks and flush them. A sample for streaming a large file:
ServletOutputStream os = response.getOutputStream();
FileInputStream fis = new FileInputStream(file);
try {
    int buffSize = 1024;
    byte[] buffer = new byte[buffSize];
    int len;
    while ((len = fis.read(buffer)) != -1) {
        os.write(buffer, 0, len);
        os.flush();
        response.flushBuffer();
    }
} finally {
    // close the input stream as well, so the file handle is not leaked
    fis.close();
    os.close();
}