Java NIO FileChannel versus FileOutputStream performance / usefulness
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/1605332/
Java NIO FileChannel versus FileOutputStream performance / usefulness
Asked by Keshav
I am trying to figure out whether there is any difference in performance (or any other advantage) when we use the NIO FileChannel versus the normal FileInputStream/FileOutputStream to read and write files on the filesystem. On my machine both perform at about the same level, and many times the FileChannel way is slower. Could I get more details comparing these two methods? Here is the code I used; the file I am testing with is around 350MB. Is it a good option to use NIO-based classes for file I/O if I am not looking at random access or other such advanced features?
package trialjavaprograms;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class JavaNIOTest {
    public static void main(String[] args) throws Exception {
        useNormalIO();
        useFileChannel();
    }

    private static void useNormalIO() throws Exception {
        File file = new File("/home/developer/test.iso");
        File oFile = new File("/home/developer/test2");
        long time1 = System.currentTimeMillis();
        InputStream is = new FileInputStream(file);
        FileOutputStream fos = new FileOutputStream(oFile);
        byte[] buf = new byte[64 * 1024];
        int len = 0;
        while((len = is.read(buf)) != -1) {
            fos.write(buf, 0, len);
        }
        fos.flush();
        fos.close();
        is.close();
        long time2 = System.currentTimeMillis();
        System.out.println("Time taken: "+(time2-time1)+" ms");
    }

    private static void useFileChannel() throws Exception {
        File file = new File("/home/developer/test.iso");
        File oFile = new File("/home/developer/test2");
        long time1 = System.currentTimeMillis();
        FileInputStream is = new FileInputStream(file);
        FileOutputStream fos = new FileOutputStream(oFile);
        FileChannel f = is.getChannel();
        FileChannel f2 = fos.getChannel();
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        long len = 0;
        while((len = f.read(buf)) != -1) {
            buf.flip();
            f2.write(buf);
            buf.clear();
        }
        f2.close();
        f.close();
        long time2 = System.currentTimeMillis();
        System.out.println("Time taken: "+(time2-time1)+" ms");
    }
}
Accepted answer by Stu Thompson
My experience with larger file sizes has been that java.nio is faster than java.io. Solidly faster. Like in the >250% range. That said, I am eliminating obvious bottlenecks, which I suggest your micro-benchmark might suffer from. Potential areas for investigating:
The buffer size. The algorithm you basically have is
- copy from disk to buffer
- copy from buffer to disk
My own experience has been that this buffer size is ripe for tuning. I've settled on 4KB for one part of my application, 256KB for another. I suspect your code is suffering with such a large buffer. Run some benchmarks with buffers of 1KB, 2KB, 4KB, 8KB, 16KB, 32KB and 64KB to prove it to yourself.
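One way to run that experiment is sketched below. This is a rough harness and not the answer author's benchmark: the class name BufferSizeBench, the helper copyWithBuffer, and the 8 MB scratch file are all assumptions made for illustration. A single un-warmed pass like this is exactly the kind of micro-benchmark that can mislead, so treat the timings as a starting point only.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical harness: times a plain stream copy at several buffer sizes.
final class BufferSizeBench {

    // Copies src to dest through a byte[] of the given size and returns the bytes copied.
    static long copyWithBuffer(Path src, Path dest, int bufferSize) throws IOException {
        long total = 0;
        try (InputStream in = new FileInputStream(src.toFile());
             OutputStream out = new FileOutputStream(dest.toFile())) {
            byte[] buf = new byte[bufferSize];
            int len;
            while ((len = in.read(buf)) != -1) {
                out.write(buf, 0, len);
                total += len;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: an 8 MB scratch file stands in for the real test file.
        Path src = Files.createTempFile("bench-src", ".bin");
        Path dest = Files.createTempFile("bench-dest", ".bin");
        try {
            Files.write(src, new byte[8 * 1024 * 1024]);
            for (int kb : new int[] {1, 2, 4, 8, 16, 32, 64}) {
                long start = System.nanoTime();
                long copied = copyWithBuffer(src, dest, kb * 1024);
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println(kb + " KB buffer: " + copied + " bytes in " + elapsedMs + " ms");
            }
        } finally {
            Files.deleteIfExists(src);
            Files.deleteIfExists(dest);
        }
    }
}
```

To get numbers that mean something, point it at the real file, put source and destination on different disks, and repeat each size several times.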
Don't perform Java benchmarks that read and write to the same disk.
If you do, then you are really benchmarking the disk, and not Java. I would also suggest that if your CPU is not busy, then you are probably experiencing some other bottleneck.
Don't use a buffer if you don't need to.
Why copy to memory if your target is another disk or a NIC? With larger files, the latency incurred is non-trivial.
Like others have said, use FileChannel.transferTo() or FileChannel.transferFrom(). The key advantage here is that the JVM uses the OS's access to DMA (Direct Memory Access), if present. (This is implementation dependent, but modern Sun and IBM versions on general purpose CPUs are good to go.) What happens is the data goes straight to/from disc, to the bus, and then to the destination... bypassing any circuit through RAM or the CPU.
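A minimal sketch of such a buffer-free copy follows. The class and method names (TransferCopy, copy) are made up for illustration. The loop is there because transferTo() returns the number of bytes actually transferred, which may be fewer than requested. Whether the transfer really avoids a copy through user space depends on the OS and the JVM.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: copies a file with FileChannel.transferTo(), no user-space buffer.
final class TransferCopy {

    // Returns the number of bytes transferred from src to dest.
    static long copy(Path src, Path dest) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dest,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    break; // assumption: give up rather than spin if nothing moves
                }
                position += moved;
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: TransferCopy <src> <dest>");
            return;
        }
        long copied = copy(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Transferred " + copied + " bytes");
    }
}
```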
The web app I spend my days and nights working on is very IO heavy. I've done micro-benchmarks and real-world benchmarks too. The results are up on my blog, have a look-see:
- Real world performance metrics: java.io vs. java.nio
- Real world performance metrics: java.io vs. java.nio (The Sequel)
Use production data and environments
Micro-benchmarks are prone to distortion. If you can, make the effort to gather data from exactly what you plan to do, with the load you expect, on the hardware you expect.
My benchmarks are solid and reliable because they took place on a production system, a beefy system, a system under load, gathered in logs. Not my notebook's 7200 RPM 2.5" SATA drive while I watched intensely as the JVM worked my hard disc.
What are you running on? It matters.
Answer by tangens
My experience is that NIO is much faster with small files. But when it comes to large files, FileInputStream/FileOutputStream is much faster.
Answer by uckelman
If the thing you want to compare is performance of file copying, then for the channel test you should do this instead:
final FileInputStream inputStream = new FileInputStream(src);
final FileOutputStream outputStream = new FileOutputStream(dest);
final FileChannel inChannel = inputStream.getChannel();
final FileChannel outChannel = outputStream.getChannel();
inChannel.transferTo(0, inChannel.size(), outChannel);
inChannel.close();
outChannel.close();
inputStream.close();
outputStream.close();
This won't be slower than buffering yourself from one channel to the other, and will potentially be massively faster. According to the Javadocs:
Many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them.
Answer by Jörn Horstmann
I tested the performance of FileInputStream vs. FileChannel for decoding base64 encoded files. In my experiments I tested rather large files, and traditional IO was always a bit faster than NIO.
FileChannel might have had an advantage in prior versions of the JVM because of synchronization overhead in several IO-related classes, but modern JVMs are pretty good at removing unneeded locks.
Answer by Erkki Nokso-Koivisto
Based on my tests (Win7 64bit, 6GB RAM, Java6), NIO transferFrom is fast only with small files and becomes very slow on larger files. NIO databuffer flip always outperforms standard IO.
Copying 1000 x 2MB
- NIO (transferFrom) ~2300ms
- NIO (direct databuffer 5000b flip) ~3500ms
- Standard IO (buffer 5000b) ~6000ms
Copying 100 x 20MB
- NIO (direct databuffer 5000b flip) ~4000ms
- NIO (transferFrom) ~5000ms
- Standard IO (buffer 5000b) ~6500ms
Copying 1 x 1000MB
- NIO (direct databuffer 5000b flip) ~4500ms
- Standard IO (buffer 5000b) ~7000ms
- NIO (transferFrom) ~8000ms
The transferTo() method works on chunks of a file; it wasn't intended as a high-level file copy method. See: How to copy a large file in Windows XP?
Answer by eckes
If you are not using the transferTo feature or non-blocking features, you will not notice a difference between traditional IO and NIO(2), because the traditional IO maps to NIO.
But if you can use NIO features like transferFrom/To, or want to use Buffers, then of course NIO is the way to go.
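One concrete sense in which the two APIs overlap: the channel returned by FileOutputStream.getChannel() is tied to the same open file as the stream, and the two share a file position. The sketch below only illustrates that documented behaviour; the class name SharedPositionDemo and the helper writeBoth are hypothetical, and it says nothing about how any particular JVM implements either class internally.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo: a stream and its channel write to the same file at a shared position.
final class SharedPositionDemo {

    // Writes 5 bytes through the stream, then 3 through its channel.
    // Returns the channel position observed after each step.
    static long[] writeBoth(File file) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file)) {
            fos.write(new byte[] {1, 2, 3, 4, 5});
            FileChannel channel = fos.getChannel();
            long afterStream = channel.position(); // reflects the stream's 5 bytes

            ByteBuffer buf = ByteBuffer.wrap(new byte[] {6, 7, 8});
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
            long afterChannel = channel.position(); // 5 + 3
            return new long[] {afterStream, afterChannel};
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("shared-position", ".bin");
        try {
            long[] positions = writeBoth(file);
            System.out.println("after stream write:  " + positions[0]);
            System.out.println("after channel write: " + positions[1]);
            System.out.println("file length:         " + file.length());
        } finally {
            file.delete();
        }
    }
}
```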
Answer by antak
Answering the "usefulness" part of the question:
One rather subtle gotcha of using FileChannel over FileOutputStream is that performing any of its blocking operations (e.g. read() or write()) from a thread that's in the interrupted state will cause the channel to close abruptly with java.nio.channels.ClosedByInterruptException.
Now, this could be a good thing if whatever the FileChannel was used for is part of the thread's main function, and the design took this into account.
But it could also be pesky if used by some auxiliary feature such as a logging function. For example, you can find your logging output suddenly closed if the logging function happens to be called by a thread that's also interrupted.
It's unfortunate this is so subtle, because not accounting for it can lead to bugs that affect write integrity.[1][2]
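The gotcha can be reproduced in a few lines. The following is a sketch, with hypothetical names (InterruptedWriteDemo and its two helpers), that sets the current thread's interrupt flag and then writes once through a FileChannel and once through a FileOutputStream. It leans on the documented rule that a blocking channel operation invoked with the interrupt status already set closes the channel and throws ClosedByInterruptException.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo of writing from an already-interrupted thread.
final class InterruptedWriteDemo {

    private static final byte[] LINE = "log line\n".getBytes(StandardCharsets.UTF_8);

    // Returns true if the write failed with ClosedByInterruptException and left the channel closed.
    static boolean channelClosedByInterrupt(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            Thread.currentThread().interrupt(); // simulate an interrupted caller
            try {
                channel.write(ByteBuffer.wrap(LINE));
                return false;
            } catch (ClosedByInterruptException e) {
                return !channel.isOpen();
            } finally {
                Thread.interrupted(); // clear the flag so later code is unaffected
            }
        }
    }

    // Returns true if the same write through a FileOutputStream completes normally.
    static boolean streamSurvivesInterrupt(Path file) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile())) {
            Thread.currentThread().interrupt();
            try {
                out.write(LINE);
                return true;
            } finally {
                Thread.interrupted();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("interrupt-demo", ".log");
        try {
            System.out.println("channel closed by interrupt: " + channelClosedByInterrupt(file));
            System.out.println("stream write succeeded:      " + streamSurvivesInterrupt(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In a logging path, one defensive option is to catch ClosedByInterruptException and reopen the channel; whether that is appropriate depends on how the surrounding code treats interruption.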