Java Files.walk(), calculate total size
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/22867286/
Asked by Aksel Willgert
I'm trying to calculate the size of the files on my disk. In Java 7 this could be done using Files.walkFileTree, as shown in my answer here.
However, when I try to do this with Java 8 streams, it works for some folders but not for all.
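import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MyMain {
    public static void main(String[] args) throws IOException {
        // Walk everything under c:/ and sum the size of each entry
        long size = Files.walk(Paths.get("c:/")).mapToLong(MyMain::count).sum();
        System.out.println("size=" + size);
    }

    // Size of a single path, or 0 if it cannot be read
    static long count(Path path) {
        try {
            return Files.size(path);
        } catch (IOException | UncheckedIOException e) {
            return 0;
        }
    }
}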
The code above works for the path a:/files/, but for c:/ it throws the following exception:
Exception in thread "main" java.io.UncheckedIOException: java.nio.file.AccessDeniedException: c:\$Recycle.Bin\S-1-5-20
at java.nio.file.FileTreeIterator.fetchNextIfNeeded(Unknown Source)
at java.nio.file.FileTreeIterator.hasNext(Unknown Source)
at java.util.Iterator.forEachRemaining(Unknown Source)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Unknown Source)
at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Unknown Source)
at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
at java.util.stream.LongPipeline.reduce(Unknown Source)
at java.util.stream.LongPipeline.sum(Unknown Source)
at MyMain.main(MyMain.java:16)
I understand where it is coming from and how to avoid it using the Files.walkFileTree API.
But how can this exception be avoided using the Files.walk() API?
Accepted answer by skiwi
No, this exception cannot be avoided.
The exception itself occurs inside the lazy fetch of Files.walk(), which is why you are not seeing it early and why there is no way to circumvent it. Consider the following code:
long size = Files.walk(Paths.get("C://"))
        .peek(System.out::println)
        .mapToLong(this::count)
        .sum();
On my system this prints:
C:\
C:\$Recycle.Bin
Exception in thread "main" java.io.UncheckedIOException: java.nio.file.AccessDeniedException: C:\$Recycle.Bin\S-1-5-18
Because the exception is thrown on the (main) thread at the third entry, all further execution on that thread stops.
I believe this is a design failure: as it stands now, Files.walk is absolutely unusable, because you can never guarantee that there will be no errors when walking over a directory.
One important point is that the stack trace includes the sum() and reduce() operations. This is because the paths are loaded lazily: at the point of reduce(), the bulk of the stream machinery gets called (visible in the stack trace), and only then does it fetch the next path, at which point the UncheckedIOException occurs.
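To make the lazy failure concrete, here is a minimal sketch that does not touch the file system at all. It assumes a hypothetical stand-in, failingWalk() (not part of any JDK API), whose iterator throws an UncheckedIOException when asked for its third element, much like the iterator behind Files.walk() did above. Entries fetched before the failure still flow through peek(), and the exception only surfaces inside the terminal sum() call, where catching it can report the failure but cannot resume the walk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class LazyFailureDemo {
    // Hypothetical stand-in for Files.walk(): the third fetch fails, as an unreadable directory would.
    static Stream<String> failingWalk() {
        Iterator<String> it = new Iterator<String>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                if (i == 2) {
                    throw new UncheckedIOException(new IOException("access denied (simulated)"));
                }
                return i < 3;
            }

            @Override
            public String next() {
                return "entry-" + i++;
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    // Records what the pipeline saw before the terminal operation blew up.
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        try {
            failingWalk().peek(seen::add).mapToLong(String::length).sum();
        } catch (UncheckedIOException e) {
            // The whole sum is lost here; the stream cannot be continued past the failure.
            seen.add("failed: " + e.getCause().getMessage());
        }
        return seen;
    }
}
```

The try/catch has to wrap the terminal operation, because nothing is fetched when the stream is merely created or when peek() and mapToLong() are attached.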
It could possibly be circumvented if you let every walking operation execute on its own thread. But that is not something you would want to be doing anyway.
Also, checking beforehand whether a file is actually accessible is not a reliable fix (though it is useful to some extent), because you cannot guarantee that the file is still readable even 1 ms later.
Future extension
I believe it can still be fixed, though I do not know exactly how FileVisitOptions work.
Currently there is a FileVisitOption.FOLLOW_LINKS; if it operates on a per-file basis, then I would suspect that an option such as FileVisitOption.IGNORE_ON_IOEXCEPTION could also be added. However, we cannot inject that functionality ourselves.
Answer by Anthony Accioly
The short answer is you can't.
The exception is coming from FileTreeWalker.visit.
To be precise, it fails while trying to open a DirectoryStream via Files.newDirectoryStream (this code is out of your control):
// file is a directory, attempt to open it
DirectoryStream<Path> stream = null;
try {
    stream = Files.newDirectoryStream(entry);
} catch (IOException ioe) {
    return new Event(EventType.ENTRY, entry, ioe); // ==> Culprit <==
} catch (SecurityException se) {
    if (ignoreSecurityException)
        return null;
    throw se;
}
Maybe you should submit a bug.
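Since the failing Files.newDirectoryStream call sits inside the JDK's walker, one way to get control back is to make that call yourself. The following is only a sketch under that assumption: TolerantSize and sizeOf are hypothetical names, the recursion is naive (very deep trees could exhaust the stack), and symbolic links are deliberately not followed.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

class TolerantSize {
    // Total size in bytes of the regular files under root, skipping anything that cannot be read.
    static long sizeOf(Path root) {
        if (Files.isDirectory(root, LinkOption.NOFOLLOW_LINKS)) {
            long total = 0;
            try (DirectoryStream<Path> children = Files.newDirectoryStream(root)) {
                for (Path child : children) {
                    total += sizeOf(child);
                }
            } catch (IOException | DirectoryIteratorException e) {
                // Unreadable directory (for example AccessDeniedException):
                // keep whatever was counted so far and move on.
            }
            return total;
        }
        try {
            // Count regular files only; other entry types contribute nothing.
            return Files.isRegularFile(root, LinkOption.NOFOLLOW_LINKS) ? Files.size(root) : 0;
        } catch (IOException e) {
            return 0;
        }
    }
}
```

Calling TolerantSize.sizeOf(Paths.get("c:/")) would then return a partial total instead of throwing, at the cost of silently ignoring the directories it could not open.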
Answer by Andrejs
I found that using Guava's Files class solved the issue for me:
Iterable<File> files = Files.fileTreeTraverser().breadthFirstTraversal(dir);
long size = toStream( files ).mapToLong( File::length ).sum();
Where toStream is my static utility function to convert an Iterable to a Stream. Just this:
StreamSupport.stream(iterable.spliterator(), false);
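For completeness, the one-liner above can be wrapped into a small generic helper. This is a sketch, assuming a hypothetical holder class named IterableStreams; only StreamSupport.stream and Iterable.spliterator come from the JDK.

```java
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IterableStreams {
    // Wraps any Iterable in a sequential Stream (the 'false' argument means "not parallel").
    static <T> Stream<T> toStream(Iterable<T> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }
}
```

With this in place, toStream(files).mapToLong(File::length).sum() from the snippet above compiles as written once toStream is statically imported.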
Answer by Abhishek Dujari
A 2017 update for those who keep arriving here.
Use Files.walk() when you are certain of the file system behaviour and really want to stop when there is any error. Generally, Files.walk is not useful in standalone apps. I make this mistake so often; perhaps I am lazy. I realize my mistake the moment I see it taking more than a few seconds for something as small as 1 million files.
I recommend walkFileTree. Start by implementing the FileVisitor interface; here I only want to count files. Bad class name, I know.
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class Recurse implements FileVisitor<Path> {

    private long filesCount;

    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        // This is where I need my logic
        filesCount++;
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
        // This is important to note: continuing here skips unreadable entries. Test this behaviour
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        return FileVisitResult.CONTINUE;
    }

    public long getFilesCount() {
        return filesCount;
    }
}
Then use the class you defined like this:
Recurse r = new Recurse();
Files.walkFileTree(Paths.get("G:"), r);
System.out.println("Total files: " + r.getFilesCount());
I am sure you can adapt your own implementation of the FileVisitor<Path> interface to do other things, such as summing file sizes, starting from the example I posted. Refer to the docs for the other methods of this interface.
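As one possible adaptation for the original question (total size rather than file count), the sketch below extends SimpleFileVisitor so that only the relevant callbacks need overriding. SizeVisitor and sizeOf are hypothetical names chosen for illustration, and the choice to continue silently on every failure is an assumption you may not want in your own application.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class SizeVisitor extends SimpleFileVisitor<Path> {

    private long totalBytes;

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        // The attributes were already read by the walker, so no extra Files.size() call is needed.
        if (attrs.isRegularFile()) {
            totalBytes += attrs.size();
        }
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) {
        // Skip entries that cannot be read (for example AccessDeniedException) instead of aborting.
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
        // SimpleFileVisitor would rethrow a non-null exc; ignore it and keep walking.
        return FileVisitResult.CONTINUE;
    }

    long getTotalBytes() {
        return totalBytes;
    }

    // Convenience wrapper: walk the tree under root and return the accumulated size in bytes.
    static long sizeOf(Path root) throws IOException {
        SizeVisitor visitor = new SizeVisitor();
        Files.walkFileTree(root, visitor);
        return visitor.getTotalBytes();
    }
}
```

Usage mirrors the counting example: long size = SizeVisitor.sizeOf(Paths.get("G:"));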
Speed:
- Files.walk: 20+ minutes, then failed with an exception
- Files.walkFileTree: 5.6 seconds, finished with the correct answer
Edit: As with everything, use tests to confirm the behaviour. Handle exceptions: they still occur, except for the ones we chose to ignore above.