
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/3444313/

Date: 2020-10-30 01:53:39  Source: igfitidea

How to convert a Hadoop Path object into a Java File object

Tags: java, file-io, path, hadoop

Asked by akintayo

Is there a way to change a valid, existing Hadoop Path object into a useful Java File object? Is there a nice way of doing this, or do I need to bludgeon the code into submission? The more obvious approaches don't work, and it seems like it would be a common bit of code.


void func(Path p) {
  if (p.isAbsolute()) {
     // Throws IllegalArgumentException when p.toUri() carries an "hdfs" scheme
     File f = new File(p.toUri());
  }
}

This doesn't work because Path#toUri() returns a URI with the "hdfs" scheme, and Java's File(URI uri) constructor only accepts the "file" scheme.

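The scheme restriction can be demonstrated with plain JDK classes, no Hadoop on the classpath. A minimal sketch (the class and method names here are hypothetical, not Hadoop API):

```java
import java.io.File;
import java.net.URI;

public final class SchemeCheck {
    // Returns a File only when the URI uses the "file" scheme;
    // for any other scheme (e.g. "hdfs") the File(URI) constructor would throw,
    // so we return null instead of attempting the conversion.
    public static File toFileIfLocal(URI uri) {
        if ("file".equalsIgnoreCase(uri.getScheme())) {
            return new File(uri);
        }
        return null; // not a local-file URI
    }
}
```

This is the same check the File(URI) constructor performs internally before rejecting non-"file" URIs.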

Is there a way to get Path and File to work together?



Ok, how about a specific limited example.


Path[] paths = DistributedCache.getLocalCacheFiles(job);

DistributedCache is supposed to provide a localized copy of a file, but it returns a Path. I assume that DistributedCache makes a local copy of the file, so that it lives on the local disk. Given this limited example, where hdfs is hopefully not in the equation, is there a way for me to reliably convert a Path into a File?

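Assuming the localized cache paths come back scheme-less (or with a "file" scheme), one defensive way to convert is to check the scheme before constructing the File. A JDK-only sketch with hypothetical names, standing in for Hadoop's Path by accepting a path/URI string:

```java
import java.io.File;
import java.net.URI;

public final class LocalPathGuard {
    // Hypothetical guard: convert only when the string is a bare path
    // (no scheme) or an explicit file: URI, which is what a localized
    // DistributedCache path should look like; reject anything else loudly.
    public static File asLocalFile(String pathOrUri) {
        URI uri = URI.create(pathOrUri);
        String scheme = uri.getScheme();
        if (scheme == null) {
            return new File(pathOrUri);          // bare local path, e.g. /tmp/cache/part-0
        }
        if ("file".equalsIgnoreCase(scheme)) {
            return new File(uri);                // explicit file:// URI
        }
        throw new IllegalArgumentException("not a local path: " + pathOrUri);
    }
}
```

Failing fast on an "hdfs" scheme is safer than silently producing a File that points at nothing on the local disk.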


Accepted answer by Andrzej Doyle

Not that I'm aware of.


To my understanding, a Path in Hadoop represents an identifier for a node in their distributed filesystem. This is a different abstraction from a java.io.File, which represents a node on the local filesystem. It's unlikely that a Path could even have a File representation that would behave equivalently, because the underlying models are fundamentally different.


Hence the lack of translation. I presume, from your assertion that File objects are "[more] useful", that you want an object of this class in order to use existing library methods? For the reasons above, this isn't going to work very well. If it's your own library, you could rewrite it to work cleanly with Hadoop Paths and then convert any Files into Path objects (this direction works, as Paths are a strict superset of Files). If it's a third-party library then you're out of luck; the authors of that method didn't take into account the effects of a distributed filesystem and only wrote that method to work on plain old local files.


Answered by Eli

I recently had this same question, and there really is a way to get a file from a path, but it requires downloading the file temporarily. Obviously, this won't be suitable for many tasks, but if time and space aren't essential for you, and you just need something to work using files from Hadoop, do something like the following:


import java.io.File;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class PathToFileConverter {
    // Copies the file behind some_path into a local temp file
    // that the JVM deletes on exit, and returns that temp file.
    public static File makeFileFromPath(Path some_path, Configuration conf) throws IOException {
        FileSystem fs = FileSystem.get(some_path.toUri(), conf);
        File temp_data_file = File.createTempFile(some_path.getName(), "");
        temp_data_file.deleteOnExit();
        fs.copyToLocalFile(some_path, new Path(temp_data_file.getAbsolutePath()));
        return temp_data_file;
    }
}
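The copy-to-temp-and-delete-on-exit pattern above can be exercised without a cluster. A JDK-only sketch of the same idea (hypothetical names; the byte array stands in for the remote file's contents):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public final class TempCopyDemo {
    // Mirrors the answer's trick with JDK-only types: materialize the
    // "remote" bytes into a temp file that the JVM deletes on exit.
    public static File copyToTemp(byte[] remoteBytes, String name) throws IOException {
        File temp = File.createTempFile(name, "");
        temp.deleteOnExit();
        Files.write(temp.toPath(), remoteBytes);
        return temp;
    }
}
```

Note that File.createTempFile requires a prefix of at least three characters, so a very short some_path.getName() would make the original answer throw as well.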

Answered by James Gawron

If you get a LocalFileSystem


final LocalFileSystem localFileSystem = FileSystem.getLocal(configuration);

you can pass your Hadoop Path object to localFileSystem.pathToFile:


final File localFile = localFileSystem.pathToFile(<your hadoop Path>);