Java: Convert an InputStream to a File

Notice: This page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/22704876/

Time: 2020-08-13 17:24:00  Source: igfitidea

Convert InputStream to File

Tags: java, web-services, jersey, inputstream, tesseract

Question

I have a REST web service built with Jersey that does OCR (Optical Character Recognition) using Tesseract via the Tess4J Java binding. The Tess4J library expects an image file (png, jpg, tif, among others), but with Jersey processing I get an InputStream that contains the image.

How do I convert this InputStream to a file type that Tesseract would recognise? I've tried the following:

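import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.io.IOUtils;

.....

private static File stream2file(InputStream in) throws IOException {
    // Create a temp file with a ".tmp" suffix and remove it when the JVM exits.
    final File tempFile = File.createTempFile("stream2file", ".tmp");
    tempFile.deleteOnExit();

    // Copy the whole stream to disk; try-with-resources closes the output stream.
    try (FileOutputStream out = new FileOutputStream(tempFile)) {
        IOUtils.copy(in, out);
    }

    return tempFile;
}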

But then the Tesseract library throws an exception saying that it doesn't accept the file type I'm sending (which in this case is 'tmp'). I've tried changing 'tmp' to 'tif' and other supported file types, but that yielded the same result, so I'm obviously missing something here.

So how can I take an InputStream, convert it, and forward it to Tesseract as one of the supported file types that it expects?

Answer by nguyenq

The file extension of the temp file has to match that of the original input image file.
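As a rough illustration of that point, the sketch below writes the stream to a temp file whose suffix is taken from the original file name. The class name StreamToFile and the helpers suffixOf and toTempFile are hypothetical, and it assumes the original file name is available to the caller (for example from the upload metadata); it uses java.nio.file.Files instead of Commons IO only to stay self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class StreamToFile {

    // Hypothetical helper: returns the extension including the dot,
    // or ".tmp" when the name has no dot at all.
    static String suffixOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot >= 0 ? fileName.substring(dot) : ".tmp";
    }

    // Hypothetical helper: copies the stream into a temp file that ends
    // with the given suffix (for example ".png" or ".tif").
    static File toTempFile(InputStream in, String suffix) throws IOException {
        Path temp = Files.createTempFile("stream2file", suffix);
        temp.toFile().deleteOnExit();
        // createTempFile has already created the file, so the copy must
        // be allowed to overwrite it.
        Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        return temp.toFile();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes; a real caller would pass the uploaded image stream.
        byte[] uploaded = {1, 2, 3, 4};
        try (InputStream in = new ByteArrayInputStream(uploaded)) {
            File f = toTempFile(in, suffixOf("scan.tif"));
            System.out.println(f.getName() + " (" + f.length() + " bytes)");
        }
    }
}
```

The suffix only names the file; the bytes still have to be a valid image in that format for the OCR step to succeed.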

Besides the File type, Tess4J also accepts a BufferedImage as input. Just convert the InputStream to one, as follows:

BufferedImage image = ImageIO.read(is);
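A slightly fuller sketch of the same idea follows. The class name StreamToImage and the helper decode are hypothetical; the null check is there because ImageIO.read returns null when no registered reader can decode the data. The hand-off to Tess4J is left out so the example runs with the JDK alone, and TIFF decoding is built into ImageIO only from Java 9 onward.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;

final class StreamToImage {

    // Hypothetical helper: decodes the stream and fails loudly when
    // ImageIO has no reader for the data instead of returning null.
    static BufferedImage decode(InputStream is) throws IOException {
        BufferedImage image = ImageIO.read(is);
        if (image == null) {
            throw new IOException("Unsupported or corrupt image data");
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny in-memory PNG to stand in for the uploaded image.
        BufferedImage source = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        ImageIO.write(source, "png", png);

        BufferedImage decoded = decode(new ByteArrayInputStream(png.toByteArray()));
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight());
        // The decoded image would then be handed to Tess4J (not shown here).
    }
}
```

This route avoids the temp file entirely, so the file-extension question does not arise.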

Answer by petitsauveur

You have an error at this line: try (FileOutputStream out = new FileOutputStream(tempFile)). You should use FileOutputStream(String), not FileOutputStream(File), so it should be FileOutputStream(tempFile.getName()).

The parameter you pass to the constructor of FileOutputStream is a string that is the path to the real file or the name of the file. It's not a File object.
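For reference, java.io.FileOutputStream declares constructors for both a String path and a File, so either form compiles. The sketch below (the class name FileOutputStreamDemo and the helper writeViaFile are hypothetical) writes through the File overload and also compares getName() with getPath(): getName() returns only the last path component, so passing it to the String constructor would target a file in the working directory rather than the temp file.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

final class FileOutputStreamDemo {

    // Hypothetical helper: writes the bytes through the FileOutputStream(File)
    // constructor and reports the resulting file size.
    static long writeViaFile(File target, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(target)) {
            out.write(data);
        }
        return target.length();
    }

    public static void main(String[] args) throws IOException {
        File tempFile = File.createTempFile("stream2file", ".tmp");
        tempFile.deleteOnExit();

        System.out.println("bytes written: " + writeViaFile(tempFile, new byte[] {1, 2, 3}));
        // getName() drops the directory; getPath() keeps it.
        System.out.println("name: " + tempFile.getName());
        System.out.println("path: " + tempFile.getPath());
    }
}
```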
