Java: Invalid header reading xls file

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/13949792/

Time: 2020-10-31 14:32:36  Source: igfitidea

Invalid header reading xls file

Tags: java, excel, apache-poi, xls

Asked by dutchman79

I am reading an Excel file on my local system using the POI jar, version 3.7, but I get this error: Invalid header signature; read -2300849302551019537 or in Hex 0xE011BDBFEFBDBFEF, expected -2226271756974174256 or in Hex 0xE11AB1A1E011CFD0.

Opening the xls file with Excel works fine.

The code block where it happens is shown below. Does anybody have an idea?

/**
 * create a new HeaderBlockReader from an InputStream
 *
 * @param stream the source InputStream
 *
 * @exception IOException on errors or bad data
 */
public HeaderBlockReader(InputStream stream) throws IOException {
    // At this point, we don't know how big our
    //  block sizes are
    // So, read the first 32 bytes to check, then
    //  read the rest of the block
    byte[] blockStart = new byte[32];
    int bsCount = IOUtils.readFully(stream, blockStart);
    if(bsCount != 32) {
        throw alertShortRead(bsCount, 32);
    }

    // verify signature
    long signature = LittleEndian.getLong(blockStart, _signature_offset);

    if (signature != _signature) {
        // Is it one of the usual suspects?
        byte[] OOXML_FILE_HEADER = POIFSConstants.OOXML_FILE_HEADER;
        if(blockStart[0] == OOXML_FILE_HEADER[0] &&
            blockStart[1] == OOXML_FILE_HEADER[1] &&
            blockStart[2] == OOXML_FILE_HEADER[2] &&
            blockStart[3] == OOXML_FILE_HEADER[3]) {
            throw new OfficeXmlFileException("The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)");
        }
        if ((signature & 0xFF8FFFFFFFFFFFFFL) == 0x0010000200040009L) {
            // BIFF2 raw stream starts with BOF (sid=0x0009, size=0x0004, data=0x00t0)
            throw new IllegalArgumentException("The supplied data appears to be in BIFF2 format.  "
                    + "POI only supports BIFF8 format");
        }

        // Give a generic error
        throw new IOException("Invalid header signature; read "
                              + longToHex(signature) + ", expected "
                              + longToHex(_signature));
    }

Answered by Felix

Just an idea: if you are using Maven, make sure filtering is set to false in the resource tag. Otherwise Maven tends to corrupt xls files in the copying phase.
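
The signature reported in the question is consistent with this kind of corruption. Below is a minimal sketch, assuming the copy step decoded the binary file as UTF-8 text and wrote it back out; that is an assumption about what happened to the asker's file, not something the question confirms. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class SignatureCorruptionDemo {
    // The 8-byte OLE2 header signature, in the order the bytes appear in the file.
    static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Assumption: the corrupting step treats the binary data as UTF-8 text.
    // Bytes that are not valid UTF-8 are replaced with U+FFFD (EF BF BD) on decode.
    static long signatureAfterUtf8RoundTrip() {
        String asText = new String(OLE2_SIGNATURE, StandardCharsets.UTF_8);
        byte[] corrupted = asText.getBytes(StandardCharsets.UTF_8);
        // Read the first 8 bytes as a little-endian long, as the header check does.
        return ByteBuffer.wrap(corrupted).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        // Prints 0xE011BDBFEFBDBFEF, the "read" value from the question.
        System.out.println("0x" + Long.toHexString(signatureAfterUtf8RoundTrip()).toUpperCase());
    }
}
```

If the value your own exception reports starts with the bytes EF BF BD, a text-mode copy or filtering step somewhere in the build is a likely suspect.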

Answered by Gagravarr

That exception is telling you that your file isn't a valid OLE2-based .xls file.

Being able to open the file in Excel is no real guide - Excel will happily open any file it knows about no matter what the extension is on it. If you take a .csv file and rename it to .xls, Excel will still open it, but the renaming hasn't magically made it be in the .xls format so POI won't open it for you.

If you open the file in Excel and do Save-As, it'll let you write it out as a real Excel file. If you want to know what file it really is, try using Apache Tika: the Tika CLI with --detect ought to be able to tell you.
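
For a quick check without Tika, the first few bytes of the file can be compared against well-known signatures. This is only a rough sketch using the plain JDK, not a replacement for Tika's detection; the class name, method name, and the returned labels are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FileTypeSniffer {
    private static final int[] OLE2 = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    private static final int[] ZIP = {0x50, 0x4B, 0x03, 0x04}; // "PK\3\4", used by .xlsx

    // Returns "OLE2", "ZIP", or "UNKNOWN" based on the leading bytes of the stream.
    static String sniff(InputStream in) throws IOException {
        byte[] head = new byte[8];
        int read = 0;
        while (read < head.length) {
            int n = in.read(head, read, head.length - read);
            if (n < 0) break; // stream ended early
            read += n;
        }
        if (startsWith(head, read, OLE2)) return "OLE2";
        if (startsWith(head, read, ZIP)) return "ZIP";
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, int length, int[] magic) {
        if (length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // A CSV renamed to .xls still starts with plain text, so it is not OLE2.
        byte[] csv = "id,name\n1,foo\n".getBytes("US-ASCII");
        System.out.println(sniff(new ByteArrayInputStream(csv)));
    }
}
```

A result of "ZIP" only says the file is a ZIP container; it may or may not be an .xlsx workbook.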

How can I be sure it's not a valid file? If you look at the OLE2 file format specification doc from Microsoft, and head to section 2.2, you'll see the following:

Header Signature (8 bytes): Identification signature for the compound file structure, and MUST be set to the value 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1.

Flip those bytes round (OLE2 is little endian) and you get 0xE11AB1A1E011CFD0, the magic number from the exception. Your file doesn't start with that magic number, so it really isn't a valid OLE2 document, and hence POI gives you that exception.
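
The byte flip can be reproduced with the JDK's ByteBuffer. This is a small illustration only; the class and method names are hypothetical and this is not how POI itself performs the read.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianSignature {
    // Interprets the first 8 bytes as a little-endian long:
    // the first byte in the file becomes the least significant byte.
    static long toLittleEndianLong(byte[] eightBytes) {
        return ByteBuffer.wrap(eightBytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        // Signature bytes in file order, as listed in the specification.
        byte[] signature = {
            (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
            (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
        };
        // Prints 0xE11AB1A1E011CFD0, the "expected" value from the exception.
        System.out.println("0x" + Long.toHexString(toLittleEndianLong(signature)).toUpperCase());
    }
}
```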

Answered by Yaohui Wu

If your project is a Maven project, the following code may help:

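import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Get input stream of excel.
 * <p>
 *     Get excel from src dir instead of target dir to avoid causing POI header exception.
 * </p>
 * @param fileName file in dir PROJECT_PATH/src/test/resources/excel/ , a leading '/' is not needed.
 * @return an input stream for the file, or null if the file or the project path cannot be resolved
 */
private static InputStream getExcelInputStream(String fileName){
    InputStream inputStream = null;
    try{
        inputStream = new FileInputStream(getProjectPath() + "/src/test/resources/excel/" + fileName);
    }catch (URISyntaxException uriE){
        uriE.printStackTrace();
    }catch (FileNotFoundException fileE){
        fileE.printStackTrace();
    }
    return inputStream;
}

private static String getProjectPath() throws URISyntaxException{
    // Root of the test classpath, e.g. PROJECT_PATH/target/test-classes
    URL url = YourServiceImplTest.class.getResource("/");
    Path classesDir = Paths.get(url.toURI());
    // Go up two levels to reach PROJECT_PATH; unlike subpath(), this keeps the
    // path root, so it also works where the root is not "/".
    return classesDir.getParent().getParent().toString();
}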