Python error: Unsupported format, or corrupt file: Expected BOF record
Note: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/16504975/
Error: Unsupported format, or corrupt file: Expected BOF record
Asked by user2353003
I am trying to open an xlsx file and just print its contents. I keep running into this error:
import xlrd
book = xlrd.open_workbook("file.xlsx")
print "The number of worksheets is", book.nsheets
print "Worksheet name(s):", book.sheet_names()
print
sh = book.sheet_by_index(0)
print sh.name, sh.nrows, sh.ncols
print
print "Cell D30 is", sh.cell_value(rowx=29, colx=3)
print
for rx in range(5):
    print sh.row(rx)
print
It prints out this error:
raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\xff\xfeT\x00i\x00m\x00'
Thanks
Accepted answer by jmcnamara
The error message relates to the BOF (Beginning of File) record of an XLS file. However, the example shows that you are trying to read an XLSX file.
There are 2 possible reasons for this:
- Your version of xlrd is old and doesn't support reading xlsx files.
- The XLSX file is encrypted and thus stored in the OLE Compound Document format, rather than a zip format, making it appear to xlrd as an older format XLS file.
Double check that you are in fact using a recent version of xlrd. Opening a new XLSX file with data in just one cell should verify that.
However, I would guess that you are encountering the second condition and that the file is encrypted, since you state above that you are already using xlrd version 0.9.2.
XLSX files are encrypted if you explicitly apply a workbook password, but also if you password-protect some of the worksheet elements. As such, it is possible to have an encrypted XLSX file even if you don't need a password to open it.
Update: See @BStew's answer for a third, more probable cause: the file is open in Excel.
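The zip versus OLE distinction described above can be checked directly from the first bytes of the file. The following is a minimal sketch, not part of the original answer: `detect_container` is a hypothetical helper name, it is written for Python 3, and it only looks at the standard zip and OLE2 compound-document signatures.

```python
# Well-known file signatures (magic bytes).
ZIP_MAGIC = b"PK\x03\x04"                         # ordinary .xlsx (zip package)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # .xls, and encrypted .xlsx


def detect_container(path):
    """Return 'zip', 'ole2', or 'unknown' based on the first 8 bytes.

    Hypothetical helper for illustration; it does not validate the
    rest of the file.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"
```

Calling `detect_container("file.xlsx")` on a plain XLSX file should give `'zip'`. A result of `'ole2'` for a file named .xlsx is consistent with the encrypted case, and `'unknown'` suggests the file is not an Excel workbook at all.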
Answer by BStew
There is also a third reason: the file is already open in Excel. This generates the same error.
Answer by Mike Chan
And maybe a fourth reason: you used read_excel to read a csv file. (That's what happened to me...)
Answer by Pluto
You can get this error when the xlsx file is actually html; you can open it with a text editor to verify this. When I got this error I solved it using pandas:
import pandas as pd
df_list = pd.read_html('filename.xlsx')
df = pd.DataFrame(df_list[0])
Answer by Ali Khan
In my case, the issue was with the shared folder itself.
CASE IN POINT: I have a shared folder on WIN2012 Server where the user drops the .xlsx file and then uses my python script to load that xlsx file into a database table.
Even though the user deleted the old file and put in the file that was to be loaded, the BOF error kept mentioning a byte string that contained the user's name, yet the user's name appeared nowhere in any worksheet of the xlsx file. On top of that, when I copied the .xlsx into a newly created folder and ran the script referencing that new folder, it worked.
So in the end, I deleted the shared folder and realized that 5 items got deleted even though only 1 item was visible to me and the user. I think it comes down to my lack of Windows administration skills, but the hidden items were the culprit.
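One possible explanation for hidden items like these (an assumption, not something the answer confirms): while a workbook is open, Excel keeps a small hidden owner file next to it, named with a `~$` prefix, which records the name of the user who has the workbook open. A script that loads every .xlsx file in a folder would pick that file up and fail on it with the BOF error. A minimal Python 3 sketch that skips such files, where `list_real_workbooks` is a hypothetical helper name:

```python
import os


def list_real_workbooks(folder, extensions=(".xlsx", ".xls")):
    """Return workbook paths in `folder`, skipping Office '~$' owner files.

    `extensions` must be a tuple of lowercase suffixes.
    """
    paths = []
    for name in sorted(os.listdir(folder)):
        if name.startswith("~$"):
            continue  # temporary owner/lock file, not a real workbook
        if name.lower().endswith(extensions):
            paths.append(os.path.join(folder, name))
    return paths
```

Passing only the returned paths to the loader keeps leftover owner files out of the import, whether or not they are visible in the file browser.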
Answer by jxshen
I got the same error message. It looked weird to me because the script works for the xlsx files under another folder, and the files are almost the same.
I still don't know why this happened. But finally, I copied all the Excel files to another folder and the script worked. An option to try if none of the above suggestions work for you...
Answer by ken_a
In my case, someone gave me an Excel file ending with the extension ".xls". I tried parsing it with xlrd, and got this error:
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found "blar blar blar"
After working on it for some time, I found that the .xls file was actually a text file. The sender hadn't bothered to create a real Excel binary file but had just put ".xls" on the end of a text file's name.
Maybe it's worth opening the file with a text editor to make sure it is an Excel file. This could have saved me an hour.
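The same check can be done from Python by looking at the bytes quoted in the error message. In the question, the bytes `'\xff\xfeT\x00i\x00m\x00'` start with a UTF-16 little-endian byte order mark and decode to the text "Tim", which points to a text file rather than a workbook. A minimal Python 3 sketch, where `peek_text` and `peek_file` are hypothetical helper names and only BOM-prefixed text is recognized:

```python
import codecs

# Byte order marks that text editors and exporters commonly write.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def peek_text(data):
    """Decode leading bytes that start with a known BOM.

    Returns the decoded text, or None when no BOM is present.
    """
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return data.decode(encoding, errors="replace")
    return None


def peek_file(path, nbytes=64):
    """Apply peek_text to the first `nbytes` bytes of a file."""
    with open(path, "rb") as f:
        return peek_text(f.read(nbytes))
```

If `peek_file` returns readable text, such as a header row or an HTML tag, the file is text with a spreadsheet extension, and a text or CSV reader is the better tool. A return value of None only means no BOM was found, so plain ASCII text files still need a look in a text editor.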

