Disclaimer: This page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/45019778/

Date: 2020-08-19 16:37:44  Source: igfitidea

Read XLSB File in Pandas Python

python pandas

Asked by Gayatri

There are many questions on this, but there has been no simple answer on how to read an xlsb file into pandas. Is there an easy way to do this?

Accepted answer by Glen Thompson

With the 1.0.0 release of pandas (January 29, 2020), support for binary Excel files was added.

import pandas as pd
df = pd.read_excel('path_to_file.xlsb', engine='pyxlsb')

Notes:

  • You will need to upgrade pandas: pip3 install pandas --upgrade
  • You will need to install pyxlsb: pip3 install pyxlsb
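
The two requirements in the notes can be checked from Python before calling read_excel. This is only a minimal sketch using the standard library; the helper names pandas_supports_xlsb and check_xlsb_requirements are hypothetical and are not part of pandas or pyxlsb.

```python
import re
from importlib import metadata, util


def pandas_supports_xlsb(version):
    """Return True if a pandas version string is 1.0 or newer."""
    match = re.match(r"(\d+)", version)
    return match is not None and int(match.group(1)) >= 1


def check_xlsb_requirements():
    """Collect human-readable problems; an empty list means nothing was found."""
    problems = []
    try:
        if not pandas_supports_xlsb(metadata.version("pandas")):
            problems.append("pandas is older than 1.0: pip3 install pandas --upgrade")
    except metadata.PackageNotFoundError:
        problems.append("pandas is not installed: pip3 install pandas")
    # find_spec returns None when a top-level module cannot be located
    if util.find_spec("pyxlsb") is None:
        problems.append("pyxlsb is not installed: pip3 install pyxlsb")
    return problems


for problem in check_xlsb_requirements():
    print(problem)
```

If nothing is printed, pd.read_excel(..., engine='pyxlsb') should be usable in that environment.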

Answer by Finrod Felagund

Hi, actually there is a way. Just use the pyxlsb library.

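import pandas as pd
from pyxlsb import open_workbook as open_xlsb

rows = []

with open_xlsb('some.xlsb') as wb:
    # Sheets can be fetched by 1-based index (or by name)
    with wb.get_sheet(1) as sheet:
        for row in sheet.rows():
            # Each cell exposes its value as .v
            rows.append([item.v for item in row])

# Use the first row as the header and the remaining rows as data
df = pd.DataFrame(rows[1:], columns=rows[0])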
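
The last step above takes the first row as the header, which raises an IndexError if the sheet is empty. A small sketch of the same split that tolerates an empty sheet; split_header is a hypothetical helper name, and the sample rows stand in for values read from a real sheet.

```python
def split_header(rows):
    """Split an iterable of row-value lists into (columns, data_rows)."""
    iterator = iter(rows)
    try:
        columns = next(iterator)
    except StopIteration:
        # No rows at all: return an empty header and no data
        return [], []
    return list(columns), [list(row) for row in iterator]


# Stand-in for the lists of cell values collected from sheet.rows()
sample_rows = [["name", "qty"], ["apple", 3], ["pear", 5]]
columns, data = split_header(sample_rows)
print(columns, data)
```

The result can then be passed on as pd.DataFrame(data, columns=columns).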

UPDATE: As of pandas version 1.0, read_excel() can read binary Excel (.xlsb) files when you pass engine='pyxlsb'.

Source: https://pandas.pydata.org/pandas-docs/version/1.0.0/whatsnew/v1.0.0.html
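
One thing to watch for, as far as I know: pyxlsb does not recognize date cells, so they come back as plain floats (Excel serial numbers), and the pandas documentation notes the same for engine='pyxlsb'. A minimal sketch of the conversion, assuming the workbook uses Excel's default 1900 date system; excel_serial_to_datetime is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Excel's 1900 date system counts days from this epoch. This holds for
# serial numbers from 61 upward, i.e. dates from March 1900 onward.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial date (days since the epoch) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


print(excel_serial_to_datetime(43831))
```

For a DataFrame column this could be applied with something like df['date'].map(excel_serial_to_datetime), where 'date' is a made-up column name; empty cells would need to be handled first.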

Answer by gmar

Pyxlsb is indeed an option for reading xlsb files; however, it is rather limited.

I suggest using the xlwings package, which makes it possible to read and write xlsb files without losing sheet formatting, formulas, etc. in the xlsb file. There is extensive documentation available.

import pandas as pd
import xlwings as xw

app = xw.App()
book = xw.Book('file.xlsb')
sheet = book.sheets('sheet_name')
df = sheet.range('A1').options(pd.DataFrame, expand='table').value
book.close()
app.kill()

'A1' in this case is the starting position of the Excel table. To write to the xlsb file, simply write:

sheet.range('A1').value = df

Answer by Rishabh Kaushik

If you want to read a large binary file, or any Excel file, and split it at a given row index, you can use this code directly:

import pandas as pd
from pyxlsb import open_workbook as open_xlsb

split_row = your_index_number  # row index at which to split the sheet
first_rows = []
second_rows = []
with open_xlsb('Test.xlsb') as wb:
    with wb.get_sheet('Sheet1') as sheet:
        for i, row in enumerate(sheet.rows()):
            values = [item.v for item in row]
            if i < split_row:
                first_rows.append(values)
            else:
                second_rows.append(values)

# The first row of the sheet holds the column names for both parts
first_dataframe = pd.DataFrame(first_rows[1:], columns=first_rows[0])
second_dataframe = pd.DataFrame(second_rows, columns=first_dataframe.columns)
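
If only a window of rows is needed rather than two halves, itertools.islice can cut it out of the row iterator. This is a minimal sketch, assuming sheet.rows() yields rows lazily; take_row_window is a hypothetical helper, and the generator below stands in for a real sheet.

```python
from itertools import islice


def take_row_window(rows, start, stop):
    """Return rows[start:stop] from any iterable of rows.

    islice stops consuming the iterable once `stop` rows have been read,
    so later rows are never touched.
    """
    return [list(row) for row in islice(rows, start, stop)]


# Stand-in for ([item.v for item in row] for row in sheet.rows())
fake_rows = ([i, i * 2] for i in range(10))
window = take_row_window(fake_rows, 2, 4)
print(window)
```

With a real sheet, the header row would still have to be read separately if the window does not start at row 0.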