pandas: converting a whole worksheet into a pandas DataFrame with xlwings

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/34392805/

Time: 2020-09-14 00:24:45  Source: igfitidea

A whole sheet into a pandas DataFrame with xlwings

python excel pandas xlwings

Asked by Coolpix

Thanks to pandas, we can read a whole sheet into a DataFrame with the "read_excel" function.

I would like to do the same thing with xlwings. My workbook is already open, and I don't want to use the read_excel function (which, by the way, takes too long to execute). Instead, I want to use xlwings to load a whole sheet into a DataFrame.

With xlwings we can read a range into a DataFrame, but that means I have to know the size of the range. I guess there is a better (and quicker!) way to do that, isn't there?

Do you have any ideas on how to do that? Thanks a lot!

Edit: Here is an example of a sheet I would like to transfer into a DataFrame, as read_excel would do.

Name  Point  Time  Power  Test1  Test2  Test3  Test4  ##
Test  0      1     10     4      24     144
             2     20     8      48     288
             3     30     12     72     432
             4     40     16     96     576
             5     50     20     120    720
             6     60     24     144    864
             7     70     28     168    1008
             8     80     32     192    1152
             9     90     36     216    1296
             10    100    40     240    1440
             11    110    44     264    1584
             12    120    48     288    1728

Answered by kateryna

You can use the built-in converters to do it in one line:

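import pandas as pd
import xlwings as xw

sht = xw.sheets.active  # assumption: read the active sheet of the open workbook
df = sht.range('A1').options(pd.DataFrame,
                             header=1,
                             index=False,
                             expand='table').value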

Answered by Coolpix

In fact, I could do something like this:

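import xlwings as xw
import pandas as pd

# Note: this uses the older xlwings API (xw.Workbook, xw.Range(sheet, ...)).
def GetDataFrame(Sheet, N, M):
    wb = xw.Workbook.active()  # attach to the workbook that is already open
    # Read the block from cell (1, 1) to cell (N, M) as a list of rows
    Data = xw.Range(Sheet, (1, 1), (N, M)).value
    Data = pd.DataFrame(Data)
    # Drop the columns, then the rows, that are entirely empty
    Data = Data.dropna(how='all', axis=1)
    Data = Data.dropna(how='all', axis=0)
    return Data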

Answered by Mike Müller

You can read from multiple sheets with pandas:

import pandas as pd

excel_file = pd.ExcelFile('myfile.xls')
df1 = excel_file.parse('Sheet1')
df2 = excel_file.parse('Sheet2')

So, just open one file after the other, read from the sheets you want and process the data frames.

Answered by u10079791

Reading a 20M Excel file with pandas.read_excel took me a long time, whereas xlwings reads Excel very quickly. So I would read the data with xlwings and convert it to a DataFrame. I think I have the same need as the original poster. xlwings has changed over the past four years, so I made some changes to the first answerer's code.

import xlwings as xw
import pandas as pd

def GetDataFrame(wb_file, Sheets_i, N, M):
    wb = xw.books(wb_file)  # look up the workbook (already open in Excel) by name
    # Read the block from cell (1, 1) to cell (N, M) of the chosen sheet
    Data = wb.sheets[Sheets_i].range((1, 1), (N, M)).value
    Data = pd.DataFrame(Data)
    # Drop the columns, then the rows, that are entirely empty
    Data = Data.dropna(how='all', axis=1)
    Data = Data.dropna(how='all', axis=0)
    return Data
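
The effect of the two dropna calls can be checked without Excel. The sketch below is only an illustration, assuming the range read comes back as a list of rows with None for empty cells (which is how xlwings reports empty cells by default). The names `raw` and `trim_empty` are hypothetical and not part of xlwings or of the answer above.

```python
import pandas as pd

# Hypothetical result of reading a 4x4 range when only a 2x3 block holds data.
# Empty cells are represented as None.
raw = [
    ["Time", "Power", "Test1", None],
    [1.0, 10.0, 4.0, None],
    [None, None, None, None],
    [None, None, None, None],
]


def trim_empty(values):
    """Drop the columns, then the rows, that contain no data at all."""
    data = pd.DataFrame(values)
    data = data.dropna(how="all", axis=1)  # remove fully empty columns
    data = data.dropna(how="all", axis=0)  # remove fully empty rows
    return data


trimmed = trim_empty(raw)
print(trimmed.shape)
```

Because how='all' is used, only rows and columns that are completely empty are removed, so N and M can safely be larger than the data. Two things to keep in mind: a fully blank row or column inside the data is dropped as well, and the header row stays in the result as row 0 rather than becoming the column names.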

Answered by TheLittleNaruto

xlwings does provide an API to load a whole sheet. To do that, use the used_range API, which reads the whole used part of the sheet. (Of course we don't want the values of unused rows, do we? ;-)) Anyway, here is a code snippet showing how to do it:

import pandas as pd
import xlwings as xw

workbook = xw.Book('some.xlsx')
sheet1 = workbook.sheets['sheet1'].used_range.value
df = pd.DataFrame(sheet1)

That's all.
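
One thing to note: pd.DataFrame called on a plain list of rows numbers the columns 0, 1, 2, ... and keeps the header row as the first data row. A possible way to promote the first row to column names is sketched below. It runs without Excel by using a stand-in list, adapted from the first rows of the sample sheet in the question (the empty Test4 column is left out) and assuming the xlwings defaults of floats for numbers and None for empty cells. `rows` and `rows_to_frame` are hypothetical names introduced for this example.

```python
import pandas as pd

# Stand-in for workbook.sheets['sheet1'].used_range.value: a list of rows,
# with numbers as floats and empty cells as None.
rows = [
    ["Name", "Point", "Time", "Power", "Test1", "Test2", "Test3"],
    ["Test", 0.0, 1.0, 10.0, 4.0, 24.0, 144.0],
    [None, None, 2.0, 20.0, 8.0, 48.0, 288.0],
    [None, None, 3.0, 30.0, 12.0, 72.0, 432.0],
]


def rows_to_frame(rows):
    """Build a DataFrame that uses the first row as the column names."""
    header, *body = rows
    return pd.DataFrame(body, columns=header)


df = rows_to_frame(rows)
print(df)
```

With a real workbook, the value of used_range would be passed to rows_to_frame in place of the stand-in list. This assumes the used range starts at the header row; if the sheet has empty or unrelated rows above the header, the first row of the result will not be the header.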
