pandas: Reading a whole worksheet into a pandas DataFrame with xlwings
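Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow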
Original question: http://stackoverflow.com/questions/34392805/
A whole sheet into a pandas DataFrame with xlwings
Asked by Coolpix
Thanks to pandas, we can read a whole sheet into a DataFrame with the "read_excel" function.
I would like to do the same with xlwings. My workbook is already open, and I don't want to use the read_excel function (which takes too long to execute, by the way); instead I want to use xlwings to load a whole sheet into a DataFrame.
With xlwings we can already load a range into a DataFrame, but that means I have to know the size of the range. I guess there is a better (and quicker!) way to do this, isn't there?
Do you have any ideas on how to do that? Thanks a lot!
Edit: Here is an example of a sheet I would like to transfer into a DataFrame, the way read_excel would.
Name  Point  Time  Power  Test1  Test2  Test3  Test4  ##
Test  0      1     10     4      24     144
             2     20     8      48     288
             3     30     12     72     432
             4     40     16     96     576
             5     50     20     120    720
             6     60     24     144    864
             7     70     28     168    1008
             8     80     32     192    1152
             9     90     36     216    1296
             10    100    40     240    1440
             11    110    44     264    1584
             12    120    48     288    1728
Answered by kateryna
You can use the built-in converters to read it in with one line:
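import pandas as pd
import xlwings as xw

sht = xw.books.active.sheets.active  # assumes the workbook is already open in Excel
df = sht.range('A1').options(pd.DataFrame,
                             header=1,
                             index=False,
                             expand='table').value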
Answered by Coolpix
In fact, I could do something like this:
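import xlwings as xw
import pandas as pd

def GetDataFrame(Sheet, N, M):
    # Note: xw.Workbook and xw.Range(sheet, ...) belong to the older xlwings API
    wb = xw.Workbook.active()  # attach to the workbook that is currently active
    # Read rows 1..N and columns 1..M of the given sheet as a list of rows
    Data = xw.Range(Sheet, (1, 1), (N, M)).value
    Data = pd.DataFrame(Data)
    Data = Data.dropna(how='all', axis=1)  # drop columns that are entirely empty
    Data = Data.dropna(how='all', axis=0)  # drop rows that are entirely empty
    return Data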
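To illustrate what the two dropna calls in GetDataFrame do when N and M are larger than the data, here is a minimal sketch that needs only pandas. The `raw` list is a hypothetical stand-in for the list of rows that `.value` returns, assuming empty cells come back as None.

```python
import pandas as pd

# Hypothetical stand-in for the .value of an oversized range:
# a list of rows, with None in every empty cell.
raw = [
    ["Time", "Power", None],
    [1.0, 10.0, None],
    [2.0, 20.0, None],
    [None, None, None],
]

data = pd.DataFrame(raw)
data = data.dropna(how="all", axis=1)  # drop columns that are entirely empty
data = data.dropna(how="all", axis=0)  # drop rows that are entirely empty

print(data)
```

Only fully empty rows and columns are removed; a row or column with at least one value is kept. The header text still sits in the first data row, because the DataFrame was built without column names.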
Answered by Mike Müller
You can read from multiple sheets with pandas:
import pandas as pd

excel_file = pd.ExcelFile('myfile.xls')
df1 = excel_file.parse('Sheet1')
df2 = excel_file.parse('Sheet2')
So just open one file after the other, read the sheets you want, and process the DataFrames.
Answered by u10079791
Reading a 20 MB Excel file with pandas.read_excel took me a long time, whereas xlwings reads Excel very quickly, so I would rather read with xlwings and convert the result to a DataFrame. I think I have the same need as the asker. The xlwings API has changed over the past four years, so I made some changes to the code from the earlier GetDataFrame answer.
import xlwings as xw
import pandas as pd

def GetDataFrame(wb_file, Sheets_i, N, M):
    wb = xw.books(wb_file)  # reference a workbook that is already open
    # Read rows 1..N and columns 1..M of the chosen worksheet as a list of rows
    Data = wb.sheets[Sheets_i].range((1, 1), (N, M)).value
    Data = pd.DataFrame(Data)
    Data = Data.dropna(how='all', axis=1)  # drop columns that are entirely empty
    Data = Data.dropna(how='all', axis=0)  # drop rows that are entirely empty
    return Data
Answered by TheLittleNaruto
xlwings does provide an API to load a whole sheet: used_range, which covers the entire used part of the sheet. (Of course we don't want the values of unused rows, do we? ;-))
Anyway, here is a code snippet showing how to do it:
import pandas as pd
import xlwings as xw

workbook = xw.Book('some.xlsx')
# used_range.value returns the used cells as a list of rows
sheet1 = workbook.sheets['sheet1'].used_range.value
df = pd.DataFrame(sheet1)
That's all.
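One follow-up on the used_range approach: pd.DataFrame(sheet1) treats the sheet's header row as ordinary data. A minimal sketch of promoting the first row to column names is shown here; `values` is a hypothetical stand-in for used_range.value, assuming the used range spans more than one row and column so that it comes back as a list of rows.

```python
import pandas as pd

# Hypothetical stand-in for sheet.used_range.value, using the first rows
# of the Time and Power columns from the question.
values = [
    ["Time", "Power"],
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]

# Split the header row from the data rows and use it as column names.
header, *rows = values
df = pd.DataFrame(rows, columns=header)

print(df)
```

This gives roughly what read_excel would give for a sheet whose first row holds the column names, without having to know the range size in advance.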