Read data in Excel column into Python list
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors: StackOverflow
Original: http://stackoverflow.com/questions/45708626/
Asked by user3848207
I am using Python xlwings to read a column of data in Excel 2013. Column A is populated with numbers. To import this column into a Python list py_list, I have the following code:
import xlwings as xw
wb = xw.Book('BookName.xlsm')
sht = wb.sheets['SheetName']
py_list = sht.range('A2:A40').value
The above code works if the column data is populated at A2:A40. However, the column data can keep growing. Data can grow and stretch to A2:A46 or A2:A80. The last row is empty. It is not known ahead of time how many rows of data are in this column.
How can I modify the code to detect the empty cell at the last row so that the range of data can be read into py_list?
I am open to using other python libraries to read the Excel data besides xlwings. I am using python v3.6
Answered by Stael
I say this a lot about reading files in from CSV or Excel, but I would use pandas.
import pandas as pd
df = pd.read_excel('filename.xlsm', sheet_name=0) # can also index sheet by name or fetch all sheets
mylist = df['column name'].tolist()
An alternative would be to use a dynamic formula, using something like OFFSET in Excel instead of 'A2:A40', or perhaps a named range?
Answered by Bitto Bennichan
I know this is an old question, but you can also use openpyxl
from openpyxl import load_workbook
wb = load_workbook("BookName.xlsx") # Work Book
ws = wb['SheetName'] # Work Sheet (get_sheet_by_name is deprecated)
column = ws['A'] # Column
column_list = [cell.value for cell in column]
Notes:
Pandas is an awesome library, but installing it just to read an Excel column into a list is overkill, IMHO.
xlrd is not maintained anymore. From the xlrd GitHub page:
PLEASE NOTE: This library currently has no active maintainers. You are advised to use OpenPyXL instead.
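One thing to watch for with the openpyxl approach: as far as I know, ws['A'] covers every row up to the last used row of the whole sheet, so the list can include the header cell and end with None values if other columns are longer. The sketch below shows one way to tidy that up. The helper name column_values and its skip_header parameter are hypothetical, not part of openpyxl; it only assumes each cell has a .value attribute.

```python
def column_values(cells, skip_header=True):
    """Return cell values as a list, without the header and trailing blanks.

    `cells` is any sequence of objects with a `.value` attribute,
    for example the tuple returned by ws['A'] in openpyxl.
    """
    values = [cell.value for cell in cells]
    if skip_header:
        # Drop the first row (A1), since the data starts at A2 in the question
        values = values[1:]
    # Remove empty cells at the end, but keep any gaps in the middle
    while values and values[-1] is None:
        values.pop()
    return values
```

With the worksheet from the answer, usage would be something like column_list = column_values(ws['A']).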
Answered by user3848207
The key to this question is finding out the number of rows in column A.
The number of rows can be found with this single line using xlwings:
rownum = sht.range('A1').end('down').last_cell.row
One needs to read the API documentation carefully to get the answer.
http://docs.xlwings.org/en/stable/api.html#xlwings.Range
Once the number of rows is found, it is easy to figure out the rest.
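For what it's worth, "the rest" might look roughly like the sketch below. It assumes sht is an xlwings Sheet as in the question and that the data starts at A2. The helper names column_address and read_column are made up for illustration and are not part of xlwings.

```python
def column_address(column, first_row, last_row):
    """Build an A1-style address such as 'A2:A46'."""
    return "{0}{1}:{0}{2}".format(column, first_row, last_row)


def read_column(sht, column="A", first_row=2):
    """Read a column from first_row down to the last populated row.

    `sht` is assumed to be an xlwings Sheet (or anything with the
    same range(...).end('down').row interface).
    """
    start = "{0}{1}".format(column, first_row)
    # end('down') jumps to the last cell of the contiguous block
    last_row = sht.range(start).end("down").row
    values = sht.range(column_address(column, first_row, last_row)).value
    # A single-cell range gives back a scalar, so normalise to a list
    return values if isinstance(values, list) else [values]
```

Usage would be py_list = read_column(sht). Keep in mind that end('down') behaves like Ctrl+Down in Excel, so if the cell directly below the start cell is empty it can jump well past the data.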
Answered by Rohan Chandratre
I found this to be the easiest way to create lists from entire columns in Excel, and it only takes the populated Excel cells.
import pandas as pd
import numpy as np
#Insert complete path to the excel file and index of the worksheet
df = pd.read_excel("PATH.xlsx", sheet_name=0)
# insert the name of the column as a string in brackets
list1 = list(df['Column Header 1'])
list2 = list(df['Column Header 2'])
print(list1)
print(list2)
Answered by eladgl
I went through the xlwings documentation looking for something like this and didn't find it, but you can always try to work around it:
temp = [x for x in xw.Range('A2:A200').value if x is not None] # A200: just pick a row number large enough to cover the data
Or try this:
from itertools import takewhile
# Take values from the top and stop at the first empty cell (None)
temp = list(takewhile(lambda x: x is not None, xw.Range('A2:A70').value))
For line 2 above, I first tried something like this, which does not work:
temp =[lambda x: x for x in xw.Range('D:D').values if x != None else exit()] #or to replace this with quit() but there is no option to break lambdas as far as I know
Another option:
temp = iter(xw.Range('A:A').value)
values = []
a = next(temp, None) # assumes your first cell starts at row 1
while a is not None: # might also want to stop on zeros or '' etc.
    values.append(a)
    a = next(temp, None)
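Note that filtering out None and stopping at the first None are not the same thing if the column has a gap in the middle. A small sketch with made-up values (the cells list is hypothetical sample data, not read from Excel):

```python
from itertools import takewhile

# Hypothetical values, as a range with some blank cells might be returned
cells = [1.0, 2.0, None, 4.0, None, None]

# Filtering keeps every populated cell, including those after a gap
non_empty = [x for x in cells if x is not None]

# takewhile stops at the first blank cell and ignores everything after it
leading = list(takewhile(lambda x: x is not None, cells))

print(non_empty)  # [1.0, 2.0, 4.0]
print(leading)    # [1.0, 2.0]
```

Which one you want depends on whether a blank cell means "end of data" or just a missing value.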