Python: get column data by column name and sheet name
Original question: http://stackoverflow.com/questions/38309256/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Get column data by Column name and sheet name
Asked by Sillyreduction
Is there a way to access all rows of a column in a specific sheet using Python's xlrd?
For example:
workbook = xlrd.open_workbook('ESC data.xlsx', on_demand=True)
sheet = workbook.sheet['sheetname']
arrayofvalues = sheet['columnname']
Or do I have to build a dictionary myself?
The Excel file is pretty big, so I would love to avoid iterating over all the column names and sheets.
Answered by John Y
Yes, you are looking for the col_values() worksheet method. Instead of
arrayofvalues = sheet['columnname']
you need to do
arrayofvalues = sheet.col_values(columnindex)
where columnindex is the number of the column (counting from zero, so column A is index 0, column B is index 1, etc.). If you have a descriptive heading in the first row (or first few rows), you can give a second parameter that tells which row to start from (again, counting from zero). For example, if you have one header row, and thus want values starting in the second row, you could do
arrayofvalues = sheet.col_values(columnindex, 1)
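Since col_values() takes an index rather than a name, one way to get from a heading to its column is to read the header row once and look up the position of the heading in it. The following is only a minimal sketch under some assumptions: the helper name column_by_header and its header_row parameter are made up for illustration, the headings are assumed to sit in a single row, and the sheet object is assumed to behave like an xlrd sheet (row_values() and col_values()).

```python
def column_by_header(sheet, header, header_row=0):
    """Return the values below `header` in the column whose heading matches it.

    `sheet` is assumed to behave like an xlrd sheet, i.e. to provide
    row_values(rowx) and col_values(colx, start_rowx).
    column_by_header is a hypothetical helper, not part of xlrd.
    """
    headings = sheet.row_values(header_row)
    try:
        column_index = headings.index(header)  # first matching heading wins
    except ValueError:
        raise KeyError("no column headed %r" % (header,)) from None
    # Start one row below the header row so the heading itself is skipped.
    return sheet.col_values(column_index, header_row + 1)


# Possible usage with xlrd (file, sheet, and column names are placeholders):
#   import xlrd
#   workbook = xlrd.open_workbook('ESC data.xls')
#   sheet = workbook.sheet_by_name('sheetname')
#   arrayofvalues = column_by_header(sheet, 'columnname')
```

This reads only one row to find the column, so it does not iterate over the whole sheet; if two columns share a heading, only the first is returned.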
Please check out the tutorial for a reasonably readable discussion of the xlrd package. (The official xlrd documentation is harder to read.)
Also note that (1) while you are free to use the name arrayofvalues, what you are really getting is a Python list, which technically isn't an array, and (2) the on_demand workbook parameter has no effect when working with .xlsx files, which means xlrd will attempt to load the entire workbook into memory regardless. (The on_demand feature works for .xls files.)
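For an .xls file, where on_demand does apply, a sheet is only loaded when it is first requested and can be dropped again with the workbook's unload_sheet() method. As a rough sketch of that pattern (the helper name read_column_then_unload is invented for illustration, and the workbook is assumed to be an xlrd workbook opened with on_demand=True):

```python
def read_column_then_unload(workbook, sheet_name, column_index, start_row=0):
    """Load one sheet, copy out one column, then release that sheet.

    `workbook` is assumed to behave like an xlrd workbook opened with
    on_demand=True, i.e. to provide sheet_by_name() and unload_sheet().
    read_column_then_unload is a hypothetical helper, not part of xlrd.
    """
    # With on_demand=True on an .xls file, the sheet is loaded at this call.
    sheet = workbook.sheet_by_name(sheet_name)
    try:
        # col_values() returns a new list, so it stays valid after unloading.
        return sheet.col_values(column_index, start_row)
    finally:
        workbook.unload_sheet(sheet_name)


# Possible usage with xlrd (file and sheet names are placeholders):
#   import xlrd
#   workbook = xlrd.open_workbook('ESC data.xls', on_demand=True)
#   values = read_column_then_unload(workbook, 'sheetname', 0, 1)
#   workbook.release_resources()
```

Whether this saves a meaningful amount of memory depends on how many sheets the workbook has and how large each one is.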
Answered by khelili miliana
This script transforms an .xls file into a list of dictionaries; each dictionary in the list represents one row.
import xlrd

# Note: xlrd 2.0 and later read only .xls files.
workbook = xlrd.open_workbook('esc_data.xlsx', on_demand=True)
worksheet = workbook.sheet_by_index(0)

# Header: the values in the first row become the dictionary keys
first_row = []
for col in range(worksheet.ncols):
    first_row.append(worksheet.cell_value(0, col))

# Transform the worksheet into a list of dictionaries, one per data row
data = []
for row in range(1, worksheet.nrows):
    elm = {}
    for col in range(worksheet.ncols):
        elm[first_row[col]] = worksheet.cell_value(row, col)
    data.append(elm)
print(data)