Python: Get column data by column name and sheet name

Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/38309256/

Time: 2020-08-19 20:37:59  Source: igfitidea

Get column data by Column name and sheet name

Tags: python, excel, xlrd

Asked by Sillyreduction

Is there a way to access all rows in a column of a specific sheet using Python's xlrd?

For example:

workbook = xlrd.open_workbook('ESC data.xlsx', on_demand=True)
sheet = workbook.sheet['sheetname']
arrayofvalues = sheet['columnname']

Or do I have to create a dictionary myself?

The Excel file is pretty big, so I would love to avoid iterating over all the column names and sheets.

Answered by John Y

Yes, you are looking for the col_values() worksheet method. Instead of

arrayofvalues = sheet['columnname']

you need to do

arrayofvalues = sheet.col_values(columnindex)

where columnindex is the number of the column (counting from zero, so column A is index 0, column B is index 1, etc.). If you have a descriptive heading in the first row (or first few rows), you can pass a second parameter that tells col_values() which row to start from (again, counting from zero). For example, if you have one header row, and thus want values starting in the second row, you could do

arrayofvalues = sheet.col_values(columnindex, 1)
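Since col_values() takes an index rather than a name, one way to get a column by its heading is to read the header row once, find the position of the name, and pass that position to col_values(). The following is only a sketch under some assumptions: the helper name column_by_name is made up for illustration, the headings are assumed to sit in a single row, and FakeSheet is a hypothetical in-memory stand-in that mimics just the row_values() and col_values() methods of an xlrd sheet so the example runs without a spreadsheet file.

```python
def column_by_name(sheet, column_name, header_row=0):
    """Return the values below the heading whose text equals column_name.

    `sheet` is expected to offer row_values(rowx) and
    col_values(colx, start_rowx), as xlrd worksheet objects do.
    """
    headers = sheet.row_values(header_row)
    # list.index raises ValueError if the heading is not present
    column_index = headers.index(column_name)
    # Skip the header row itself by starting one row below it
    return sheet.col_values(column_index, header_row + 1)


class FakeSheet:
    """Hypothetical stand-in for an xlrd sheet (illustration only)."""

    def __init__(self, rows):
        self._rows = rows

    def row_values(self, rowx):
        return list(self._rows[rowx])

    def col_values(self, colx, start_rowx=0):
        return [row[colx] for row in self._rows[start_rowx:]]


# Made-up sample data: one header row followed by two data rows
sheet = FakeSheet([
    ["name", "speed"],
    ["a", 1.0],
    ["b", 2.5],
])

speeds = column_by_name(sheet, "speed")
print(speeds)
```

With a real workbook, the sheet would presumably come from workbook.sheet_by_name('sheetname') instead of FakeSheet. Only one row is read to locate the heading, so the whole sheet is not iterated in Python code.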

Please check out the tutorial for a reasonably readable discussion of the xlrd package. (The official xlrd documentation is harder to read.)

Also note that (1) while you are free to use the name arrayofvalues, what you are really getting is a Python list, which technically isn't an array, and (2) the on_demand workbook parameter has no effect when working with .xlsx files, which means xlrd will attempt to load the entire workbook into memory regardless. (The on_demand feature works for .xls files.)

Answered by khelili miliana

This script transforms an xls file into a list of dictionaries, where each dictionary in the list represents one row.

import xlrd

# Open the workbook and take its first worksheet
workbook = xlrd.open_workbook('esc_data.xlsx', on_demand=True)
worksheet = workbook.sheet_by_index(0)

# The first row holds the column headers
first_row = []
for col in range(worksheet.ncols):
    first_row.append(worksheet.cell_value(0, col))

# Transform the worksheet into a list of dictionaries, one per data row
data = []
for row in range(1, worksheet.nrows):
    elm = {}
    for col in range(worksheet.ncols):
        elm[first_row[col]] = worksheet.cell_value(row, col)
    data.append(elm)
print(data)
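
The same row-to-dictionary idea can be written more compactly, and the data can also be turned around into one list per column name, which is closer to what the question asks for. This is a sketch on plain Python lists, not part of the original answer: the function names rows_as_dicts and columns_as_dict and the sample rows are made up for illustration. With xlrd, the rows could presumably be collected with worksheet.row_values(r) for each r in range(worksheet.nrows).

```python
def rows_as_dicts(rows):
    """Turn [header, row1, row2, ...] into a list of {heading: value} dicts."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def columns_as_dict(rows):
    """Turn [header, row1, row2, ...] into {heading: [column values]}."""
    header, *body = rows
    return {name: [row[i] for row in body] for i, name in enumerate(header)}


# Made-up sample rows: the first row is the header
rows = [
    ["name", "speed"],
    ["a", 1.0],
    ["b", 2.5],
]

data = rows_as_dicts(rows)
columns = columns_as_dict(rows)
print(data)
print(columns["speed"])
```

Both helpers walk over every row, so they suit cases where most of the sheet is needed. For a single column, calling col_values() on the sheet is likely the lighter option.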