Original question: http://stackoverflow.com/questions/3775695/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python xlrd data extraction
Asked by Hulk
I am using python xlrd (http://scienceoss.com/read-excel-files-from-python/) to read data from an Excel sheet.
Suppose I read a row whose first cell is "Employee name", and further down there is another row whose first cell is also "Employee name".
How can we read the last column starting with the last row which has "Employee name" in the first cell, ignoring the previous ones?
wb = xlrd.open_workbook(file, encoding_override="cp1252")
wb.sheet_names()
sh = wb.sheet_by_index(0)
num_of_rows = sh.nrows
num_of_cols = sh.ncols
valid_xl_format = 0
invalid_xl_format = 0
if num_of_rows != 0:
    for i in range(num_of_rows):
        questions_dict = {}
        for j in range(num_of_cols):
            xl_data = sh.cell(i, j).value
            if xl_data == "Employee name":
                # Regardless of how many "Employee name" are found in the rows' first cell, read only the last "Employee name"
Accepted answer by John Machin
"I am using python xlrd http://scienceoss.com/read-excel-files-from-python/ to read data from an excel sheet"
You need to think about what you are doing, instead of grabbing some blog code and leaving in totally irrelevant stuff like wb.sheet_names() and omitting parts very relevant to your requirement like first_column = sh.col_values(0).
Here's how to find the row_index of the last "whatever" in column A (the first column) -- untested:
import xlrd

wb = xlrd.open_workbook(file_name)
# Why do you think that you need to use encoding_override?
sheet0 = wb.sheet_by_index(0)
tag = u"Employee name" # or u"Emp name" or ...
column_0_values = sheet0.col_values(colx=0)
try:
    # lists have no rindex(), so search a reversed copy and map the position back
    max_tag_row_index = len(column_0_values) - 1 - column_0_values[::-1].index(tag)
    print("last tag %r found at row_index %d" % (tag, max_tag_row_index))
except ValueError:  # list.index raises ValueError when the tag is absent
    print("tag %r not found" % tag)
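Python lists provide index() but not rindex(), so the last-occurrence lookup needs a small workaround. A minimal sketch of one option, using a plain list in place of the values that sheet0.col_values(colx=0) would return; the helper name rindex and the sample data are hypothetical, not part of xlrd:

```python
def rindex(values, tag):
    """Return the index of the last item equal to tag.

    Raises ValueError if tag is absent, mirroring list.index.
    """
    # Search a reversed copy, then map the position back to the original list.
    return len(values) - 1 - values[::-1].index(tag)


# Hypothetical first-column values, standing in for sheet0.col_values(colx=0)
column_0_values = [u"Employee name", u"Alice", u"", u"Employee name", u"Bob"]

try:
    max_tag_row_index = rindex(column_0_values, u"Employee name")
    print("last tag found at row_index %d" % max_tag_row_index)
except ValueError:
    print("tag not found")
```

Raising ValueError keeps the helper consistent with list.index, so the same try/except pattern works for both.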
Now we need to interpret "How can we read the last column starting with the last row which has 'Employee name' in the first cell".
Assuming that "the last column" means the one with column_index == sheet0.ncols - 1, then:
last_colx = sheet0.ncols - 1
required_values = sheet0.col_values(colx=last_colx, start_rowx=max_tag_row_index)
required_cells = sheet0.col_slice(colx=last_colx, start_rowx=max_tag_row_index)
# choose one of the above 2 lines, depending on what you need to do
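To make the effect of that slice concrete, here is a small sketch on in-memory data. It assumes the sheet is rectangular and is represented as a list of rows; the function name last_column_from and the sample values are hypothetical stand-ins, and a real workbook would be read through xlrd instead:

```python
def last_column_from(rows, start_rowx):
    """Return the last cell of every row from start_rowx to the end."""
    return [row[-1] for row in rows[start_rowx:]]


# Hypothetical sheet contents; row 2 is the last one tagged "Employee name"
rows = [
    [u"Employee name", u"Dept", u"Salary"],
    [u"Alice", u"Ops", 100.0],
    [u"Employee name", u"Dept", u"Bonus"],
    [u"Bob", u"IT", 20.0],
]

required_values = last_column_from(rows, start_rowx=2)
print(required_values)
```

Note that the result includes the cell on the tagged row itself; pass start_rowx + 1 to skip it.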
If that's not what you mean (which is quite possible, as it ignores a whole bunch of data; why do you want to read only the last column?), please try to explain with examples what you do mean.
Possibly you want to iterate over the remaining cells:
for rowx in range(max_tag_row_index, sheet0.nrows): # or max_tag_row_index + 1
    for colx in range(0, sheet0.ncols):
        do_something_with_cell_object(sheet0.cell(rowx, colx))
Answer by mechanical_meat
It's difficult to understand exactly what you're asking.
Posting sample data might help make your intent clearer.
Have you tried iterating over the dataset in reverse? For example:
for i in reversed(range(num_of_rows)):
    ...
    if xl_data == "Employee name":
        # do something
        # then break since you've found the final "Employee Name"
        break
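The reverse scan can be sketched end to end on sample data. This is only an illustration under stated assumptions: the sheet is modelled as a list of rows, only the first cell of each row is checked, and the function name find_last_tag_row and the data are hypothetical:

```python
def find_last_tag_row(rows, tag):
    """Scan rows bottom-up; return the index of the last row whose
    first cell equals tag, or None if no row matches."""
    for i in reversed(range(len(rows))):
        if rows[i] and rows[i][0] == tag:
            return i  # first hit from the bottom is the final occurrence
    return None


# Hypothetical sheet contents standing in for the cells read via xlrd
rows = [
    [u"Employee name", u"Salary"],
    [u"Alice", 100.0],
    [u"Employee name", u"Bonus"],
    [u"Bob", 20.0],
]

last_row = find_last_tag_row(rows, u"Employee name")
print(last_row)
```

Returning from the function plays the role of the break in the loop above: the scan stops as soon as the bottom-most match is found.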

