Pandas: Read specific Excel cell value into a variable
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/43544514/
Asked by QHarr
Situation:
I am using pandas to parse separate Excel (.xlsx) sheets from a workbook with the following setup: Python 3.6.0 and Anaconda 4.3.1 on Windows 7 x64.
Problem:
I have been unable to find out how to set a variable to a specific Excel sheet cell value using pandas, e.g. var = Sheet['A3'].value from 'Sheet2'.
Question:
Is this possible? If so, how?
What I have tried:
I have searched through the pandas documentation on DataFrame and various forums but haven't found an answer to this.
I know I can work around this using openpyxl (where I can specify a cell coordinate), but I want:
- To use pandas, if possible;
- To read in the file only once.
I have imported numpy as well as pandas, so I was able to write:
xls = pd.ExcelFile(filenamewithpath)
data = xls.parse('Sheet1')
dateinfo2 = str(xls.parse('Sheet2', parse_cols = "A", skiprows = 2, nrows = 1, header = None)[0:1]).split('0\n0')[1].strip()
Reading 'Sheet1' into 'data' is fine, as I have a function to collect the range I want.
I am also trying to read in, from a separate sheet ('Sheet2'), the value in cell "A3", and the code I have at present is clunky. It gets the value out as a string, as required, but is in no way pretty. I only want this cell value and as little additional sheet information as possible.
Accepted answer by Yannis P.
Elaborating on @FLab's comment, use something along these lines:
Edit:
Updated the answer to correspond to the updated question, which asks how to read several sheets at once. By passing sheet_name=None to read_excel() you can read all the sheets at once; pandas returns a dict of DataFrames whose keys are the Excel sheet names.
import pandas as pd
In [10]:
df = pd.read_excel('Book1.xlsx', sheet_name=None, header=None)
df
Out[11]:
{u'Sheet1': 0
0 1
1 1, u'Sheet2': 0
0 1
1 2
2 10}
In [13]:
data = df["Sheet1"]
secondary_data = df["Sheet2"]
secondary_data.loc[2,0]
Out[13]:
10
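The lookup above uses zero-based positions (row 2, column 0) where the question asked for an Excel-style reference such as "A3". A minimal sketch of bridging the two follows. The names a1_to_indices and cell_value are hypothetical helpers, not part of pandas, and the sketch assumes the sheet was read with header=None so that positions in the DataFrame line up with rows and columns in the sheet.

```python
import re


def a1_to_indices(ref):
    """Convert an Excel reference such as "A3" to zero-based (row, column)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref.strip())
    if match is None:
        raise ValueError(f"not a single-cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters form a base-26 number with digits A=1 .. Z=26.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def cell_value(frame, ref):
    """Return the value at an Excel-style reference from a DataFrame.

    Assumes the frame was read with header=None, so row 0 of the frame
    is row 1 of the sheet and column 0 is column A.
    """
    row, col = a1_to_indices(ref)
    return frame.iat[row, col]
```

With the dict of DataFrames returned above, cell_value(df["Sheet2"], "A3") would look up the same position as secondary_data.loc[2,0].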
Alternatively, as noted in this post, if your Excel file has several sheets you can pass sheet_name a list of strings naming the sheets to parse, e.g.:
df = pd.read_excel('Book1.xlsx', sheet_name=["Sheet1", "Sheet2"], header=None)
Credits to user6241235 for digging out the last alternative.
Answer by Nilanjan
You can use pandas read_excel, which has a skipfooter argument (spelled skip_footer in older pandas releases). This should work, where skipendrows is the number of rows at the end of the sheet that you want to skip.
data = pd.read_excel(filename, sheet_name='Sheet2', usecols="A", skiprows=2, skipfooter=skipendrows, header=None)
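The same idea of restricting the read to one column and a narrow band of rows can be taken down to a single cell. The following is a sketch under assumptions, not a definitive implementation: single_cell_kwargs and read_cell are hypothetical helper names, and the sketch relies on read_excel accepting usecols as Excel column letters together with skiprows, nrows and header=None, which holds in recent pandas releases.

```python
import re


def single_cell_kwargs(ref):
    """Build read_excel keyword arguments that select one cell, e.g. "A3"."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref.strip())
    if match is None:
        raise ValueError(f"not a single-cell reference: {ref!r}")
    letters, digits = match.groups()
    return {
        "usecols": letters.upper(),   # Excel column letter(s) of the cell
        "skiprows": int(digits) - 1,  # number of rows above the target cell
        "nrows": 1,                   # read only the target row
        "header": None,               # treat that row as data, not a header
    }


def read_cell(io, sheet_name, ref):
    """Read one cell; io may be a file path or an already opened pd.ExcelFile."""
    import pandas as pd  # imported lazily so single_cell_kwargs has no dependency

    frame = pd.read_excel(io, sheet_name=sheet_name, **single_cell_kwargs(ref))
    # The frame is expected to be 1x1; an empty frame raises IndexError here.
    return frame.iat[0, 0]
```

For the case in the question this would be read_cell(xls, 'Sheet2', 'A3'), where xls is the pd.ExcelFile object that was opened once and is reused for both sheets.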
Answer by Arthur D. Howland
Reading an Excel file with pandas returns a DataFrame by default. You don't need an entire table, just one cell. The way I do it is to make that cell a column header, for example:
# Read Excel and select a single cell (and make it a header for a column)
data = pd.read_excel(filename, 'Sheet2', index_col=None, usecols = "C", header = 10, nrows=0)
This returns a DataFrame with one column header and no data rows; because header is zero-based, header = 10 with usecols = "C" points at cell C11. Then isolate that header:
# Extract a value from a list (list of headers)
data = data.columns.values[0]
print (data)