如何在 Python 中读取 Excel 文件?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/31420351/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to read an excel file in Python?
提问by packet007
I am newbie to Python.
Basically, I want to write a program to read column D
& E
from an excel file, and calculate the total Incoming
and Outgoing
duration.
我是 Python 的新手。基本上,我想编写一个程序来从 excel 文件中读取列D
& E
,并计算总数Incoming
和Outgoing
持续时间。
Which Python module is used to read excel files and how to process data inside it?
哪个Python模块用于读取excel文件以及如何处理其中的数据?
Excel file:
Excel文件:
D E
Incoming 18
Outgoing 99
Incoming 20
Outgoing 59
Incoming 30
Incoming 40
采纳答案by Michael Moura
there are a couple of options depending on the version of excel you are using.
openpyxl - used for reading Excel 2010 files (ie: .xlsx)
xlrd - used for reading older Excel files (ie: .xls)
根据您使用的 excel 版本,有几个选项。
openpyxl - 用于读取 Excel 2010 文件(即:.xlsx)
xlrd - 用于读取较旧的 Excel 文件(即:.xls)
I have only used xlrd, which you could do something like the below
** Note ** code not tested
我只使用过 xlrd,您可以执行以下操作
** 注意 ** 代码未测试
import xlrd
current_row = 0
sheet_num = 1
input_total = 0
output_total = 0
# path to the file you want to extract data from
src = r'c:\temp\excel sheet.xls'
book = xlrd.open_workbook(src)
# select the sheet that the data resids in
work_sheet = book.sheet_by_index(sheet_num)
# get the total number of rows
num_rows = work_sheet.nrows - 1
while current_row < num_rows:
row_header = work_sheet.cell_value(current_row, 4)
if row_header == 'output':
output_total += work_sheet.cell_value(current_row, 5)
elif row_header == 'input':
input_total += work_sheet.cell_value(current_row, 5)
print output_total
print input_total
回答by brenns10
It seems like simply using Excel's =SUMIF()
function would be sufficient. However, you're asking for a Python solution, so here's a Python solution!
似乎简单地使用 Excel 的=SUMIF()
功能就足够了。但是,您需要一个 Python 解决方案,所以这里有一个 Python 解决方案!
Pandas is a library that provides a DataFrame data structure very similar to an Excel spreadsheet. It provides a read_excel()
function, whose documentation you can find here. Once you have a DataFrame, you could do something like this:
Pandas 是一个提供与 Excel 电子表格非常相似的 DataFrame 数据结构的库。它提供了一个read_excel()
函数,您可以在此处找到其文档。一旦你有一个 DataFrame,你可以做这样的事情:
import pandas as pd
table = pd.read_excel('path-to-spreadsheet.xlsx')
incoming_sum = table.E[table.D == 'Incoming'].sum()
outgoing_sum = table.E[table.D == 'Outgoing'].sum()
You can get Pandas for Python on Windows, but it's a bit difficult. The easiest way is a Scientific Python distribution for Windows, like Anaconda. On Linux, installing pandas
is simple as sudo pip install pandas
.
你可以在 Windows 上获得 Pandas for Python,但这有点困难。最简单的方法是 Windows 的科学 Python 发行版,例如Anaconda。在 Linux 上,安装pandas
很简单,就像sudo pip install pandas
.
回答by raymelfrancisco
Using xlrd 0.9.3in Python 3.4.1:
在 Python 3.4.1 中使用xlrd 0.9.3:
It puts all values from row D
and E
in two separate list
.
它把所有的值从行D
,并E
在两个独立的list
。
It then combines each parallel elements of these lists (simply elements with same index) to a tuple using zip()
.
然后,它使用 将这些列表的每个并行元素(只是具有相同索引的元素)组合到一个元组中zip()
。
Then, these generated tuples are combined to a list
. Using sum()
and list comprehension, incoming_sum
and outgoing_sum
are calculated.
然后,这些生成的元组被组合到一个list
. 使用sum()
和列表理解,incoming_sum
和outgoing_sum
计算。
import xlrd
with xlrd.open_workbook('z.xlsx') as book:
# 0 corresponds for 1st worksheet, usually named 'Book1'
sheet = book.sheet_by_index(0)
# gets col D values
D = [ D for D in sheet.col_values(3) ]
# gets col E values
E = [ E for E in sheet.col_values(4) ]
# combines D and E elements to tuples, combines tuples to list
# ex. [ ('Incoming', 18), ('Outgoing', 99), ... ]
data = list( zip(D, E) )
# gets sum
incoming_sum = sum( tup[1] for tup in data if tup[0] == 'Incoming' )
outgoing_sum = sum( tup[1] for tup in data if tup[0] == 'Outgoing' )
print('Total incoming:', incoming_sum)
print('Total outgoing:', outgoing_sum)
Output:
输出:
Total incoming: 108.0
Total outgoing: 158.0
To install xlrd: (Windows)
安装xlrd: (Windows)
- Downloadhere: https://pypi.python.org/pypi/xlrd
Extractto any directory, then change
cmd
's current directory(chdir
) to the directory where you extracted, then type in cmdpython setup.py install
Take note that you will extract
xlrd-0.9.3.tar.gz
two times, first to remove.gz
, second to remove.tar
.The extracted directory (where you will change your cmd's current directory) will look like this:
- 在这里下载:https: //pypi.python.org/pypi/xlrd
解压到任意目录,然后将
cmd
的当前目录(chdir
)更改为解压的目录,然后输入 cmdpython setup.py install
请注意,您将提取
xlrd-0.9.3.tar.gz
两次,第一次删除.gz
,第二次删除.tar
。提取的目录(您将在其中更改 cmd 的当前目录)将如下所示: