如何在 Python 中读取 Excel 文件?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/31420351/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-19 09:58:10  来源:igfitidea点击:

How to read an excel file in Python?

pythonpython-3.x

提问by packet007

I am newbie to Python. Basically, I want to write a program to read column D& Efrom an excel file, and calculate the total Incomingand Outgoingduration.

我是 Python 的新手。基本上,我想编写一个程序来从 excel 文件中读取列D& E,并计算总数IncomingOutgoing持续时间。

Which Python module is used to read excel files and how to process data inside it?

哪个Python模块用于读取excel文件以及如何处理其中的数据?

Excel file:

Excel文件:

D            E
Incoming    18
Outgoing    99
Incoming    20
Outgoing    59
Incoming    30
Incoming    40

采纳答案by Michael Moura

there are a couple of options depending on the version of excel you are using.
openpyxl - used for reading Excel 2010 files (ie: .xlsx)
xlrd - used for reading older Excel files (ie: .xls)

根据您使用的 excel 版本,有几个选项。
openpyxl - 用于读取 Excel 2010 文件(即:.xlsx)
xlrd - 用于读取较旧的 Excel 文件(即:.xls)

I have only used xlrd, which you could do something like the below
** Note ** code not tested

我只使用过 xlrd,您可以执行以下操作
** 注意 ** 代码未测试

import xlrd


current_row = 0
sheet_num = 1
input_total = 0
output_total = 0

# path to the file you want to extract data from
src = r'c:\temp\excel sheet.xls'

book = xlrd.open_workbook(src)

# select the sheet that the data resids in
work_sheet = book.sheet_by_index(sheet_num)

# get the total number of rows
num_rows = work_sheet.nrows - 1

while current_row < num_rows:
    row_header = work_sheet.cell_value(current_row, 4)

    if row_header == 'output':
        output_total += work_sheet.cell_value(current_row, 5)
    elif row_header == 'input':
        input_total += work_sheet.cell_value(current_row, 5)

print output_total
print input_total

回答by brenns10

It seems like simply using Excel's =SUMIF()function would be sufficient. However, you're asking for a Python solution, so here's a Python solution!

似乎简单地使用 Excel 的=SUMIF()功能就足够了。但是,您需要一个 Python 解决方案,所以这里有一个 Python 解决方案!

Pandas is a library that provides a DataFrame data structure very similar to an Excel spreadsheet. It provides a read_excel()function, whose documentation you can find here. Once you have a DataFrame, you could do something like this:

Pandas 是一个提供与 Excel 电子表格非常相似的 DataFrame 数据结构的库。它提供了一个read_excel()函数,您可以在此处找到其文档。一旦你有一个 DataFrame,你可以做这样的事情:

import pandas as pd
table = pd.read_excel('path-to-spreadsheet.xlsx')
incoming_sum = table.E[table.D == 'Incoming'].sum()
outgoing_sum = table.E[table.D == 'Outgoing'].sum()

You can get Pandas for Python on Windows, but it's a bit difficult. The easiest way is a Scientific Python distribution for Windows, like Anaconda. On Linux, installing pandasis simple as sudo pip install pandas.

你可以在 Windows 上获得 Pandas for Python,但这有点困难。最简单的方法是 Windows 的科学 Python 发行版,例如Anaconda。在 Linux 上,安装pandas很简单,就像sudo pip install pandas.

回答by raymelfrancisco

Using xlrd 0.9.3in Python 3.4.1:

在 Python 3.4.1 中使用xlrd 0.9.3

It puts all values from row Dand Ein two separate list.

它把所有的值从行D,并E在两个独立的list

It then combines each parallel elements of these lists (simply elements with same index) to a tuple using zip().

然后,它使用 将这些列表的每个并行元素(只是具有相同索引的元素)组合到一个元组中zip()

Then, these generated tuples are combined to a list. Using sum()and list comprehension, incoming_sumand outgoing_sumare calculated.

然后,这些生成的元组被组合到一个list. 使用sum()和列表理解,incoming_sumoutgoing_sum计算。

import xlrd

with xlrd.open_workbook('z.xlsx') as book:

    # 0 corresponds for 1st worksheet, usually named 'Book1'
    sheet = book.sheet_by_index(0)

    # gets col D values
    D = [ D for D in sheet.col_values(3) ]

    # gets col E values
    E = [ E for E in sheet.col_values(4) ]

    # combines D and E elements to tuples, combines tuples to list
    # ex. [ ('Incoming', 18), ('Outgoing', 99), ... ]
    data = list( zip(D, E) )

    # gets sum
    incoming_sum = sum( tup[1] for tup in data if tup[0] == 'Incoming' )
    outgoing_sum = sum( tup[1] for tup in data if tup[0] == 'Outgoing' )

    print('Total incoming:', incoming_sum)
    print('Total outgoing:', outgoing_sum)

Output:

输出:

Total incoming: 108.0
Total outgoing: 158.0

To install xlrd: (Windows)

安装xlrd: (Windows)

  1. Downloadhere: https://pypi.python.org/pypi/xlrd
  2. Extractto any directory, then changecmd's current directory( chdir) to the directory where you extracted, then type in cmdpython setup.py install

    • Take note that you will extract xlrd-0.9.3.tar.gztwo times, first to remove .gz, second to remove .tar.

    • The extracted directory (where you will change your cmd's current directory) will look like this:enter image description here

  1. 在这里下载https: //pypi.python.org/pypi/xlrd
  2. 解压到任意目录,然后cmd当前目录chdir)更改为解压的目录,然后输入 cmdpython setup.py install

    • 请注意,您将提取xlrd-0.9.3.tar.gz两次,第一次删除.gz,第二次删除.tar

    • 提取的目录(您将在其中更改 cmd 的当前目录)将如下所示:在此处输入图片说明