Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/1243545/

Date: 2020-11-03 21:46:10  Source: igfitidea

Programmatically extract data from an Excel spreadsheet

Tags: python, ruby, perl, excel, csv

Asked by dreeves

Is there a simple way, using some common Unix scripting language (Perl/Python/Ruby) or command line utility, to convert an Excel Spreadsheet file to CSV? Specifically, this one:

http://www.econ.yale.edu/~shiller/data/ie_data.xls

And specifically the third sheet of that spreadsheet (the first two being charts).

Answered by (unknown)

There is a really good Perl library for xls reading: Spreadsheet::ParseExcel.

Answered by HerdplattenToni

Maybe xlrd will do the job (in Python).

edit: I should really learn to read questions. But writing csv shouldn't be a huge problem so maybe you can actually use it.
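
The point about writing CSV can be sketched briefly. This is a minimal Python 3 example, not the answerer's code: it assumes xlrd is installed, and the function names and the sheet name passed in are illustrative. It uses only `open_workbook`, `sheet_by_name`, `nrows` and `row_values` from xlrd, and leaves quoting to the standard `csv` module.

```python
import csv


def read_xls_rows(path, sheet_name):
    """Return the rows of one worksheet as lists of cell values."""
    # xlrd is a third-party package; it is imported here so that
    # write_csv below can be used without it.
    import xlrd

    book = xlrd.open_workbook(path)
    sheet = book.sheet_by_name(sheet_name)
    return [sheet.row_values(i) for i in range(sheet.nrows)]


def write_csv(rows, out):
    """Write rows to a text file object; csv.writer quotes cells as needed."""
    csv.writer(out, lineterminator="\n").writerows(rows)


# Example use (file and sheet names are assumptions):
#     with open("ie_data.csv", "w", newline="") as f:
#         write_csv(read_xls_rows("ie_data.xls", "Data"), f)
```

Selecting the sheet by name rather than by position avoids guessing whether chart sheets count toward the index. Note that xlrd returns Excel dates as floating-point serial numbers, so date columns may need converting before they are written out.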

Answered by nosklo

You can use pyExcelerator in Python.

This code (included in the examples folder of pyExcelerator as xls2csv.py) extracts all sheets from the spreadsheets and outputs them to stdout as CSV.

You can easily change the code to do what you want.

The cool thing about pyExcelerator is that you can also use it to write/create Excel xls files, without having Excel installed.

#!/usr/bin/env python
# -*- coding: windows-1251 -*-
# Copyright (C) 2005 Kiseliov Roman

__rev_id__ = """$Id: xls2csv.py,v 1.1 2005/05/19 09:27:42 rvk Exp $"""


from pyExcelerator import *
import sys

me, args = sys.argv[0], sys.argv[1:]


if args:
    for arg in args:
        print >>sys.stderr, 'extracting data from', arg
        for sheet_name, values in parse_xls(arg, 'cp1251'): # parse_xls(arg) -- default encoding
            matrix = [[]]
            print 'Sheet = "%s"' % sheet_name.encode('cp866', 'backslashreplace')
            print '----------------'
            for row_idx, col_idx in sorted(values.keys()):
                v = values[(row_idx, col_idx)]
                if isinstance(v, unicode):
                    v = v.encode('cp866', 'backslashreplace')
                else:
                    v = str(v)
                # Add rows until the matrix reaches this row index.
                while len(matrix) <= row_idx:
                    matrix.append([])

                # Pad the current row with empty cells up to this column.
                while len(matrix[-1]) < col_idx:
                    matrix[-1].append('')

                matrix[-1].extend([v])

            for row in matrix:
                csv_row = ','.join(row)
                print csv_row

else:
    print 'usage: %s (inputfile)+' % me
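
The script above targets Python 2 and joins cells with a plain `','.join`, which does not quote cells that themselves contain commas. Below is a rough Python 3 sketch of the same sparse-to-dense step, writing through the standard `csv` module instead. The function name and the sample cell data are hypothetical; only the `{(row_idx, col_idx): value}` shape is taken from the script.

```python
import csv
import sys


def sparse_to_rows(values):
    """Turn {(row_idx, col_idx): value} into a dense list of rows.

    Missing cells become empty strings; missing rows become empty lists.
    """
    if not values:
        return []
    n_rows = max(r for r, _ in values) + 1
    rows = [[] for _ in range(n_rows)]
    for (r, c), v in sorted(values.items()):
        row = rows[r]
        row.extend([""] * (c - len(row)))  # pad gaps before this column
        row.append(v)
    return rows


# Hypothetical cell data in the shape the script iterates over for one sheet.
cells = {(0, 0): "Date", (0, 2): "P, real", (2, 1): 4.44}
csv.writer(sys.stdout, lineterminator="\n").writerows(sparse_to_rows(cells))
```

Here the cell `P, real` is written in quotes, so it stays a single field when the CSV is read back.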

Answered by Alexis Perrier

For Ruby, the spreadsheet gem is excellent for reading, writing, and modifying Excel files.

https://github.com/zdavatz/spreadsheet

Answered by vikjam

This is quite late to the game, but I thought I'd add another option via Ruby using the gem "roo":

    require 'rubygems'
    require 'roo'

    my_excel_file = Excelx.new("path/to/my_excel_file.xlsx")
    my_excel_file.default_sheet = my_excel_file.sheets[2]
    my_excel_file.to_csv("path/to/my_excel_file.csv")

Answered by Jasim

In Ruby, here is the code I use (it requires the excellent parseexcel gem):

require 'parseexcel'

def excelGetSheet(worksheet)
    sheet=Array.new
    worksheet.each { |row|
      if row != nil   # empty row?
        cells=Array.new
        j=0
        row.each { |cell|
          cells << cell.to_s('latin1')  unless cell == nil
          j=j+1
        }
        sheet << cells
      end
    }
    return sheet
end

workbook = Spreadsheet::ParseExcel.parse("MyExcelFile.xls")
sheet1 = excelGetSheet(workbook.worksheet(0))

puts sheet1.inspect

Answered by Eli Bendersky

Options exist for all three languages. The question is which one you are most familiar with; that is the language you should use, for sure. And if you're not familiar with any of them, this task is not really a good basis for choosing between the languages.

Opinionated P.S: if you don't know any of the languages, just learn Python and use xlrd.

Answered by dreeves

I may have found an acceptable answer already:

xls2csv

But I'm interested to hear what other options there are, or about tools in other languages.

Answered by DrAl

For Python, there are a number of options; see here, here and here. Note that the last option will only work on Windows with Excel installed.

Answered by chfw

With the pyexcel library, you can do this:

>>> import pyexcel as p
>>> data_sheet=p.get_sheet(file_name='/Users/jaska/Downloads/ie_data.xls', sheet_name='Data')
>>> data_sheet.top_left()
pyexcel sheet:
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
|                                                                                                         |   |   |   |            |   |   |   |   |   |            |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
| Stock Market Data Used in "Irrational Exuberance" Princeton University Press, 2000, 2005, 2015, updated |   |   |   |            |   |   |   |   |   | Cyclically |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
| Robert J. Shiller                                                                                       |   |   |   |            |   |   |   |   |   | Adjusted   |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
|                                                                                                         |   |   |   |            |   |   |   |   |   | Price      |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
|                                                                                                         |   |   |   |   Consumer |   |   |   |   |   | Earnings   |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
>>> data_sheet.save_as('ie_data.csv')

And for it to work, you need to install:

$ pip install pyexcel
$ pip install pyexcel-xls

What's more, you can also install pyexcel-cli and get your CSV data with a single command:

$ pyexcel transcode --sheet-name 'Data' /your/home/Downloads/ie_data.xls ie_data.csv