Convert xlsx to csv in Linux with command line

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/10557360/

Time: 2020-08-06 06:16:27  Source: igfitidea

Tags: linux, excel, csv, converter, xlsx

Asked by user1390150

I'm looking for a way to convert xlsx files to csv files on Linux.

I do not want to use PHP/Perl or anything like that, since I'm looking at processing several million lines and need something quick. I found a program in the Ubuntu repos called xls2csv, but it only converts xls (Office 2003) files, which is what I'm currently using; I need support for the newer Excel files.

Any ideas?

Answered by Pavel Veller

If you are OK with running Java from the command line, you can do it with Apache POI HSSF's Excel Extractor. It has a main method that is described as a command-line extractor. This one seems to just dump everything out. They point to this example that converts to CSV. You would have to compile it before you can run it, but it too has a main method, so you should not have to do much coding per se to make it work.

Another option that might fly, but will require some work on the other end, is to have your Excel files come to you as Excel XML Data or XML Spreadsheet, or whatever MS calls that format these days. It will open a whole new world of opportunities for you to slice and dice it the way you want.

Answered by jmcnamara

The Gnumeric spreadsheet application comes with a command-line utility called ssconvert that can convert between a variety of spreadsheet formats:

$ ssconvert Book1.xlsx newfile.csv
Using exporter Gnumeric_stf:stf_csv

$ cat newfile.csv 
Foo,Bar,Baz
1,2,3
123.6,7.89,
2012/05/14,,
The,last,Line

To install on Ubuntu:

apt-get install gnumeric

To install on Mac:

brew install gnumeric
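
For workbooks with several sheets, ssconvert also has a -S (--export-file-per-sheet) option, where %n in the output name is replaced by the sheet number. The following is only a hedged sketch: the wrapper name ssconvert_sheets is hypothetical, and option behaviour may differ between Gnumeric versions.

```shell
# Hypothetical wrapper: export every sheet of a workbook to its own CSV.
# Assumes `ssconvert` (from Gnumeric) is on PATH.
ssconvert_sheets() {
    workbook=$1
    base=${workbook%.xlsx}
    # -S asks for one output file per sheet; %n is a template placeholder
    # that ssconvert replaces with the sheet number.
    ssconvert -S "$workbook" "${base}_%n.csv"
}
```

Usage would look like: ssconvert_sheets Book1.xlsx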

Answered by spiffytech

You can do this with LibreOffice:

libreoffice --headless --convert-to csv "$filename" --outdir "$outdir"

For reasons not clear to me, you might need to run this with sudo. You can make LibreOffice work with sudo without requiring a password by adding this line to your sudoers file:

users ALL=(ALL) NOPASSWD: libreoffice

Answered by neves

In bash, I used this libreoffice command to convert all my xlsx files in the current directory:

for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done

It takes care of spaces in the filename.
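
The loop above can be wrapped in a small function that also sends the results to a separate directory, reusing the --outdir flag from the earlier answer. This is a minimal sketch rather than a tested recipe: the function name convert_dir and its two arguments are made up for illustration, and it assumes a libreoffice binary on PATH (some installations call it soffice).

```shell
# Hypothetical helper: convert every .xlsx in src_dir to CSV in out_dir.
convert_dir() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.xlsx; do
        # When nothing matches, the glob stays literal; skip that case.
        [ -e "$f" ] || continue
        # Quoting "$f" keeps file names with spaces intact.
        libreoffice --headless --convert-to csv "$f" --outdir "$out_dir"
    done
}
```

Usage would look like: convert_dir . ./csv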

Tried again some years later, and it didn't work. This thread gives some tips, but the quickest solution was to run as root (or run sudo libreoffice). Not elegant, but quick.

Use the command scalc.exe in Windows

Answered by andrewtweber

If you already have a Desktop environment then I'm sure Gnumeric / LibreOffice would work well, but on a headless server (such as Amazon Web Services), they require dozens of dependencies that you also need to install.

I found this Python alternative:

https://github.com/dilshod/xlsx2csv

$ easy_install xlsx2csv
$ xlsx2csv file.xlsx > newfile.csv

Took 2 seconds to install and works like a charm.

If you have multiple sheets you can export all at once, or one at a time:

$ xlsx2csv file.xlsx --all > all.csv
$ xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
$ xlsx2csv file.xlsx -s 1 > sheet1.csv

He also links to several alternatives built in Bash, Python, Ruby, and Java.
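
To run xlsx2csv over many workbooks, the redirect form shown above can be put in a loop. A small sketch, assuming xlsx2csv is installed and on PATH; the function name xlsx_to_csv_each is hypothetical and not part of the tool.

```shell
# Hypothetical helper: write a .csv next to each .xlsx passed as an argument.
xlsx_to_csv_each() {
    for f in "$@"; do
        # Swap the .xlsx suffix for .csv to build the output name.
        out="${f%.xlsx}.csv"
        # Report failures on stderr and keep going with the next file.
        xlsx2csv "$f" > "$out" || echo "failed: $f" >&2
    done
}
```

Usage would look like: xlsx_to_csv_each *.xlsx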

Answered by Holger Brandl

Another option would be to use R via a small bash wrapper for convenience:

xlsx2txt(){
# Print the first sheet of the xlsx file given as $1 as tab-separated text
echo '
require(xlsx)
write.table(read.xlsx2(commandArgs(TRUE)[1], 1), stdout(), quote=FALSE, row.names=FALSE, col.names=TRUE, sep="\t")
' | Rscript --vanilla - "$1" 2>/dev/null
}

xlsx2txt file.xlsx > file.txt

Answered by Holger Brandl

Use csvkit:

in2csv data.xlsx > data.csv

For details check their excellent docs
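
For multi-sheet workbooks, csvkit's in2csv has a --names option that lists sheet names and a --sheet option that selects one. The sketch below combines them to export every sheet. The function name export_sheets and the output naming scheme are assumptions, and sheet names containing a slash would need extra handling.

```shell
# Hypothetical helper: export each sheet of a workbook to its own CSV.
# Assumes `in2csv` (from csvkit) is on PATH.
export_sheets() {
    workbook=$1
    base=${workbook%.xlsx}
    # --names prints one sheet name per line.
    in2csv --names "$workbook" | while IFS= read -r sheet; do
        # stdin is redirected so the inner call cannot consume the name list.
        in2csv --sheet "$sheet" "$workbook" < /dev/null > "${base}_${sheet}.csv"
    done
}
```

Usage would look like: export_sheets data.xlsx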

Answered by Akavall

If the .xlsx file has many sheets, the -s flag can be used to get the sheet you want. For example:

xlsx2csv "my_file.xlsx" -s 2 second_sheet.csv

second_sheet.csv would contain the data of the 2nd sheet in my_file.xlsx.

Answered by Pascal-Louis Perez

Using the Gnumeric spreadsheet application, which comes with a command-line utility called ssconvert, is indeed super simple:

find . -name '*.xlsx' -exec ssconvert -T Gnumeric_stf:stf_csv {} \;

and you're done!

Answered by Benoit Duffez

As others said, libreoffice can convert xls files to csv. The problem for me was the sheet selection.

This libreoffice Python script does a fine job of converting a single sheet to CSV.

Usage is:

./libreconverter.py File.xls:"Sheet Name" output.csv

The only downside (on my end) is that --headless doesn't seem to work. I have a LO window that shows up for a second and then quits.
That's OK with me; it's the only tool that does the job rapidly.