Convert xlsx to csv in Linux with command line
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/10557360/
Asked by user1390150
I'm looking for a way to convert xlsx files to csv files on Linux.
I do not want to use PHP/Perl or anything like that since I'm looking at processing several million lines, so I need something quick. I found a program in the Ubuntu repos called xls2csv, but it only converts xls (Office 2003) files (which I'm currently using), and I need support for the newer Excel files.
Any ideas?
Answer by Pavel Veller
If you are OK with running Java from the command line, you can do it with Apache POI HSSF's Excel Extractor. It has a main method that is described as the command line extractor. This one seems to just dump everything out. They point to this example that converts to CSV. You would have to compile it before you can run it, but it too has a main method, so you should not have to do much coding per se to make it work.
Another option that might fly, but will require some work on the other end, is to have your Excel files come to you as Excel XML Data or XML Spreadsheet, or whatever MS calls that format these days. It will open a whole new world of opportunities for you to slice and dice it the way you want.
Answer by jmcnamara
The Gnumeric spreadsheet application comes with a command line utility called ssconvert that can convert between a variety of spreadsheet formats:
$ ssconvert Book1.xlsx newfile.csv
Using exporter Gnumeric_stf:stf_csv
$ cat newfile.csv
Foo,Bar,Baz
1,2,3
123.6,7.89,
2012/05/14,,
The,last,Line
To install on Ubuntu:
apt-get install gnumeric
To install on Mac:
brew install gnumeric
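For workbooks with more than one sheet, a small wrapper along these lines might help. It is only a sketch: it assumes your ssconvert build supports the -S (--export-file-per-sheet) option and the %n placeholder in the output name, which is worth checking against man ssconvert first. The function name xlsx_sheets_to_csv and the output naming scheme are made up for illustration.

```shell
# Sketch: export every sheet of a workbook to its own CSV file with ssconvert.
# Assumption: ssconvert is on PATH and accepts -S with a %n output template.
xlsx_sheets_to_csv() {
    input=$1          # path to the .xlsx workbook
    outdir=${2:-.}    # target directory (defaults to the current directory)
    base=$(basename "$input" .xlsx)
    mkdir -p "$outdir" || return 1
    # ssconvert substitutes the sheet number for %n in the output file name
    ssconvert -S "$input" "$outdir/${base}_%n.csv"
}
```

Calling xlsx_sheets_to_csv Book1.xlsx out should then leave one Book1_<n>.csv file per sheet in the out directory.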
Answer by spiffytech
You can do this with LibreOffice:
libreoffice --headless --convert-to csv "$filename" --outdir "$outdir"
For reasons not clear to me, you might need to run this with sudo. You can make LibreOffice work with sudo without requiring a password by adding this line to your sudoers file:
users ALL=(ALL) NOPASSWD: libreoffice
Answer by neves
In bash, I used this libreoffice command to convert all the xlsx files in the current directory:
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i" ; done
It takes care of spaces in the filename.
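If you want the same loop as a reusable function, something like the following could work. It is a sketch rather than a tested recipe: the function name convert_all_xlsx and its two arguments are hypothetical, and it only uses the --headless, --convert-to and --outdir options already shown in this thread.

```shell
# Sketch: convert every .xlsx file in a directory to CSV with LibreOffice.
# Assumption: the libreoffice launcher is on PATH.
convert_all_xlsx() {
    srcdir=${1:-.}          # directory holding the .xlsx files
    outdir=${2:-$srcdir}    # where the .csv files should go
    mkdir -p "$outdir" || return 1
    for f in "$srcdir"/*.xlsx; do
        # When nothing matches, the pattern is left as-is; skip it
        [ -e "$f" ] || continue
        # Quoting "$f" keeps file names with spaces intact
        libreoffice --headless --convert-to csv --outdir "$outdir" "$f" || return 1
    done
}
```

For example, convert_all_xlsx . csv converts the files in the current directory and writes the results into ./csv. The function stops at the first file that fails to convert.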
I tried again some years later, and it didn't work. This thread gives some tips, but the quickest solution was to run it as root (or with sudo libreoffice). Not elegant, but quick.
On Windows, use the scalc.exe command.
Answer by andrewtweber
If you already have a desktop environment then I'm sure Gnumeric / LibreOffice would work well, but on a headless server (such as Amazon Web Services), they require dozens of dependencies that you also need to install.
I found this Python alternative:
https://github.com/dilshod/xlsx2csv
$ pip install xlsx2csv
$ xlsx2csv file.xlsx > newfile.csv
Took 2 seconds to install and works like a charm.
If you have multiple sheets you can export all at once, or one at a time:
$ xlsx2csv file.xlsx --all > all.csv
$ xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
$ xlsx2csv file.xlsx -s 1 > sheet1.csv
The xlsx2csv author also links to several alternatives built in Bash, Python, Ruby, and Java.
Answer by Holger Brandl
Another option would be to use R via a small bash wrapper for convenience:
xlsx2txt(){
    # Feed a short R script to Rscript on stdin; "$1" is the xlsx file to read
    echo '
    library(xlsx)
    write.table(read.xlsx2(commandArgs(TRUE)[1], 1), stdout(), quote=FALSE, row.names=FALSE, col.names=TRUE, sep="\t")
    ' | Rscript --vanilla - "$1" 2>/dev/null
}
xlsx2txt file.xlsx > file.txt
Answer by Akavall
If the .xlsx file has many sheets, the -s flag can be used to get the sheet you want. For example:
xlsx2csv "my_file.xlsx" -s 2 second_sheet.csv
second_sheet.csv would contain the data of the 2nd sheet in my_file.xlsx.
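To pull several sheets out in one go, the -s call above can be wrapped in a loop. This is a rough sketch: the function name export_sheets and the _sheetN.csv naming are invented for illustration, and the caller has to know how many sheets the workbook has.

```shell
# Sketch: export sheets 1..count of a workbook, one CSV per sheet,
# by calling xlsx2csv with -s once for each sheet number.
# Assumption: xlsx2csv is on PATH and takes the output file as last argument.
export_sheets() {
    input=$1     # path to the .xlsx workbook
    count=$2     # number of sheets to export
    base=$(basename "$input" .xlsx)
    i=1
    while [ "$i" -le "$count" ]; do
        xlsx2csv "$input" -s "$i" "${base}_sheet${i}.csv" || return 1
        i=$((i + 1))
    done
}
```

For example, export_sheets my_file.xlsx 3 would write my_file_sheet1.csv through my_file_sheet3.csv into the current directory.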
Answer by Pascal-Louis Perez
Answer by Benoit Duffez
As others said, libreoffice can convert xls files to csv. The problem for me was the sheet selection.
This libreoffice Python script does a fine job at converting a single sheet to CSV.
Usage is:
./libreconverter.py File.xls:"Sheet Name" output.csv
The only downside (on my end) is that --headless doesn't seem to work. I have a LO window that shows up for a second and then quits.
That's OK with me; it's the only tool that does the job rapidly.