How do I convert a csv to a binary file with a bash command?
Original question: http://stackoverflow.com/questions/37613688/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow (not to this page).
Asked by JVE999
I have a csv file which is just a simple comma-separated list of numbers. I want to convert this csv file into a binary file (just a sequence of bytes, with each interpreted number being a number from the csv file).
The reason I am doing this is to be able to import audio data from a spreadsheet of values. When importing (I am using Audacity), I have a few formats to choose from for the binary file:
Encoding:
Signed 8, 24, 16, or 32 bit PCM
Unsigned 8 bit PCM
32 bit or 64 bit float
U-Law
A-Law
GSM 6.10
12, 16, or 24 bit DWVW
VOX ADPCM
Byte Order:
No endianness
Big endian
Little endian
I was leaning toward big endian 32-bit float to keep things simple. For the same reason, I was thinking bash would be the best tool.
Answer by Dummy00001
"I have a csv file which is just a simple comma-separated list of numbers. I want to convert this csv file into a binary file [...] I was moving along the lines of big endian 32-bit float to keep things simple."
I'm not sure how to do it in pure bash (I actually doubt that it is doable, since turning a float into its binary representation is not a standard shell conversion).
But here it is with a simple Perl one-liner:
$ cat example1.csv
1.0
2.1
3.2
4.3
$ cat example1.csv | perl -ne 'print pack("f>*", split(/\s*,\s*/))' > example1.bin
$ hexdump -C < example1.bin
00000000  3f 80 00 00 40 06 66 66  40 4c cc cd 40 89 99 9a  |?...@.ff@L..@...|
00000010
It uses Perl's pack function with f to convert floats to binary, and > to make them big endian (BE). (I have also added the split in case there are multiple numbers per CSV line.)
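For comparison, the same conversion can be sketched in Python with the standard struct module, where the format ">f" means a big-endian 32-bit float. This is only a rough equivalent of the one-liner, not part of the original answer; the function name csv_to_float32_be is made up for illustration, and it assumes every non-empty field parses as a number.

```python
import struct


def csv_to_float32_be(text):
    """Pack every number in CSV text as a big-endian 32-bit float."""
    # Treat newlines like commas so one-number-per-line files also work,
    # and skip empty fields such as the one after a trailing newline.
    fields = [f for f in text.replace("\n", ",").split(",") if f.strip()]
    # ">f" = big-endian, 4-byte IEEE 754 float.
    return b"".join(struct.pack(">f", float(f)) for f in fields)


if __name__ == "__main__":
    data = csv_to_float32_be("1.0\n2.1\n3.2\n4.3\n")
    print(data.hex())  # 3f80000040066666404ccccd4089999a
```

The printed bytes match the hexdump output shown above for example1.csv.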
P.S. The command to convert integers to 16-bit shorts with native endianness:
perl -ne 'print pack("s*", split(/\s*,\s*/))'
Use "s>*" for BE, or "s<*" for LE, instead of "s*".
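If Perl is not at hand, the three byte-order choices map fairly directly onto Python's struct prefixes: "=" for native, ">" for big endian, "<" for little endian. The sketch below is an illustration under that assumption; BYTE_ORDER and csv_to_int16 are hypothetical names. Unlike Perl's pack, struct.pack raises an error for values outside the signed 16-bit range.

```python
import struct

# Hypothetical lookup table: byte-order name -> struct prefix.
BYTE_ORDER = {"native": "=", "big": ">", "little": "<"}


def csv_to_int16(text, order="native"):
    """Pack every integer in CSV text as a signed 16-bit short."""
    fields = [f for f in text.replace("\n", ",").split(",") if f.strip()]
    fmt = BYTE_ORDER[order] + "h"  # "h" = signed 16-bit integer
    return b"".join(struct.pack(fmt, int(f)) for f in fields)


if __name__ == "__main__":
    print(csv_to_int16("1,2,3", "big").hex())     # 000100020003
    print(csv_to_int16("1,2,3", "little").hex())  # 010002000300
```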
P.P.S. If it is audio data, you can also check the sox tool. I haven't used it in ages, but IIRC it can convert anything PCM-like from practically any format to any other format, while also applying effects.
Answer by Brian Cain
I would recommend Python over bash. For this particular task, it's simpler/saner IMO.
#!/usr/bin/env python
import array

# Read the whole file as one comma-separated list of integers.
with open('input.csv', 'rt') as f:
    text = f.read()
entries = text.split(',')
values = [int(x) for x in entries]

# Scale here if needed: if your input goes from [-100, 100] then
# you may need to translate/scale into [-32768, 32767] for
# signed 16-bit PCM
# e.g.:
# values = [int(val * scale) for val in values]

with open('output.pcm', 'wb') as out:
    pcm_vals = array.array('h', values)  # 16-bit signed, native byte order
    pcm_vals.tofile(out)
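The scaling step mentioned in the comments could look roughly like this. It is a minimal sketch, assuming the input is symmetric around zero with a known peak (100 in the example from the comment); scale_to_int16 and peak are illustrative names, not part of the original script.

```python
def scale_to_int16(values, peak):
    """Map values in [-peak, peak] onto signed 16-bit sample values."""
    scale = 32767 / peak
    scaled = (round(v * scale) for v in values)
    # Clamp so that inputs beyond the assumed peak cannot overflow 'h'.
    return [max(-32768, min(32767, s)) for s in scaled]


if __name__ == "__main__":
    print(scale_to_int16([-100, 0, 100], peak=100))  # [-32767, 0, 32767]
```

The result can be passed to array.array('h', ...) in place of the unscaled values.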
You could also use Python's wave module instead of just writing raw PCM.
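A minimal sketch of the wave route, assuming mono 16-bit samples. The 44100 Hz sample rate is a placeholder, since the question does not state one, and int16_to_wav_bytes is a made-up name. WAV stores 16-bit samples little endian, so the samples are packed with "<h" explicitly.

```python
import io
import struct
import wave


def int16_to_wav_bytes(values, sample_rate=44100):
    """Return a complete mono 16-bit WAV file as bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono (assumption)
        wav.setsampwidth(2)            # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)  # placeholder rate (assumption)
        # WAV data is little endian regardless of the host machine.
        wav.writeframes(struct.pack("<%dh" % len(values), *values))
    return buf.getvalue()


if __name__ == "__main__":
    with open("output.wav", "wb") as out:
        out.write(int16_to_wav_bytes([1, 2, 3, 4, 5, 6, 7]))
```

A file written this way carries its own header, so it should open in Audacity as a normal audio file without choosing an encoding or byte order by hand.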
Here's how the example above works:
$ echo 1,2,3,4,5,6,7 > input.csv
$ ./so_pcm.py
$ xxd output.pcm
0000000: 0100 0200 0300 0400 0500 0600 0700 ..............
xxd shows the binary values. The script used my machine's native endianness (little).
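Since array always uses the native byte order, the output depends on the machine that runs the script. One way to pin the byte order down, sketched here with a hypothetical helper int16_bytes, is to compare the requested order with sys.byteorder and swap when they differ. It assumes the 'h' type code is 2 bytes wide, which holds on common platforms.

```python
import array
import sys


def int16_bytes(values, byteorder="little"):
    """Pack integers as signed 16-bit samples in the requested byte order."""
    samples = array.array("h", values)  # native byte order
    if sys.byteorder != byteorder:
        samples.byteswap()              # flip each 2-byte item in place
    return samples.tobytes()


if __name__ == "__main__":
    print(int16_bytes([1, 2, 3], "big").hex())     # 000100020003
    print(int16_bytes([1, 2, 3], "little").hex())  # 010002000300
```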