How to dump part of a binary file (Linux)
Note: this page reproduces a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Source: http://stackoverflow.com/questions/9451890/
Asked by theta
I have a binary file and want to extract part of it, starting from a known byte string (e.g. FF D8 FF D0) and ending with a known byte string (AF FF D9).
In the past I've used dd to cut part of a binary file from the beginning/end, but this command doesn't seem to support what I'm asking.
What terminal tool can do this?
Accepted answer by jfg956
In a single pipe:
    xxd -c1 -p file |
      awk -v b="ffd8ffd0" -v e="aaffd9" '
        # Inside the wanted range: print each byte and track the last bytes seen.
        found == 1 {
          print $0
          str = str $0
          if (str == e) {found = 0; exit}
          if (length(str) == length(e)) str = substr(str, 3)}
        # Before the range: slide a window over the input until it equals b.
        found == 0 {
          str = str $0
          if (str == b) {found = 1; print str; str = ""}
          if (length(str) == length(b)) str = substr(str, 3)}
        # Non-zero exit status if the end pattern was never reached.
        END{ exit found }' |
      xxd -r -p > new_file
    test ${PIPESTATUS[1]} -eq 0 || rm new_file
The idea is to use awk between two xxd calls to select the part of the file that is needed. Once the 1st pattern is found, awk prints the bytes until the 2nd pattern is found, then exits.
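To watch the selection logic in isolation, the awk stage can be fed a few hex bytes directly, one per line, as xxd -c1 -p would produce them. This is a minimal sketch, assuming a made-up 12-byte sample; the variable name selected is hypothetical and only holds the result.

```shell
# Sample input: 2 leading bytes, start pattern, payload (41 42),
# end pattern, 1 trailing byte. One hex byte per line, like `xxd -c1 -p`.
selected=$(printf '%s\n' 00 11 ff d8 ff d0 41 42 aa ff d9 22 |
  awk -v b="ffd8ffd0" -v e="aaffd9" '
    found == 1 {
      print $0
      str = str $0
      if (str == e) {found = 0; exit}
      if (length(str) == length(e)) str = substr(str, 3)}
    found == 0 {
      str = str $0
      if (str == b) {found = 1; print str; str = ""}
      if (length(str) == length(b)) str = substr(str, 3)}
    END{ exit found }')
echo "awk exit status: $?"
echo "$selected"
```

The first printed line carries the four start bytes together and every following line is one byte, up to and including the end pattern. The leading 00 11 and the trailing 22 are dropped. Piping the same output through xxd -r -p turns it back into binary.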
The case where the 1st pattern is found but the 2nd is not must be taken into account. This is done in the END part of the awk script, which returns a non-zero exit status. That status is caught through bash's ${PIPESTATUS[1]}, and in that case I decided to delete the new file.
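The exit-status check can be tried on its own. Below is a minimal sketch, assuming bash and a hypothetical output file demo_out. The middle stage is a stand-in awk command that always exits with status 1, as the real script does when the end pattern is missing.

```shell
#!/usr/bin/env bash
# Three-stage pipeline; the middle stage (index 1) fails on purpose.
printf 'ff\nd8\n' | awk '{ print } END { exit 1 }' | cat > demo_out
# Copy PIPESTATUS immediately: any later command overwrites it.
status=("${PIPESTATUS[@]}")
echo "stage statuses: ${status[*]}"
# Same cleanup rule as in the answer: drop the output if awk failed.
test "${status[1]}" -eq 0 || rm -f demo_out
```

Without the pipefail option, $? only reflects the last stage (cat here), which is why the awk failure has to be read from PIPESTATUS.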
Note that an empty file also means that nothing has been found.
Answer by kev
Locate the start/end position, then extract the range.
    $ xxd -g0 input.bin | grep -im1 FFD8FFD0 | awk -F: '{print $1}'
    0000cb0
    $ ^FFD8FFD0^AFFFD9^
    0009590
    $ dd ibs=1 count=$((0x9590-0xcb0+1)) skip=$((0xcb0)) if=input.bin of=output.bin
Answer by Laurent Grégtheitroade
This should work with standard tools (xxd, tr, grep, awk, dd). It correctly handles the "pattern split across lines" issue, and it only matches the pattern when it is aligned on a byte offset (not a nibble).
    file=<yourfile>
    outfile=<youroutputfile>
    startpattern="ff d8 ff d0"
    endpattern="af ff d9"
    # One "xx " group per byte, so byte N sits at character offset 3*N.
    xxd -g0 -c1 -ps ${file} | tr '\n' ' ' > ${file}.hex
    # grep -b reports 0-based offsets; divide by 3 to get the byte position.
    start=$(($(grep -bo "${startpattern}" ${file}.hex\
        | head -1 | awk -F: '{print $1}')/3))
    # len stops just before the end pattern; add its byte count to include it.
    len=$(($(grep -bo "${endpattern}" ${file}.hex\
        | head -1 | awk -F: '{print $1}')/3-${start}))
    dd ibs=1 count=${len} skip=${start} if=${file} of=${outfile}
Note: The script above uses a temporary file to avoid doing the binary-to-hex conversion twice. A space/time trade-off is to pipe the result of xxd directly into the two grep commands. A one-liner is also possible, at the expense of clarity.
One could also use tee and a named pipe to avoid having to store a temporary file and converting the output twice, but I'm not sure it would be faster (xxd is fast), and it is certainly more complex to write.
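The offset arithmetic can be checked on a tiny file. The sketch below is an illustration, not the author's script. It assumes bash, swaps xxd for od (so it also runs where xxd is not installed), uses the made-up file names sample.bin and part.bin, searches for the end pattern only after the start position, and includes the end pattern in the output.

```shell
#!/usr/bin/env bash
# Sample: 2 junk bytes, start pattern, payload "AB", end pattern, 1 junk byte
# (octal escapes: \377=ff \330=d8 \320=d0 \257=af \331=d9).
printf '\000\021\377\330\377\320AB\257\377\331\042' > sample.bin

startpattern="ff d8 ff d0"
endpattern="af ff d9"

# Hex dump as "xx xx xx ...": byte N starts at character offset 3*N.
hex=$(od -An -v -tx1 sample.bin | tr -s ' \n' ' ' | sed 's/^ //')

# grep -b prints 0-based offsets; keep the first match only.
start_off=$(printf '%s' "$hex" | grep -bo "$startpattern" | head -1 | cut -d: -f1)
[ -n "$start_off" ] || { echo "start pattern not found" >&2; exit 1; }
start=$(( start_off / 3 ))

# Look for the end pattern from the start position onwards.
rest=${hex:$(( start * 3 ))}
end_off=$(printf '%s' "$rest" | grep -bo "$endpattern" | head -1 | cut -d: -f1)
[ -n "$end_off" ] || { echo "end pattern not found" >&2; exit 1; }

# Length in bytes, including the end pattern itself.
end_bytes=$(( (${#endpattern} + 1) / 3 ))
len=$(( end_off / 3 + end_bytes ))

dd if=sample.bin of=part.bin bs=1 skip="$start" count="$len" 2>/dev/null
od -An -v -tx1 part.bin
```

Because the patterns contain the separating spaces, a match can only begin on a byte boundary of the dump, which is the alignment property the answer relies on.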
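Answer by jfg956
A variation on the awk solution that assumes that your binary file, once converted to hex with spaces, fits in memory:
    xxd -c1 -p file |
      tr "\n" " " |
      sed -n -e 's/.*\(ff d8 ff d0.*aa ff d9\).*/\1/p' |
      xxd -r -p > new_file
Answer by jfg956
Another solution in sed, but using less memory:
    xxd -c1 -p file |
      sed -n -e '1{N;N;N}' -e '/ff\nd8\nff\nd0/{:begin;p;s/.*//;n;bbegin}' -e 'N;D' |
      sed -n -e '1{N;N}' -e '/aa\nff\nd9/{p;Q1}' -e 'P;N;D' |
      xxd -r -p > new_file
    test ${PIPESTATUS[2]} -eq 1 || rm new_file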
The 1st sed prints from ff d8 ff d0 till the end of the file. Note that you need as many N in -e '1{N;N;N}' as there are bytes in your 1st pattern, less one.
The 2nd sed prints from the beginning of the file to aa ff d9. Note again that you need as many N in -e '1{N;N}' as there are bytes in your 2nd pattern, less one.
Again, a test is needed to check whether the 2nd pattern is found, and to delete the file if it is not.
Note that the Q command is a GNU extension to sed. If you do not have it, you need to discard the rest of the file once the pattern is found (in a loop like the one in the 1st sed, but without printing), and check after the hex-to-binary conversion that new_file ends with the right pattern.
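Where Q is not available, the final check on new_file can be done by comparing its last bytes with the end pattern. A minimal sketch, assuming a made-up sample file and using od in place of xxd for the hex dump:

```shell
# Sample output file that does end with aa ff d9
# (octal escapes: \377=ff \330=d8 \320=d0 \252=aa \331=d9).
printf '\377\330\377\320AB\252\377\331' > new_file

endpattern="aaffd9"                       # hex digits, no spaces
nbytes=$(( ${#endpattern} / 2 ))          # two hex digits per byte

# Hex of the last nbytes bytes, with all whitespace removed.
tail_hex=$(tail -c "$nbytes" new_file | od -An -v -tx1 | tr -d ' \n')

if [ "$tail_hex" = "$endpattern" ]; then
  echo "new_file ends with the expected pattern"
else
  echo "end pattern missing, removing new_file" >&2
  rm -f new_file
fi
```

This plays the same role as the PIPESTATUS test: the file is kept only when the extraction actually reached the end pattern.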