How to dump part of a binary file on Linux

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/9451890/

Time: 2020-08-06 04:49:43  Source: igfitidea

How to dump part of a binary file

Tags: linux, bash, terminal, dump

Asked by theta

I have a binary file and want to extract part of it, starting from a known byte string (e.g. FF D8 FF D0) and ending with a known byte string (AF FF D9).

In the past I have used dd to cut part of a binary file from the beginning or the end, but this command doesn't seem to support what I am asking for.

What terminal tool can do this?

Accepted answer by jfg956

In a single pipe:

xxd -c1 -p file |
  awk -v b="ffd8ffd0" -v e="aaffd9" '
    # After the start pattern: print each byte until the end pattern is seen.
    found == 1 {
      print $0
      str = str $0
      if (str == e) {found = 0; exit}
      if (length(str) == length(e)) str = substr(str, 3)}
    # Before the start pattern: slide a window of length(b) hex digits.
    found == 0 {
      str = str $0
      if (str == b) {found = 1; print str; str = ""}
      if (length(str) == length(b)) str = substr(str, 3)}
    END{ exit found }' |
  xxd -r -p > new_file
test ${PIPESTATUS[1]} -eq 0 || rm new_file

The idea is to use awk between two xxd invocations to select the part of the file that is needed. Once the 1st pattern is found, awk prints the bytes until the 2nd pattern is found, then exits.

The case where the 1st pattern is found but the 2nd is not must be taken into account. It is handled in the END part of the awk script, which returns a non-zero exit status. This is caught through bash's ${PIPESTATUS[1]}, in which case I decided to delete the new file.
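As a rough illustration of how bash's PIPESTATUS array exposes the exit status of each pipeline stage, here is a minimal sketch. The three stages are stand-ins invented for this example (they are not the xxd/awk commands of the answer), and the variable name `status` is likewise an assumption.

```shell
#!/usr/bin/env bash
# Stand-in pipeline (not the xxd/awk one): the middle stage exits with status 3.
printf 'a\nb\n' | (cat >/dev/null; exit 3) | cat

# Copy PIPESTATUS immediately; running any further command overwrites it.
status=("${PIPESTATUS[@]}")
echo "exit statuses per stage: ${status[*]}"

# Index 1 is the second stage, which is where awk sits in the answer's pipeline.
if [ "${status[1]}" -ne 0 ]; then
    echo "middle stage failed with status ${status[1]}"
fi
```

Copying the array into a variable first is a defensive habit: even a `test` or `echo` resets PIPESTATUS, so reading it more than once directly would not work.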

Note that an empty file also means that nothing has been found.

Answer by kev

Locate the start/end position, then extract the range.

$ xxd -g0 input.bin | grep -im1 FFD8FFD0  | awk -F: '{print $1}'
0000cb0
$ ^FFD8FFD0^AFFFD9^
0009590
$ dd ibs=1 count=$((0x9590-0xcb0+1)) skip=$((0xcb0)) if=input.bin of=output.bin

Answer by Laurent Grégoire

See this link for a way to do a binary grep. Once you have the start and end offsets, you should be able to get what you need with dd.
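The linked method is not reproduced on this page, so the following is only a sketch of one common way to do a binary grep and then cut the range with dd. It assumes GNU grep built with -P (PCRE) support; the sample file, its contents, and the names `workdir`, `start` and `end` are hypothetical. It takes the first occurrence of each pattern and assumes the end pattern comes after the start pattern.

```shell
#!/usr/bin/env bash
# Hypothetical sample: 2 junk bytes, start pattern, "DATA", end pattern, 2 junk bytes.
workdir=$(mktemp -d)
printf '\001\002\377\330\377\320DATA\257\377\331\003\004' > "$workdir/input.bin"

# LC_ALL=C makes grep work on raw bytes. -o -b print "<0-based offset>:<match>",
# -a searches binary input as if it were text, -P enables \xHH escapes (GNU grep).
start=$(LC_ALL=C grep -obaP '\xff\xd8\xff\xd0' "$workdir/input.bin" | head -n 1 | cut -d: -f1)
end=$(LC_ALL=C grep -obaP '\xaf\xff\xd9' "$workdir/input.bin" | head -n 1 | cut -d: -f1)

# The end pattern is 3 bytes long, so the range covers end - start + 3 bytes.
dd if="$workdir/input.bin" of="$workdir/output.bin" bs=1 \
   skip="$start" count=$((end - start + 3)) 2>/dev/null

# Show the extracted bytes in hex.
od -An -v -tx1 "$workdir/output.bin"
```

Using bs=1 keeps the arithmetic simple at the cost of speed; for large files a bigger block size with adjusted skip and count values would probably be preferable.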

Answer by Laurent Grégoire

This should work with standard tools (xxd, tr, grep, awk, dd). It correctly handles the "pattern split across lines" issue, and it only matches the pattern when it is aligned on a byte offset (not a nibble).

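file=<yourfile>
outfile=<youroutputfile>
startpattern="ff d8 ff d0"
endpattern="af ff d9"
# One byte per line, joined with spaces: each byte takes 3 characters in the .hex file.
xxd -g0 -c1 -ps ${file} | tr '\n' ' ' > ${file}.hex
# grep -bo prints "offset:match" with a 0-based offset; dividing by 3 gives the byte offset.
start=$(($(grep -bo "${startpattern}" ${file}.hex\
    | head -1 | awk -F: '{print $1}')/3))
# Length up to and including the 3-byte end pattern.
len=$(($(grep -bo "${endpattern}" ${file}.hex\
    | head -1 | awk -F: '{print $1}')/3-${start}+3))
dd ibs=1 count=${len} skip=${start} if=${file} of=${outfile}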

Note: The script above uses a temporary file to avoid doing the binary-to-hex conversion twice. A space/time trade-off is to pipe the result of xxd directly into the two grep commands. A one-liner is also possible, at the expense of clarity.

One could also use tee and a named pipe to avoid both storing a temporary file and converting the output twice, but I'm not sure it would be faster (xxd is fast), and it is certainly more complex to write.

Answer by jfg956

A variation on the awk solution that assumes that your binary file, once converted to hex with spaces, fits in memory:

xxd -c1 -p file |
  tr "\n" " " |
  sed -n -e 's/.*\(ff d8 ff d0.*aa ff d9\).*/\1/p' |
  xxd -r -p > new_file

Answer by jfg956

Another solution in sed, but using less memory:

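xxd -c1 -p file |
  sed -n -e '1{N;N;N}' -e '/ff\nd8\nff\nd0/{:begin;p;s/.*//;n;bbegin}' -e 'N;D' |
  sed -n -e '1{N;N}' -e '/aa\nff\nd9/{p;Q1}' -e 'P;N;D' |
  xxd -r -p > new_file
test ${PIPESTATUS[2]} -eq 1 || rm new_file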

The 1st sed prints from ff d8 ff d0 to the end of the file. Note that you need as many N commands in -e '1{N;N;N}' as there are bytes in your 1st pattern, minus one.

The 2nd sed prints from the beginning of the file to aa ff d9. Note again that you need as many N commands in -e '1{N;N}' as there are bytes in your 2nd pattern, minus one.
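To see why a pattern of n bytes needs n-1 N commands, here is a small sketch of the sliding-window idea behind the 2nd sed. The input lines are made-up stand-ins for `xxd -c1 -p` output, the variable names are hypothetical, and `q` is used in place of the GNU-only `Q1`, so the exit status is not used to signal a match here.

```shell
# Stand-in for `xxd -c1 -p` output: one hex byte per line (values are made up).
hex_lines=$(printf '%s\n' 11 22 aa ff d9 33 44)

# 3-byte pattern -> 2 N commands to prime a 3-line window on line 1.
# Afterwards, P prints the oldest line, N appends the next one and D drops
# the oldest, so the window always holds exactly 3 lines.
result=$(printf '%s\n' "$hex_lines" |
  sed -n -e '1{N;N;}' -e '/aa\nff\nd9/{p;q;}' -e 'P;N;D')

# Everything up to and including the pattern, one byte per line.
echo "$result" | tr '\n' ' '
```

With one N fewer, the window would only hold two lines and the three-line address could never match; with one more, a pattern located at the very start of the input would be missed.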

Again, a test is needed to check whether the 2nd pattern is found, and to delete the file if it is not.

Note that the Q command is a GNU extension to sed. If you do not have it, you need to discard the rest of the file once the pattern is found (in a loop like the one in the 1st sed, but without printing), and check after the hex-to-binary conversion that new_file ends with the right pattern.
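The final check could look something like the sketch below. It is only an illustration: the sample file is generated on the spot, the names `workdir`, `expected`, `tail_hex` and `result` are hypothetical, and od is used instead of xxd purely to keep the example dependent on POSIX tools only.

```shell
# Hypothetical output file: start pattern, "DATA", then the end pattern aa ff d9.
workdir=$(mktemp -d)
printf '\377\330\377\320DATA\252\377\331' > "$workdir/new_file"
expected="aaffd9"

# tail -c 3 keeps the last 3 bytes, od prints them in hex, tr strips the spacing.
tail_hex=$(tail -c 3 "$workdir/new_file" | od -An -v -tx1 | tr -d ' \n')

if [ "$tail_hex" = "$expected" ]; then
    result=kept
else
    # The end pattern was never reached: discard the incomplete extract.
    rm -f "$workdir/new_file"
    result=removed
fi
echo "$result"
```

The number passed to tail -c has to match the byte length of the end pattern, in the same way the number of N commands does.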