Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1629908/

Date: 2020-08-03 17:49:44  Source: igfitidea

Split output of command by columns using Bash?

Tags: linux, bash, pipe

Asked by flybywire

I want to do this:

  1. run a command
  2. capture the output
  3. select a line
  4. select a column of that line

Just as an example, let's say I want to get the command name from a $PID (please note this is just an example; I'm not suggesting this is the easiest way to get a command name from a process id. My real problem is with another command whose output format I can't control).

If I run ps I get:


  PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps

Now I do ps | egrep 11383 and get:

11383 pts/1    00:00:00 bash

Next step: ps | egrep 11383 | cut -d" " -f 4. Output is:

<absolutely nothing/>

The problem is that cut cuts the output by single spaces, and as ps adds some spaces between the 2nd and 3rd columns to keep some resemblance of a table, cut picks an empty string. Of course, I could use cut to select the 7th and not the 4th field, but how can I know, especially when the output is variable and unknown beforehand?
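To make the empty fields visible, here is a small sketch. It assumes a hard-coded copy of the ps line from above (not live ps output) and prints a few fields in brackets so that empty ones show up:

```shell
# Sample line copied from the ps output above (hard-coded for illustration).
line='11383 pts/1    00:00:00 bash'

# With a single-space delimiter, every extra space starts a new, empty field.
field2=$(printf '%s\n' "$line" | cut -d ' ' -f 2)
field4=$(printf '%s\n' "$line" | cut -d ' ' -f 4)
field7=$(printf '%s\n' "$line" | cut -d ' ' -f 7)

printf 'field 2: [%s]\n' "$field2"
printf 'field 4: [%s]\n' "$field4"
printf 'field 7: [%s]\n' "$field7"
```

With four spaces between the 2nd and 3rd columns, fields 3 to 5 are empty and the command name only shows up as field 7.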

Accepted answer by unwind

One easy way is to add a pass of tr to squeeze any repeated field separators out:

$ ps | egrep 11383 | tr -s ' ' | cut -d ' ' -f 4
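The same pipeline can be wrapped in a small helper and tried on a fixed line. This is only a sketch: squeeze_col is a made-up name, and the sample line is hard-coded rather than taken from a live ps.

```shell
# squeeze_col N: print column N of stdin after squeezing runs of spaces.
# (Hypothetical helper name, not part of any standard tool.)
squeeze_col() {
    tr -s ' ' | cut -d ' ' -f "$1"
}

# Sample line copied from the question (hard-coded for illustration).
line='11383 pts/1    00:00:00 bash'

cmd=$(printf '%s\n' "$line" | squeeze_col 4)
printf '%s\n' "$cmd"
```

Note that this only behaves as expected when the line does not start with a space.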

Answer by soulmerge

Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:

command | head -n 6 | tail -n 1 | awk '{print $4}'
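As a sketch of the line-then-word selection, the snippet below runs the head/tail/awk pipeline on generated input and compares it with a variant that lets awk pick the line itself via NR. The sample function and its a1..d7 data are invented for the demonstration.

```shell
# sample: stand-in for "command"; prints 7 lines of 4 words each (made-up data).
sample() {
    for i in 1 2 3 4 5 6 7; do
        printf 'a%s b%s c%s d%s\n' "$i" "$i" "$i" "$i"
    done
}

# Line 6, word 4, using head and tail to pick the line.
via_head_tail=$(sample | head -n 6 | tail -n 1 | awk '{print $4}')

# Same selection with awk alone: NR is the current line number.
via_awk_only=$(sample | awk 'NR == 6 {print $4; exit}')

printf '%s\n%s\n' "$via_head_tail" "$via_awk_only"
```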

Answer by brianegge

I think the simplest way is to use awk. Example:

$ echo "11383 pts/1    00:00:00 bash" | awk '{ print $4; }'
bash

Answer by James Anderson

Try:

# In Bash a plain pipe is enough; "|&" with "read -p" is ksh coprocess syntax.
ps |
while read -r first second third fourth etc ; do
   if [[ $first == '11383' ]]
   then
       echo "got: $fourth"
   fi
done
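The read loop can be tried without a live process table. In this sketch cmd_for_pid is a hypothetical helper name and the input is a hard-coded copy of the ps output from the question; it sticks to POSIX constructs, so its variables are global.

```shell
# cmd_for_pid PID: read ps-style lines on stdin, print the 4th column
# of every line whose first column equals PID. (Hypothetical helper.)
cmd_for_pid() {
    wanted=$1
    while read -r pid tty time cmd; do
        if [ "$pid" = "$wanted" ]; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Hard-coded copy of the ps output shown in the question.
sample='  PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps'

result=$(printf '%s\n' "$sample" | cmd_for_pid 11383)
printf 'got: %s\n' "$result"
```

Because read splits on runs of whitespace and strips leading blanks, the right-aligned header line and the variable spacing need no special handling.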

Answer by P Shved

Instead of doing all these greps and stuff, I'd advise you to use ps's ability to change its output format:

ps -o cmd= -p 12345

You get the command line of the process with the specified pid and nothing else.

The -o and -p options are specified by POSIX and may thus be considered portable; note, however, that the POSIX format names are comm and args, and cmd is a procps alias for args.

Answer by frayser

Using array variables

set $(ps | egrep "^11383 "); echo $4

or

A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}
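A closely related variant, shown here only as a sketch, fills the array with read -a instead of an unquoted $(...). Unlike the unquoted expansion, read does not perform pathname expansion on the words. The sample line is hard-coded, and the snippet needs Bash for arrays and the here-string.

```shell
#!/usr/bin/env bash
# Sample line copied from the question (hard-coded for illustration).
line='11383 pts/1    00:00:00 bash'

# read -a splits the line on IFS (whitespace by default) into an array.
read -r -a fields <<< "$line"

count=${#fields[@]}
fourth=${fields[3]}      # arrays are zero-based, so index 3 is column 4

printf '%s columns, 4th is %s\n' "$count" "$fourth"
```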

Answer by Xennex81

Please note that the tr -s ' ' option will not remove any single leading spaces. If your column is right-aligned (as with the ps pid)...

$ ps h -o pid,user -C ssh,sshd | tr -s " "
 1543 root
19645 root
19731 root

Then cutting will result in a blank line for some of those fields if it is the first column:

$ <previous command> | cut -d ' ' -f1

19645
19731

Unless you precede it with a space, obviously:

$ <command> | sed -e "s/.*/ &/" | tr -s " "
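The leading-space effect is easy to reproduce on fixed input. The sketch below uses two hard-coded lines in place of the ps output and, as a swapped-in alternative to prefixing a space, strips leading spaces with sed before cutting:

```shell
# Two lines standing in for right-aligned ps output (hard-coded sample).
sample_lines() {
    printf ' 1543 root\n19645 root\n'
}

# Naive: the first line starts with a space, so its field 1 is empty.
first_naive=$(sample_lines | tr -s ' ' | cut -d ' ' -f 1 | head -n 1)

# Alternative: remove leading spaces first, then field 1 is the pid.
first_fixed=$(sample_lines | sed -e 's/^ *//' | tr -s ' ' | cut -d ' ' -f 1 | head -n 1)

printf 'naive: [%s]\n' "$first_naive"
printf 'fixed: [%s]\n' "$first_fixed"
```

With the space-prefix approach every line gets an empty first field instead, so the pid would be field 2 on all lines.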

Now, for this particular case of pid numbers (not names), there is a command called pgrep:

$ pgrep ssh


Shell functions


However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read command:

$ <command> | while read a b; do echo $a; done

The first parameter to read, a, selects the first column, and if there is more, everything else will be put in b. As a result, you never need more variables than the number of your column plus one.
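A quick way to see the "everything else goes into the last variable" behaviour is to feed read a fixed line through a here-document. The line is a hard-coded sample from the question and the variable names are arbitrary:

```shell
# Sample line copied from the question (hard-coded for illustration).
line='11383 pts/1    00:00:00 bash'

# Two variables: first gets column 1, rest gets the remainder of the line.
read -r first rest <<EOF
$line
EOF

# Four variables: c is column 3, d collects column 4 and anything after it.
read -r a b c d <<EOF
$line
EOF

printf 'first=%s\nrest=%s\nc=%s\n' "$first" "$rest" "$c"
```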

So,

while read a b c d; do echo $c; done

will then output the 3rd column. As indicated in my comment...

A piped read is executed in a subshell environment, so the variables it sets are not passed back to the calling script. Capture the output instead:

out=$(ps whatever | { read a b c d; echo $c; })

arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]}     # prints the value of $b (the second column)
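The subshell effect can be checked directly. This sketch assumes Bash with its default settings (the lastpipe option off) and a hard-coded sample line; it sets a marker value, tries to overwrite it through a pipe, and then reads the same line from a here-document, which runs in the current shell:

```shell
# Sample line copied from the question (hard-coded for illustration).
line='11383 pts/1    00:00:00 bash'

c=unset_in_parent

# The braces run in a subshell because they are part of a pipeline,
# so this assignment to c is lost when the pipeline ends.
printf '%s\n' "$line" | { read -r a b c d; }
after_pipe=$c

# A here-document redirect keeps read in the current shell.
read -r a b c d <<EOF
$line
EOF
after_heredoc=$c

printf 'after pipe:    %s\n' "$after_pipe"
printf 'after heredoc: %s\n' "$after_heredoc"
```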


The Array Solution


So we end up with the answer by @frayser, which relies on the shell variable IFS (by default space, tab, and newline) to split the string into an array. It only works in Bash, though; Dash and Ash do not support arrays. I have had a really hard time splitting a string into components on a Busybox system. It is easy enough to get a single component (e.g. using awk) and then repeat that for every parameter you need. But then you end up repeatedly calling awk on the same line, or repeatedly using a read block with echo on the same line, which is neither efficient nor pretty. So you end up splitting with ${name%% *} and so on. It makes you yearn for some Python skills, because shell scripting is not a lot of fun anymore if half or more of the features you are accustomed to are gone. But you can assume that even Python would not be installed on such a system, and it wasn't ;-).
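For what it is worth, the ${name%% *} style of splitting can be packed into one small function that avoids arrays and external commands. This is only a sketch: nth_field is a made-up name, it splits on spaces only, and its variables are global because plain POSIX sh has no local.

```shell
# nth_field N STRING: print the Nth space-separated field of STRING
# using only parameter expansion. (Hypothetical helper name.)
nth_field() {
    n=$1
    rest=$2
    # Drop leading spaces so that field 1 is never empty.
    while [ "${rest# }" != "$rest" ]; do rest=${rest# }; done
    while [ "$n" -gt 1 ]; do
        case $rest in
            *' '*) rest=${rest#* } ;;      # drop the first field and one space
            *)     rest=''; break ;;       # fewer than N fields
        esac
        # Drop any further spaces that padded the column.
        while [ "${rest# }" != "$rest" ]; do rest=${rest# }; done
        n=$((n - 1))
    done
    printf '%s\n' "${rest%% *}"
}

# Sample line copied from the question (hard-coded for illustration).
line='11383 pts/1    00:00:00 bash'

third=$(nth_field 3 "$line")
fourth=$(nth_field 4 "$line")
printf '%s %s\n' "$third" "$fourth"
```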

Answer by fedorqui 'SO stop harming'

Your command

ps | egrep 11383 | cut -d" " -f 4

misses a tr -s to squeeze spaces, as unwind explains in his answer.

However, you may want to use awk, since it handles all of these actions in a single command:

ps | awk '/11383/ {print $4}'

This prints the 4th column of those lines containing 11383. If you want this to match 11383 only when it appears at the beginning of the line, then you can say ps | awk '/^11383/ {print $4}'.
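A regex anchored at the start of the line still matches longer pids that begin with the same digits. As a sketch of a stricter variant, the snippet below compares the first field for equality instead. The sample text is invented (the 113830 line is made up to show the difference) and pid is just an awk variable name chosen here.

```shell
# Hard-coded sample; the 113830 line is invented to show the difference.
sample='  PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
113830 pts/2    00:00:00 vim'

# Anchored regex: matches every line that starts with 11383.
loose=$(printf '%s\n' "$sample" | awk '/^11383/ {print $4}')

# Field comparison: matches only when column 1 equals the pid.
exact=$(printf '%s\n' "$sample" | awk -v pid=11383 '$1 == pid {print $4}')

printf 'loose:\n%s\n' "$loose"
printf 'exact:\n%s\n' "$exact"
```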

Answer by Chris Koknat

Similar to brianegge's awk solution, here is the Perl equivalent:

ps | egrep 11383 | perl -lane 'print $F[3]'

-a enables autosplit mode, which populates the @F array with the column data.
Use the option -F, (with the trailing comma) if your data is comma-delimited rather than space-delimited.

Field 3 is printed since Perl starts counting from 0 rather than 1.

Answer by dman

Bash's set will parse all output into positional parameters.

For instance, after the command set $(free -h), echo $7 will show "Mem:".
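To try this without depending on the local output of free, the sketch below feeds set a hard-coded sample (the numbers are made up; only the layout mimics free -h). It uses set -- so that a first word starting with a dash is not taken as an option, and set -f so that the unquoted expansion is not subject to globbing.

```shell
# Made-up sample shaped like the first two lines of "free -h".
sample='               total        used        free      shared  buff/cache   available
Mem:            15Gi       4.2Gi       8.1Gi       312Mi       3.1Gi        10Gi'

set -f              # disable pathname expansion while splitting
set -- $sample      # unquoted on purpose: split on whitespace into $1, $2, ...
set +f

count=$#
seventh=$7          # six header words come first, so $7 is the row label
eighth=$8

printf '%s words, $7=%s, $8=%s\n' "$count" "$seventh" "$eighth"
```

Keep in mind that set replaces the script's own positional parameters, so save them first if they are still needed.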