Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/6893714/

Time: 2020-09-18 00:29:20  Source: igfitidea

Why does ps o/p list the grep process after the pipe?

linux bash pipe ps

Asked by Ankur Agarwal

When I do

$ ps -ef | grep cron

I get

root      1036     1  0 Jul28 ?        00:00:00 cron
abc    21025 14334  0 19:15 pts/2    00:00:00 grep --color=auto cron

My question is why do I see the second line. From my understanding, ps lists the processes and pipes the list to grep. grep hasn't even started running while ps is listing processes, so how come the grep process is listed in the o/p?

Related second question:

When I do

$ ps -ef | grep [c]ron

I get only

root      1036     1  0 Jul28 ?        00:00:00 cron

What is the difference between the first and second grep executions?

Answered by dAm2K

When you execute the command:

ps -ef | grep cron

the shell you are using

(...I assume bash in your case; because of grep's color attribute I think you are running a GNU system like a Linux distribution, but it's the same on other Unix systems and shells as well...)

will execute the pipe() call to create a pipe (an anonymous FIFO), then it will fork() (make a running copy of itself). This creates a new child process. The new child process will close() its standard output file descriptor (fd 1) and attach fd 1 to the write side of the pipe created by the parent process (the shell where you executed the command). This is possible because the fork() syscall leaves each process with a valid open file descriptor (the pipe fd in this case). After doing so it will exec() the first (in your case) ps command found in your PATH environment variable. With the exec() call the process becomes the command you executed.

So, you now have the shell process with a child that is, in your case, the ps command with the -ef arguments.

At this point, the parent (the shell) fork()s again. This newly generated child process close()s its standard input file descriptor (fd 0) and attaches fd 0 to the read side of the pipe created by the parent process (the shell where you executed the command).

After doing so it will exec() the first (in your case) grep command found in your PATH environment variable.

Now you have the shell process with two children (that are siblings), where the first one is the ps command with the -ef arguments and the second one is the grep command with the cron argument. The read side of the pipe is attached to the STDIN of the grep command and the write side is attached to the STDOUT of the ps command: the standard output of the ps command is attached to the standard input of the grep command.
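
The plumbing described above can be imitated by hand. The following is a minimal sketch, not what bash literally does: a named FIFO created with mkfifo stands in for the anonymous pipe, and a printf with two made-up lines stands in for ps. The names workdir, fifo and matched are chosen here for illustration only.

```shell
# Sketch only: a named FIFO stands in for the anonymous pipe the shell creates.
workdir=$(mktemp -d)
fifo="$workdir/pipe"
mkfifo "$fifo"

# First child: its standard output (fd 1) is attached to the write side.
# printf and its two sample lines are a stand-in for "ps -ef".
printf '%s\n' 'cron' 'sshd' > "$fifo" &

# Second child: its standard input (fd 0) is attached to the read side.
matched=$(grep 'cron' < "$fifo")

wait                # reap the background writer
rm -r "$workdir"    # clean up the temporary FIFO
echo "$matched"
```

Both children are alive at the same time here: the writer cannot even finish opening the FIFO until the reader has opened the other end.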

Since ps is written to send info on each running process to its standard output, while grep is written to read its standard input and print whatever matches a given pattern, you have the answer to your first question:

  1. the shell runs: ps -ef;
  2. the shell runs: grep cron;
  3. ps sends data (which even contains the string "grep cron") to grep;
  4. grep matches its search pattern against its STDIN, and it matches the string "grep cron" because of the "cron" argument you passed to grep: you are instructing grep to match the "cron" string, and it does, because "grep cron" is a string returned by ps once grep has started its execution.

When you execute:

ps -ef | grep '[c]ron'

the argument passed instructs grep to match something containing "c" followed by "ron". This is like the first example, but in this case the pattern no longer matches the grep line returned by ps because:

  1. the shell runs: ps -ef;
  2. the shell runs: grep [c]ron;
  3. ps sends data (which even contains the string grep [c]ron) to grep;
  4. grep does not match its search pattern against the line describing itself, because a string containing "c" followed by "ron" is not found there; what it finds is a string containing "c" followed by "]ron".

GNU grep does not have any limit on the length of the string it matches, whereas on some platforms (I think Solaris, HP-UX, AIX) the length of the string is limited by the "$COLUMNS" variable or by the terminal's screen width.

Hopefully this long response clarifies the shell pipe process a bit.

TIP:

ps -ef | grep cron | grep -v grep

Answered by GoldenNewby

In your command

ps -ef | grep 'cron'

Linux is executing the "grep" command before the ps -ef command. Linux then maps the standard output (STDOUT) of "ps -ef" to the standard input (STDIN) of the grep command.

It does not execute the ps command, store the result in memory, and then pass it to grep. Think about that, why would it? Imagine if you were piping a hundred gigabytes of data.
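
The streaming behaviour is easy to observe with a producer that never finishes on its own. In this small sketch, yes would print "y" forever, yet the pipeline ends as soon as head has read three lines. That could not happen if the first command's output had to be collected in full before the second command started. The variable names first_lines and line_count are just for illustration.

```shell
# yes never stops by itself; head exits after three lines, after which
# yes is stopped when it next tries to write into the closed pipe.
first_lines=$(yes | head -n 3)
echo "$first_lines"

# Count the lines that were actually read from the endless stream.
line_count=$(printf '%s\n' "$first_lines" | grep -c 'y')
echo "lines read: $line_count"
```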

Edit: Regarding your second question:

In grep (and most regular expression engines), you can use brackets to say that you'll accept ANY one of the characters inside the brackets. So writing [c] means it will accept any character from that set, but only c is in the set. Similarly, you could use any other combination of characters.

ps aux | grep cron
root      1079  0.0  0.0  18976  1032 ?        Ss   Mar08   0:00 cron
root     23744  0.0  0.0  14564   900 pts/0    S+   21:13   0:00 grep --color=auto cron

^ That matches itself, because your own command contains "cron"

ps aux | grep [c]ron
root      1079  0.0  0.0  18976  1032 ?        Ss   Mar08   0:00 cron

That matches cron, because cron contains a c, and then "ron". It does not match your own grep command though, because its command line reads [c]ron

You can put whatever you want in the brackets, as long as it contains the c:

ps aux | grep [cbcdefadq]ron
root      1079  0.0  0.0  18976  1032 ?        Ss   Mar08   0:00 cron

If you remove the c, it won't match though, because "cron" starts with a c:

ps aux | grep [abedf]ron

^ Has no results

Edit 2

To reiterate the point, you can do all sorts of crazy stuff with grep. There's no significance in picking the first character to do this with.

ps aux | grep [c][ro][ro][n]
root      1079  0.0  0.0  18976  1032 ?        Ss   Mar08   0:00 cron

Answered by Ben Hymanson

The shell constructs your pipeline with a series of fork(), pipe() and exec() calls. Depending on the shell, any part of it may be constructed first, so grep may already be running before ps even starts. Or, even if ps starts first, it will be writing into a 4k kernel pipe buffer and will eventually block (while printing a line of process output) until grep starts up and begins consuming the data in the pipe. In the latter case, if ps is able to start and finish before grep even starts, you may not see the grep cron in the output. You may have noticed this non-determinism at play already.

回答by Zac Thompson

You wrote: "From my understanding, ps lists the processes and pipes the list to grep. grep hasn't even started running while ps is listing processes".

Your understanding is incorrect.

That is not how a pipeline works. The shell does not run the first command to completion, remember the output of the first command, and then afterwards run the next command using that data as input. No. Instead, both processes execute and their inputs/outputs are connected. As Ben Hymanson wrote, there is nothing in particular to guarantee that the processes run at the same time if they are both very short-lived and the kernel can comfortably manage the small amount of data passing through the connection. In that case, it really could happen the way you expect, but only by chance. The conceptual model to keep in mind is that they run in parallel.
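
One rough way to see the parallel model in action is to time a pipeline whose two sides each just wait. This is only a sketch with made-up durations: each side sleeps for 2 seconds, so running them one after the other would take about 4 seconds, while starting them together takes about 2. The variable names are illustrative.

```shell
# Each side sleeps for 2 seconds. Run one after the other they would need
# about 4 seconds; started together they need about 2.
start=$(date +%s)
sleep 2 | sleep 2
end=$(date +%s)

elapsed=$((end - start))
echo "pipeline took about ${elapsed}s"
```

The timing has one-second granularity, so it only distinguishes "about 2" from "about 4", which is enough for this purpose.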

If you want official sources, how about the bash man page:

  A pipeline is a sequence of one or more commands separated by the character |.  The format for a pipeline is:

         [time [-p]] [ ! ] command [ | command2 ... ]

  The  standard  output  of command is connected via a pipe to the standard input of command2.  This connection is
  performed before any redirections specified by the command (see REDIRECTION below).

  ...

  Each command in a pipeline is executed as a separate process (i.e., in a subshell).
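
The "separate process" part of that excerpt can be checked with a variable. In this small sketch (the variable name value and both strings are made up for illustration), an assignment made on the left-hand side of a pipeline is not visible afterwards, because it happened in a child process rather than in the shell that started the pipeline.

```shell
value="set in the parent shell"

# The left-hand side of the pipeline runs in its own process, so the
# assignment below changes only that child's copy of the variable.
{ value="changed inside the pipeline"; echo "$value"; } | cat

# Back in the parent shell the original value is still there.
echo "$value"
```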


As for your second question (which is not really related at all, I am sorry to say), you are just describing a feature of how regular expressions work. The regular expression cron matches the string cron. The regular expression [c]ron does not match the string [c]ron. Thus the first grep command will find itself in a process list, but the second one will not.
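
This can be tried without involving ps at all. The sketch below feeds grep two hard-coded listings, each with two lines abbreviated from the output shown in the question: one for the cron daemon and one for the grep command itself. The listings and variable names are illustrative stand-ins, not real process data.

```shell
# Two made-up ps lines per case: the cron daemon and the grep command itself.
listing_plain='root  1036     1  0 Jul28 ?      00:00:00 cron
abc  21025 14334  0 19:15 pts/2  00:00:00 grep cron'
listing_bracket='root  1036     1  0 Jul28 ?      00:00:00 cron
abc  21025 14334  0 19:15 pts/2  00:00:00 grep [c]ron'

# grep -c prints the number of matching lines.
plain_count=$(printf '%s\n' "$listing_plain" | grep -c 'cron')
bracket_count=$(printf '%s\n' "$listing_bracket" | grep -c '[c]ron')

echo "pattern cron matched lines: $plain_count"
echo "pattern [c]ron matched lines: $bracket_count"
```

The first count includes the grep line, since "grep cron" contains the text cron. The second does not, since "grep [c]ron" contains c]ron but never the four letters cron in a row.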

Answered by Michael Berkowski

Your actual question has been answered by others, but I'll offer a tip: if you would like to avoid seeing the grep process listed, you can do it this way:

$ ps -ef | grep [c]ron

Answered by Felipe Alvarez

pgrep is sometimes better than ps -ef | grep word because it excludes the grep. Try

pgrep -f bash
pgrep -lf bash

Answered by Sudhir Meena

$ ps -ef | grep cron

The Linux shell always executes commands from right to left, so grep cron has already been executed before ps -ef runs; that's why the o/p shows the command itself.

$ ps -ef | grep [c]ron

But in this case you told grep to match ron preceded by only c, so the o/p does not include the grep command line, because that command line contains [c]ron.