bash: Can I chain multiple commands and make all of them take the same input from stdin?

Disclaimer: this Q&A is reproduced from StackOverflow under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/986017/


Can I chain multiple commands and make all of them take the same input from stdin?

bash, unix, shell, awk

Asked by Brian Agnew

In bash, is there a way to chain multiple commands, all taking the same input from stdin? That is, one command reads stdin, does some processing, writes the output to a file. The next command in the chain gets the same input as what the first command got. And so on.

For example, consider a large text file to be split into multiple files by filtering the content. Something like this:

cat food_expenses.txt | grep "coffee" > coffee.txt | grep "tea" > tea.txt | grep "honey cake" > cake.txt

This obviously does not work, because the second grep gets the first grep's output, not the original text file. I tried inserting tee's, but that does not help. Is there some bash magic that can cause the first grep to send its input to the pipe, not the output?

And by the way, splitting a file was a simple example. Consider splitting (filtering by pattern search) a continuous live text stream coming over a network and writing the output to different named pipes or sockets. I would like to know if there is an easy way to do it using a shell script.

(This question is a cleaned-up version of my earlier one, based on responses that pointed out the unclearness.)

Accepted answer by Nate Kohl

I like Stephen's idea of using awk instead of grep.

It ain't pretty, but here's a command that uses output redirection to keep all data flowing through stdout:

cat food.txt | 
awk '/coffee/ {print $0 > "/dev/stderr"} {print $0}' 2> coffee.txt | 
awk '/tea/ {print $0 > "/dev/stderr"} {print $0}' 2> tea.txt

As you can see, it uses awk to send all lines matching 'coffee' to stderr, and all lines regardless of content to stdout. Then stderr is fed to a file, and the process repeats with 'tea'.

If you wanted to filter out content at each step, you might use this:

cat food.txt | 
awk '/coffee/ {print $0 > "/dev/stderr"} $0 !~ /coffee/ {print $0}' 2> coffee.txt | 
awk '/tea/ {print $0 > "/dev/stderr"} $0 !~ /tea/ {print $0}' 2> tea.txt

Answer by Mark Edgar

For this example, you should use awk as semiuseless suggests.

But in general, to have N arbitrary programs read a copy of a single input stream, you can use tee and bash's process output substitution operator:

tee <food_expenses.txt \
  >(grep "coffee" >coffee.txt) \
  >(grep "tea" >tea.txt) \
  >(grep "honey cake" >cake.txt)

Note that >(command) is a bash extension.

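If you need the same fan-out on a shell without process substitution, roughly the same effect can be had with named pipes. This is an untested sketch; the fifo names are illustrative:

# Named-pipe alternative to >(...) -- an untested sketch; fifo names are illustrative
mkfifo coffee.fifo tea.fifo
grep "coffee" <coffee.fifo >coffee.txt &              # readers start first, in the background
grep "tea" <tea.fifo >tea.txt &
tee coffee.fifo tea.fifo <food_expenses.txt >/dev/null # fan the input out to both fifos
wait                                                   # let the background greps finish
rm coffee.fifo tea.fifo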

Answer by Brian Agnew

The obvious question is: why do you want to do this within one command?

If you don't want to write a script, and you want to run stuff in parallel, bash supports the concept of subshells, and these can run in parallel. By putting your commands in brackets, you can run your greps (or whatever) concurrently, e.g.

$ (grep coffee food_expenses.txt > coffee.txt) && (grep tea food_expenses.txt > tea.txt)
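
Note that chaining with && actually runs the two subshells one after the other; if you want them genuinely concurrent (and the input is a regular file rather than a stream), a hedged variant would background them instead:

(grep coffee food_expenses.txt > coffee.txt) &
(grep tea food_expenses.txt > tea.txt) &
wait   # block until both background subshells have finished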

Note that in the above your cat may be redundant, since grep takes an input file argument.

You can (instead) play around with redirecting output through different streams. You're not limited to stdout/stderr but can assign new streams as required. I can't advise more on this, other than to direct you to the examples here.

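As one concrete illustration of extra streams, here is an untested sketch that opens descriptors 3 and 4 for the two output files and lets a single awk pass write to them through /dev/fd (the descriptor numbers and file names are my own choices):

exec 3> coffee.txt 4> tea.txt    # open two additional output streams
awk '/coffee/ {print > "/dev/fd/3"}
     /tea/    {print > "/dev/fd/4"}' food_expenses.txt
exec 3>&- 4>&-                   # close the extra descriptors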

Answer by Paused until further notice.

Here are two bash scripts without awk. The second one doesn't even use grep!

With grep:

#!/bin/bash
tail -F food_expenses.txt | \
while read line
do
    for word in "coffee" "tea" "honey cake"
    do
        if [[ $line != ${line#*$word*} ]]
        then
            echo "$line"|grep "$word" >> ${word#* }.txt # use the last word in $word for the filename (i.e. cake.txt for "honey cake")
        fi
    done
done

Without grep:

#!/bin/bash
tail -F food_expenses.txt | \
while read line
do
    for word in "coffee" "tea" "honey cake"
    do
        if [[ $line != ${line#*$word*} ]] # does the line contain the word?
        then
            echo "$line" >> ${word#* }.txt # use the last word in $word for the filename (i.e. cake.txt for "honey cake")
        fi
    done
done
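
As a side note on the read loops above, a slightly more defensive (untested) variant uses IFS= and read -r so that leading whitespace and backslashes survive, and a case statement in place of the parameter-expansion test; here the first matching pattern wins:

#!/bin/bash
tail -F food_expenses.txt | \
while IFS= read -r line          # -r keeps backslashes; empty IFS keeps leading spaces
do
    case $line in
        *"honey cake"*) echo "$line" >> cake.txt ;;   # first matching pattern wins
        *coffee*)       echo "$line" >> coffee.txt ;;
        *tea*)          echo "$line" >> tea.txt ;;
    esac
done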

Edit:

Here's an AWK method:

awk 'BEGIN {
         list = "coffee tea";
         split(list, patterns)
     }
     {
         for (pattern in patterns) {
             if ($0 ~ patterns[pattern]) { print > (patterns[pattern] ".txt") } } }' food_expenses.txt

Working with patterns which include spaces remains to be resolved.

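One untested way around that limitation is to delimit the pattern list with commas instead of spaces and use the last word of each pattern for the file name (the comma delimiter is my own assumption):

awk 'BEGIN {
         n = split("coffee,tea,honey cake", patterns, ",")   # commas delimit patterns
     }
     {
         for (i = 1; i <= n; i++) {
             if ($0 ~ patterns[i]) {
                 m = split(patterns[i], words, " ")
                 print > (words[m] ".txt")                   # last word names the file, e.g. cake.txt
             }
         }
     }' food_expenses.txt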

Answer by Stan Graves

I am unclear why the filtering needs to be done in different steps. A single awk program can scan all the incoming lines and dispatch the appropriate lines to individual files. This is a very simple dispatch that can feed multiple secondary commands (i.e. persistent processes that monitor the output files for new input, or the files could be sockets that are set up ahead of time and written to by the awk process).

If there is a reason to have every filter see every line, then just remove the "next;" statements, and every filter will see every line.

$ cat split.awk
BEGIN{}
/^coffee/ {
    print $0 >> "/tmp/coffee.txt" ; next; }
/^tea/ { print $0 >> "/tmp/tea.txt" ; next; }
{ # default
    print $0 >> "/tmp/other.txt" ; }
END {}
$
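
Running it is just a matter of pointing awk at the script and the input file, e.g.:

$ awk -f split.awk food_expenses.txt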

Answer by Stephen Darlington

You could use awk to split into up to two files:

awk '/Coffee/ { print "Coffee" } /Tea/ { print "Tea" > "/dev/stderr" }' inputfile > coffee.file.txt 2> tea.file.txt
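
If you want the full matching lines rather than the literal words, the same trick works with a plain print (a minor untested variant):

awk '/Coffee/ { print } /Tea/ { print > "/dev/stderr" }' inputfile > coffee.file.txt 2> tea.file.txt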

Answer by nik

You can probably write a simple AWK script to do this in one shot. Can you describe the format of your file a little more?

  • Is it space/comma separated?
  • Do you have the item descriptions in a specific 'column', where columns are defined by some separator like space, comma, or something else?

If you can afford multiple grep runs, this will work:

grep coffee food_expenses.txt > coffee.txt
grep tea food_expenses.txt > tea.txt

and so on.

Answer by Aftermathew

Assuming that your input is not infinite (as in the case of a network stream that you never plan on closing), I might consider using a subshell to put the data into a temp file, and then a series of other subshells to read it. I haven't tested this, but maybe it would look something like this: { cat inputstream > tempfile; }; { grep tea tempfile > tea.txt; }; { grep coffee tempfile > coffee.txt; }

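A slightly fleshed-out version of that idea might use mktemp and a trap so the temp file gets cleaned up afterwards; this is an untested sketch and 'inputstream' is still a placeholder:

#!/bin/bash
tmpfile=$(mktemp) || exit 1
trap 'rm -f "$tmpfile"' EXIT     # remove the temp file on any exit

cat inputstream > "$tmpfile"     # capture the (finite) input once
grep tea "$tmpfile" > tea.txt
grep coffee "$tmpfile" > coffee.txt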

I'm not certain of an elegant solution to the file getting too large if your input stream is not bounded in size, however.
