Disclaimer: This page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/23814360/

Date: 2020-09-18 10:30:39  Source: igfitidea

GNU Parallel and Bash functions: How to run the simple example from the manual

Tags: bash, gnu-parallel

Asked by Fortran

I'm trying to learn GNU Parallel because I have a case where I think I could easily parallelize a bash function. So, in trying to learn, I went to the GNU Parallel manual, where there is an example... but I can't even get it working! To wit:

(232) $ bash --version
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
(233) $ cat tpar.bash
#!/bin/bash

echo `which parallel`
doit() {
  echo Doing it for $1
  sleep 2
  echo Done with $1
}
export -f doit
parallel doit ::: 1 2 3
doubleit() {
  echo Doing it for $1 $2
  sleep 2
  echo Done with $1 $2
}
export -f doubleit
parallel doubleit ::: 1 2 3 ::: a b

(234) $ bash tpar.bash
/home/mathomp4/bin/parallel
doit: Command not found.
doit: Command not found.
doit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.

As you can see, I can't even get the simple example to run. Thus, I'm probably doing something amazingly stupid and basic...but I'm at a loss.

ETA: As suggested by commenters (chmod +x, set -vx):

(27) $ ./tpar.bash

echo `which parallel`
which parallel
++ which parallel
+ echo /home/mathomp4/bin/parallel
/home/mathomp4/bin/parallel

doit() {
  echo Doing it for $1
  sleep 2
  echo Done with $1
}
export -f doit
+ export -f doit
parallel doit ::: 1 2 3
+ parallel doit ::: 1 2 3
doit: Command not found.
doit: Command not found.
doit: Command not found.
doubleit() {
  echo Doing it for $1 $2
  sleep 2
  echo Done with $1 $2
}
export -f doubleit
+ export -f doubleit
parallel doubleit ::: 1 2 3 ::: a b
+ parallel doubleit ::: 1 2 3 ::: a b
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.

ETA2: Note, I can, in the script, just call 'doit 1', say, and it will do that. So the function is valid, it just isn't...exported?

Answered by Gilles 'SO- stop being evil'

You cannot call a shell function from outside the shell where it was defined. A shell function is a concept inside the shell. The parallel command itself has no way to access it.

Calling export -f doit in bash exports the function via the environment so that it is picked up by child processes. But only bash understands bash functions. A (grand)*child bash process can call it, but other programs cannot, and that includes other shells.
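The difference between a child bash and any other program can be checked with a minimal sketch like the one below. The function name demo_greet is hypothetical and chosen only for this illustration, and the sketch assumes that no executable called demo_greet exists on PATH.

```shell
#!/bin/bash
# demo_greet is a hypothetical function used only for this illustration.
demo_greet() { echo "Hello from $1"; }
export -f demo_greet

# A child bash finds the exported function in its environment and can call it.
in_child=$(bash -c 'demo_greet child')
echo "$in_child"

# A program that is not bash only looks for an executable file with that name.
# env finds none on PATH (assumed) and reports the failure via its exit status.
if env demo_greet other 2>/dev/null; then
  not_bash_status=0
else
  not_bash_status=$?
fi
echo "env exit status: $not_bash_status"
```

The first call prints the greeting, while the second fails with a nonzero exit status, which is the same kind of failure as the "Command not found" messages in the question.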

Going by the message “Command not found”, it appears that your preferred shell is (t)csh. You need to tell parallel to invoke bash instead. parallel invokes the shell indicated by the SHELL environment variable [1], so set it to point to bash.

export SHELL=$(type -p bash)
doit () { … }
export -f doit
parallel doit ::: 1 2 3

If you only want to set SHELL for the execution of the parallel command and not for the rest of the script:

doit () { … }
export -f doit
SHELL=$(type -p bash) parallel doit ::: 1 2 3
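The effect of the one-off SHELL override can be tried without GNU Parallel installed. The following is only a sketch: run_with_shell is a hypothetical stand-in that hands its words to "$SHELL" -c, as this answer describes parallel doing; it is not GNU Parallel itself and runs nothing in parallel.

```shell
#!/bin/bash
doit() {
  echo "Doing it for $1"
  echo "Done with $1"
}
export -f doit

# Hypothetical stand-in (NOT GNU Parallel): pass the command words to
# whatever shell the SHELL variable names.
run_with_shell() {
  "$SHELL" -c "$*"
}

# One-off override: SHELL points to bash only for this single call,
# so the exported function is visible to the shell that runs it.
one_off=$(SHELL=$(type -p bash) run_with_shell doit 1)
echo "$one_off"
```

If SHELL names a shell that does not import bash functions, such as tcsh, the same call would be expected to fail the way the question shows.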

I'm not sure how to deal with remote jobs; you may need to pass --env=SHELL in addition to --env=doit (note that this assumes that the path to bash is the same everywhere).

And yes, this oddity should be mentioned more prominently in the manual. There's a brief note in the description of the command argument, but it isn't very explicit (it should explain that the command words are concatenated with a space as a separator and then passed to $SHELL -c), and SHELL isn't even listed in the environment variables section. (I encourage you to report this as a bug; I'm not doing it because I hardly ever use this program.)
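One practical consequence of joining the words with spaces before handing them to a shell is that the original word boundaries are lost. The sketch below illustrates only that joining step: count_args and run_joined are hypothetical helpers, bash is called directly instead of $SHELL so the result does not depend on the caller's environment, and GNU Parallel's own quoting of input values is not modeled.

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }
export -f count_args

# Hypothetical stand-in for "join the words with spaces, then run them
# with a shell -c". "$*" joins on the first character of IFS (a space).
run_joined() {
  bash -c "$*"
}

direct=$(count_args "a b")                  # called directly: one argument
joined=$(run_joined count_args "a b")       # re-parsed by the shell: two arguments
requoted=$(run_joined count_args "'a b'")   # inner quotes survive: one argument
echo "direct=$direct joined=$joined requoted=$requoted"
```

An argument containing spaces therefore needs a second level of quoting if it has to reach the function as a single word.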

[1] This is bad design, since SHELL is supposed to indicate a user interface preference for an interactive command-line shell, not to change the behavior of programs.

Answered by Ole Tange

Since version 20160722 you can instead use env_parallel:

doit() { echo "$@"; }
echo world | env_parallel doit Hello

You just need to activate env_parallel by adding it to .bashrc. You can add it to .bashrc by running this once:

env_parallel --install