Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/42016491/
Passing a command line argument to airflow BashOperator
Asked by Shiva
Is there a way to pass a command line argument to the Airflow BashOperator? Currently, I have a Python script that accepts a date argument and performs some specific activities, like cleaning up specific folders older than the given date.
In simplified code with just one task, what I would like to do is:
from __future__ import print_function
from airflow.operators import BashOperator
from airflow.models import DAG
from datetime import datetime, timedelta

default_args = {
    'owner' : 'airflow'
    ,'depends_on_past' : False
    ,'start_date' : datetime(2017, 1, 18)
    ,'email' : ['[email protected]']
    ,'retries' : 1
    ,'retry_delay' : timedelta(minutes=5)
}

dag = DAG(
    dag_id='data_dir_cleanup'
    ,default_args=default_args
    ,schedule_interval='0 13 * * *'
    ,dagrun_timeout=timedelta(minutes=10)
)

cleanup_task = BashOperator(
    task_id='task_1_data_file_cleanup'
    ,bash_command='python cleanup.py --date $DATE 2>&1 >> /tmp/airflow/data_dir_cleanup.log'
    #--------------------------------------^^^^^^-- (DATE variable which would have been given on command line)
    #,env=env
    ,dag=dag
)
Thanks in advance,
Answered by Bolke de Bruin
The BashOperator is templated with Jinja2, meaning that you can pass arbitrary values. In your case it would be something like:
cleanup_task = BashOperator(
    task_id='task_1_data_file_cleanup'
    # Values passed via params are reached in the template as params.<key>
    ,bash_command="python cleanup.py --date {{ params.DATE }} 2>&1 >> /tmp/airflow/data_dir_cleanup.log"
    ,params = {'DATE' : 'this-should-be-a-date'}
    ,dag=dag
)
See also: https://airflow.incubator.apache.org/tutorial.html#templating-with-jinja for a broader example.
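For the receiving side, here is a minimal sketch of how a script like cleanup.py might parse the --date value that the rendered command passes in. The question does not show cleanup.py, so the function names and the YYYY-MM-DD format below are assumptions for illustration only.

```python
import argparse
from datetime import date, datetime


def parse_date(text):
    """Convert a YYYY-MM-DD string into a date, or report a usage error."""
    try:
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError("expected YYYY-MM-DD, got %r" % text)


def build_parser():
    """Build a parser for a hypothetical cleanup script with one --date option."""
    parser = argparse.ArgumentParser(description="Hypothetical cleanup script")
    parser.add_argument("--date", type=parse_date, required=True,
                        help="clean up folders older than this date (YYYY-MM-DD)")
    return parser


# Simulate the argument list that the rendered bash_command would produce.
args = build_parser().parse_args(["--date", "2017-01-18"])
print("Would clean up folders older than", args.date.isoformat())
```

Validating the date inside the script means a badly rendered template fails fast with a usage error rather than cleaning up the wrong folders.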
Answered by Andreas Vrangas
You can try the following (worked for me):
cmd_command = "python path_to_task/[task_name.py] '{{ execution_date }}' '{{ prev_execution_date }}'"
t = BashOperator(
    task_id = 'some_id',
    bash_command = cmd_command,
    dag = your_dag_object_name)
When I did so, it rendered the variables and worked well. I believe it works for all variables (notice that I've put the word 'python' at the start of my command because I want to run a .py script).
My task is written to read those variables as command line arguments (via sys.argv).
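As a rough sketch of what reading those two values from sys.argv could look like: the helper name read_dates is hypothetical, and the exact string that Airflow renders for {{ execution_date }} depends on the Airflow version, so the ISO-style timestamps used here are an assumption.

```python
import sys
from datetime import datetime


def read_dates(argv):
    """Return (execution_date, prev_execution_date) parsed from an argv-style list."""
    if len(argv) != 3:
        raise SystemExit("usage: task_name.py EXECUTION_DATE PREV_EXECUTION_DATE")
    # datetime.fromisoformat (Python 3.7+) accepts either a space or a 'T'
    # between the date and time parts.
    return datetime.fromisoformat(argv[1]), datetime.fromisoformat(argv[2])


# In the real script this would be read_dates(sys.argv); a simulated list is
# used here so the sketch can run on its own.
current, previous = read_dates(
    ["task_name.py", "2017-01-18T13:00:00", "2017-01-17 13:00:00"]
)
print((current - previous).days)  # prints 1
```

Quoting the templates in the command, as in '{{ execution_date }}', matters here because a rendered timestamp that contains a space would otherwise be split into two arguments.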
Answered by wei
Try os.system("YOUR COMMAND HERE")
Answered by user7126545
BashOperator is Jinja templated, so params can be passed as a dictionary.
Airflow schedules the task and does not prompt you for parameters, so passing a specific date on the command line, as you describe, is not possible. However, Airflow has the notion of an execution date, which is the date on which the DAG is scheduled to run, and it can be passed to the BashOperator using the macros {{ ds }} or {{ ds_nodash }} (https://airflow.incubator.apache.org/code.html#macros).
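To make the two macros concrete, the following sketch reproduces their output shapes with plain datetime formatting: {{ ds }} has the form YYYY-MM-DD and {{ ds_nodash }} the form YYYYMMDD. The execution date below is an example value, not something taken from a real DAG run.

```python
from datetime import datetime

# Example execution date, chosen only for illustration.
execution_date = datetime(2017, 1, 18, 13, 0)

ds = execution_date.strftime("%Y-%m-%d")       # same shape as {{ ds }}
ds_nodash = execution_date.strftime("%Y%m%d")  # same shape as {{ ds_nodash }}

print(ds, ds_nodash)  # prints 2017-01-18 20170118
```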
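env = {}
env['DATE'] = '{{ ds }}'
cleanup_task = BashOperator(
    task_id='task_1_data_file_cleanup'
    ,bash_command='python cleanup.py --date $DATE 2>&1 >> /tmp/airflow/data_dir_cleanup.log'
    # env (not params) is what exposes DATE to the shell as $DATE; env is a templated field.
    # Note: a non-None env replaces the inherited environment, so PATH may need to be included.
    ,env=env
    ,dag=dag
)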
The DATE value is then available to the bash command and can be used like any other bash variable, as $DATE.
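The mechanism behind $DATE can be seen outside Airflow with the standard subprocess module. This is only an illustration under stated assumptions: the helper run_with_date is hypothetical, sh stands in for the bash process that BashOperator would start, and PATH is copied over by hand because an explicit env mapping replaces the inherited environment.

```python
import os
import subprocess


def run_with_date(date_text):
    """Run a shell command that reads DATE from its environment and return its output."""
    # Only the variables listed here are visible to the child process.
    env = {"DATE": date_text, "PATH": os.environ.get("PATH", os.defpath)}
    result = subprocess.run(
        ["sh", "-c", 'echo "date is $DATE"'],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(run_with_date("2017-01-18"))  # prints: date is 2017-01-18
```

A value that is never placed in the process environment is not visible as a shell variable, which is why a dictionary intended for $DATE has to reach the command as environment variables.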