Python: Airbnb Airflow vs Apache NiFi

Disclaimer: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/39399065/

Time: 2020-08-19 22:12:45  Source: igfitidea


Tags: python, apache-nifi, airflow

Asked by CMPE

Do Airflow and NiFi perform the same job on workflows? What are the pros and cons of each? I need to read some JSON files, add custom metadata to them, and put them in a Kafka queue to be processed. I was able to do it in NiFi. I am still working on Airflow. I am trying to choose the best workflow engine for my project. Thank you!


Answered by JDP10101

For a great overview of Airflow and Apache NiFi, check out this reddit post: https://www.reddit.com/r/bigdata/comments/51mgk6/comparing_airbnb_airflow_and_apache_nifi/


For your specific use case of ingesting JSON files, enriching them, and routing them to Kafka, I believe NiFi is the right tool for the job. A few processors you could potentially use, along with the documentation for each, are listed below:


GetFile: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.9.2/org.apache.nifi.processors.standard.GetFile/index.html


JoltTransformJSON: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.9.2/org.apache.nifi.processors.standard.JoltTransformJSON/index.html


PublishKafka (or PublishKafka_0_10 depending on your version): https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-kafka-0-9-nar/1.9.2/org.apache.nifi.processors.kafka.pubsub.PublishKafka/index.html

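To make the three stages concrete (read files, enrich, publish), here is a minimal plain-Python sketch of the same flow. It is not NiFi or Airflow code, only an illustration of what the processors above do conceptually. The names `enrich`, `process_directory`, `demo`, and the `source_file` and `pipeline` metadata keys are hypothetical, each file is assumed to hold a single JSON object, and the Kafka producer is replaced by an injected `publish` callable so the example runs without a broker.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def enrich(record: dict, metadata: dict) -> dict:
    """Return a copy of the record with the extra metadata keys merged in."""
    return {**record, **metadata}


def process_directory(
    input_dir: Path,
    metadata: dict,
    publish: Callable[[bytes], None],
) -> int:
    """Read every *.json file, enrich it, and hand the encoded bytes to publish.

    Assumes each file contains one JSON object. `publish` stands in for
    whatever sends a message to Kafka in a real deployment.
    """
    count = 0
    for path in sorted(input_dir.glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        # Hypothetical metadata: record which file the message came from.
        enriched = enrich(record, {**metadata, "source_file": path.name})
        publish(json.dumps(enriched, sort_keys=True).encode("utf-8"))
        count += 1
    return count


def demo() -> list:
    """Run the flow on a temporary directory with an in-memory stand-in for Kafka."""
    sent = []
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "a.json").write_text('{"id": 1}', encoding="utf-8")
        process_directory(Path(tmp), {"pipeline": "demo"}, sent.append)
    return sent


if __name__ == "__main__":
    for message in demo():
        print(message.decode("utf-8"))
```

With a real broker, `publish` could wrap a Kafka client's send call instead of `sent.append`; which client and topic to use is left open here, since the question does not specify them.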