Java: distributed job scheduling, management, and reporting

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1914884/

Time: 2020-10-29 18:30:08  Source: igfitidea

Distributed Job scheduling, management, and reporting

Tags: java, hadoop, distributed-computing, job-scheduling

Asked by teabot

I recently had a play around with Hadoop and was impressed with its scheduling, management, and reporting of MapReduce jobs. It appears to make the distribution and execution of new jobs quite seamless, allowing the developer to concentrate on the implementation of their jobs.

I am wondering if anything exists in the Java domain for the distributed execution of jobs that are not easily expressed as MapReduce problems? For example:

  • Jobs that require task co-ordination and synchronization. For example, they may involve sequential execution of tasks yet it is feasible to execute some tasks concurrently:

                   .-- B --.
            .--A --|       |--.
            |      '-- C --'  |
    Start --|                 |-- Done
            |                 |
            '--D -------------'
    
  • CPU-intensive tasks that you'd like to distribute but that don't produce any outputs to reduce, such as image conversion or resizing.

So is there a Java framework/platform that provides such a distributed computing environment? Or is this sort of thing acceptable/achievable using Hadoop - and if so are there any patterns/guidelines for these sorts of jobs?

Accepted answer by teabot

I have since found Spring Batch and Spring Batch Integration, which appear to address many of my requirements. I will let you know how I get on.

Answer by Upgradingdave

Take a look at Quartz. I think it supports stuff like managing jobs remotely and clustering several machines to run jobs.

Answer by Youri

ProActive Scheduler seems to fit your requirements, especially the complex workflows with task coordination that you mentioned. It is open source and Java based. You can use it to run anything: Hadoop jobs, scripts, Java code, ...

Disclaimer: I work for the company behind it.

Answer by Nikita Koksharov

Try the Redisson framework. It provides an easy API to execute and schedule java.util.concurrent.Callable and java.lang.Runnable tasks. See its documentation on the distributed Executor service and Scheduler service.

Answer by Alexey Kalmykov

I guess you are looking for a workflow engine for CPU-intensive tasks (also known as a "scientific workflow", e.g. http://www.extreme.indiana.edu/swf-survey). But I'm not sure how distributed you want it to be. Usually all workflow engines have a "single point of failure".

Answer by Fried Hoeben

I believe quite a few problems can be expressed as map-reduce problems.

For problems that you can't modify to fit that structure, you can look at setting up your own solution using Java's ExecutorService. It will be limited to one JVM and will be quite low level, but it does allow for easy coordination and synchronization.
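
As a rough illustration of the single-JVM approach, here is a minimal sketch of the task graph from the question (A followed by B and C in parallel, with D running alongside, then Done). It assumes an ExecutorService combined with CompletableFuture for the dependencies, which is one possible way to wire this up rather than the only one. The class name `DagSketch`, the method `runGraph`, and the placeholder `task` method are made up for this example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DagSketch {

    // Runs the graph Start -> (A -> (B || C)) || D -> Done inside one JVM
    // and returns the names of the tasks in the order they finished.
    public static List<String> runGraph() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<String> finished = new CopyOnWriteArrayList<>();
        try {
            // A and D have no prerequisites, so both start immediately.
            CompletableFuture<Void> a =
                    CompletableFuture.runAsync(() -> task("A", finished), pool);
            CompletableFuture<Void> d =
                    CompletableFuture.runAsync(() -> task("D", finished), pool);

            // B and C each wait for A, then may run concurrently.
            CompletableFuture<Void> b = a.thenRunAsync(() -> task("B", finished), pool);
            CompletableFuture<Void> c = a.thenRunAsync(() -> task("C", finished), pool);

            // Done fires only after B, C and D have all completed.
            CompletableFuture.allOf(b, c, d)
                    .thenRun(() -> finished.add("Done"))
                    .join();
            return finished;
        } finally {
            pool.shutdown();
        }
    }

    // Hypothetical placeholder for real work (e.g. resizing an image).
    static void task(String name, List<String> finished) {
        finished.add(name);
    }

    public static void main(String[] args) {
        System.out.println(runGraph());
    }
}
```

The exact interleaving of D with A, B and C varies from run to run. Only the ordering constraints are guaranteed: A finishes before B and C, and Done comes last. Nothing here is distributed across machines, which is the limitation noted above.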
