Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and keep the same license. Original question: http://stackoverflow.com/questions/27066033/

Hystrix: Custom circuit breaker and recovery logic

Tags: java, fault-tolerance, hystrix

Asked by smeeb

I just read the Hystrix guide and am trying to wrap my head around how the default circuit breaker and recovery period operate, and then how to customize their behavior.

Obviously, if the circuit is tripped, Hystrix will automatically call the command's getFallback() method; this much I understand. But what criteria cause the circuit to trip in the first place? Ideally, I'd like to try hitting a backing service several times (say, a max of 3 attempts) before we consider the service to be offline/unhealthy and trip the circuit breaker. How could I implement this, and where?

But I imagine that if I override the default circuit breaker, I must also override whatever mechanism handles the default recovery period. If a backing service goes down, it could be for any one of several reasons:

  • There is a network outage between the client and server
  • The service was deployed with a bug that makes it incapable of returning valid responses to the client
  • The client was deployed with a bug that makes it incapable of sending valid requests to the server
  • Some weird, momentary service hiccup (perhaps the service is doing a major garbage collection, etc.)
  • etc.

In most of these cases, it is not sufficient to have a recovery period that merely waits N seconds and then tries again. If the service has a bug in it, or if someone pulled some network cables in the data center, we will always get failures from this service. Only in a small number of cases will the client-service connection automagically heal itself without any human interaction.

So I guess my next question is partially "How do I customize the default recovery period strategy?", but I guess it is mainly: "How do I use Hystrix to notify devops when a service is down and requires manual intervention?"

Answered by ahus1

There are basically four reasons for Hystrix to call the fallback method: an exception, a timeout, too many parallel requests, or too many exceptions in the previous calls.
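
The last of these reasons ("too many exceptions in the previous calls") is the circuit breaker itself. As far as the Hystrix configuration documentation describes it, the circuit trips once a rolling window has seen at least circuitBreaker.requestVolumeThreshold requests (default 20) and the error percentage has reached circuitBreaker.errorThresholdPercentage (default 50); after circuitBreaker.sleepWindowInMilliseconds (default 5000) a single trial request is let through. The following is only a simplified model of that rule in plain Java, not Hystrix's implementation: the class name TripRuleSketch is made up for illustration, and the counters never roll over as Hystrix's time window does.

```java
/** Simplified model of a circuit breaker trip rule; NOT Hystrix's actual code. */
final class TripRuleSketch {
    private final int requestVolumeThreshold;   // minimum calls before tripping is considered
    private final int errorThresholdPercentage; // error percentage at which the circuit opens
    private final long sleepWindowMillis;       // how long to reject calls before a trial call

    private int total;
    private int failures;
    private boolean open;
    private long openedAtMillis;

    TripRuleSketch(int requestVolumeThreshold, int errorThresholdPercentage, long sleepWindowMillis) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    /** Records one call outcome and trips the circuit when both thresholds are met. */
    void record(boolean success, long nowMillis) {
        if (open) {
            // This is the outcome of the trial call made after the sleep window.
            if (success) {
                open = false;
                total = 0;
                failures = 0;
            } else {
                openedAtMillis = nowMillis; // stay open and restart the sleep window
            }
            return;
        }
        total++;
        if (!success) {
            failures++;
        }
        boolean enoughVolume = total >= requestVolumeThreshold;
        boolean tooManyErrors = failures * 100 >= errorThresholdPercentage * total;
        if (enoughVolume && tooManyErrors) {
            open = true;
            openedAtMillis = nowMillis;
        }
    }

    /** A call may proceed if the circuit is closed or the sleep window has elapsed. */
    boolean allowRequest(long nowMillis) {
        return !open || nowMillis - openedAtMillis >= sleepWindowMillis;
    }

    boolean isOpen() {
        return open;
    }
}
```

Under this model, "three failed attempts" alone would not trip a breaker configured with the defaults, because the request volume threshold is not reached. A per-call retry limit like that fits better inside the command than in the breaker settings.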

You might want to do a retry in your run() method if the return code or the exception you receive from your service indicates that a retry makes sense.
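
A bounded retry of this kind could be sketched as a small helper that run() delegates to. This is plain Java with no Hystrix types; the names RetrySketch and callWithRetry are hypothetical, and the predicate that decides which exceptions are worth retrying is an assumption you would adapt to your service.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Hypothetical helper: bounded retry that a command's run() method could delegate to. */
final class RetrySketch {
    private RetrySketch() {
    }

    /**
     * Calls the action up to maxAttempts times. A failed attempt is retried only
     * if the predicate considers the exception retryable; otherwise, or once the
     * attempts are used up, the last exception is rethrown.
     */
    static <T> T callWithRetry(Callable<T> action, int maxAttempts,
                               Predicate<Exception> retryable) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (!retryable.test(e)) {
                    break; // a retry makes no sense for this kind of failure
                }
            }
        }
        throw last;
    }
}
```

Inside run() this might be used as callWithRetry(() -> client.fetch(), 3, e -> e instanceof java.io.IOException), where client.fetch() stands in for your own service call. Keep in mind that the command's execution timeout covers the whole run() call, so all attempts together have to fit within it.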

In the command's fallback method you might retry after a timeout. When there were too many parallel requests or too many exceptions, it usually makes no sense to call the same service again.

You also asked how to notify devops: you should connect a monitoring system to Hystrix that polls the status of the circuit breaker and the ratio of successful to unsuccessful calls. You can use the metrics publishers provided, JMX, or write your own adapter using Hystrix's API. I've written two adapters for Riemann and Zabbix in a tutorial I prepared; you'll need very few lines of code for that.
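
The polling idea can be sketched without any monitoring product. The class below is a hypothetical stand-in, not one of the adapters from the tutorial: it takes a BooleanSupplier that reports whether the circuit is open (in a real setup this might wrap HystrixCircuitBreaker.isOpen()) and a Consumer that stands in for whatever alerting channel devops uses, and it notifies only when the state changes.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical poller: reports circuit state changes to a notification callback. */
final class CircuitWatcherSketch {
    private final BooleanSupplier circuitOpen; // assumed source of the breaker state
    private final Consumer<String> notifier;   // assumed alerting channel (mail, chat, pager)
    private boolean lastOpen;                  // state seen at the previous poll

    CircuitWatcherSketch(BooleanSupplier circuitOpen, Consumer<String> notifier) {
        this.circuitOpen = circuitOpen;
        this.notifier = notifier;
    }

    /** Polls once and notifies only if the state changed since the previous poll. */
    void poll() {
        boolean nowOpen = circuitOpen.getAsBoolean();
        if (nowOpen != lastOpen) {
            notifier.accept(nowOpen ? "circuit OPEN" : "circuit closed again");
            lastOpen = nowOpen;
        }
    }
}
```

Calling poll() periodically, for example from a ScheduledExecutorService, gives one alert when the circuit opens and one when it closes, rather than one per failed request.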

The tutorial also has an example application and a load driver to try out some scenarios.

Br, Alexander.
