- don't do it if it hurts - when an Integration Point becomes problematic, stop calling it.
- use together with Timeouts - using a Timeout helps to identify a problem with an Integration Point. That information can be used to trigger a Circuit Breaker.
- expose, track and report state changes - tripping a Circuit Breaker always indicates a serious problem and should be visible to operations. Circuit Breaker activity should be reported, trended and correlated.
Tuesday, April 13, 2010
Release It! - Chapter 5.2 Circuit Breaker
A software Circuit Breaker acts like an electrical circuit breaker: if the system the Circuit Breaker bridges gets sick, the Circuit Breaker opens and interaction with the sick system is prohibited. As the Circuit Breaker is used, it keeps track of failed calls to the bridged system. Once a failure threshold is reached, the Circuit Breaker opens and access to the faulty system is prevented. Software Circuit Breakers, unlike electrical ones, can be configured to retry a call to the sick system to see if it has recovered. If it has, the Circuit Breaker closes and traffic flows normally. If not, the Circuit Breaker remains open until it is time to attempt another check on the sick system.

It is a good idea to throw an exception from the Circuit Breaker that tells the caller the failure is due to the breaker being tripped, which gives the caller the opportunity to apply different logic in that scenario. A tripped Circuit Breaker will degrade your system, so it is important to discuss ahead of time what should be done when that happens.

Operations will surely want to know when a breaker is tripped, so make sure that the state of the breaker is logged and that the log is targeted at them. You should probably also provide a way to query or monitor the breaker's state. Keeping track of tripped breakers is a good way to monitor how an Integration Point changes over time, and you have some ammunition with a vendor if you can cite specific data points. It is also useful to provide a manual way to trip or reset a Circuit Breaker.

Circuit Breakers are a guard against Integration Points, Cascading Failures, Unbalanced Capacities and Slow Responses.
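To make the behavior above concrete, here is a minimal sketch in Python. It is not code from the book: the names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, `retry_after` and `on_state_change` are my own, and I am assuming that "failure threshold" means a count of consecutive failures and that a single trial call is allowed once the retry interval has passed. It is also not thread-safe, which a real implementation would need to be.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half-open"

    def __init__(self, failure_threshold=3, retry_after=30.0,
                 clock=time.monotonic, on_state_change=None):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.retry_after = retry_after              # seconds to wait before a trial call
        self._clock = clock                         # injectable so tests can fake time
        self._on_state_change = on_state_change     # hook for logging / monitoring
        self._failures = 0
        self._opened_at = None
        self.state = self.CLOSED

    def _set_state(self, new_state):
        # Report every transition so operations can see and trend it.
        if new_state != self.state:
            old, self.state = self.state, new_state
            if self._on_state_change:
                self._on_state_change(old, new_state)

    def trip(self):
        """Open the breaker; also usable as a manual override."""
        self._opened_at = self._clock()
        self._set_state(self.OPEN)

    def reset(self):
        """Close the breaker; also usable as a manual override."""
        self._failures = 0
        self._opened_at = None
        self._set_state(self.CLOSED)

    def call(self, func, *args, **kwargs):
        if self.state == self.OPEN:
            if self._clock() - self._opened_at < self.retry_after:
                # Fail fast with a distinct exception so the caller can
                # tell "breaker open" apart from a real remote failure.
                raise CircuitOpenError("circuit open; call not attempted")
            self._set_state(self.HALF_OPEN)  # time to check on the sick system
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if (self.state == self.HALF_OPEN
                    or self._failures >= self.failure_threshold):
                self.trip()
            raise
        self.reset()  # a success closes the breaker and clears the count
        return result
```

A caller would wrap each call to the Integration Point, for example `breaker.call(fetch_quote, symbol)` where `fetch_quote` stands in for whatever remote call you are protecting, and catch `CircuitOpenError` to fall back to degraded behavior. Passing a function as `on_state_change` is one way to get the state changes into a log that operations watches.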