Modern systems rely heavily on the network, and networks break. Waiting for an answer that is never going to come is not a wise move. I like this tagline: "Hope is not a design method." Make sure your code doesn't wait around forever for an answer to its request.

Ensure that any resource pool implementation that blocks a thread until a resource is available has a timeout enabled. In Java, always use the forms of the concurrency APIs that take a timeout, never the no-arg ones. Creating reusable code that deals with the sticky issues around thread blocking and timeouts is desirable, not to mention good programming. That way, a particular set of thread interactions is understood and shared throughout the system. Use QueryObject and Gateway to encapsulate database access logic, making it easier to apply Circuit Breaker.

Some code retries immediately after a failure but, generally speaking, that is not a wise thing to do. Networks and servers don't heal quickly, and making a client wait is usually not a good thing. A better tactic is to return a result right away, which might be an error code or an indicator that you've queued up the request for retry at a future time. Making the client wait will likely cause a cascading failure, as its callers have to sit around waiting to get their answer from it. Store-and-Forward is generally a robust response to timeouts, but each application has its own definition of "fast enough", which you need to account for.

Timeouts and Circuit Breakers are a good combination because the Circuit Breaker can trip if timeouts become the norm instead of the exception. Timeouts coupled with Fail Fast are another common combination: Timeout protects you against somebody else's failure, while Fail Fast reports to your callers why you can't complete their request. Timeouts also play a role in Unbounded Results, since it might take too much time to load those million records you accidentally asked for.
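The advice above about resource pools and the timed forms of the Java concurrency APIs can be sketched roughly as follows. This is a minimal illustration, not a production pool: the `BoundedPool` class and its `checkOut`/`checkIn` methods are hypothetical names invented for this example. The only real API it relies on is `BlockingQueue.poll(timeout, unit)`, which gives up after the timeout, where the no-arg `take()` would block indefinitely.

```java
import java.util.Collection;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical fixed-size pool; the class and method names are illustrative. */
class BoundedPool<T> {
    private final BlockingQueue<T> idle;

    BoundedPool(Collection<T> resources) {
        // ArrayBlockingQueue needs a capacity of at least 1.
        this.idle = new ArrayBlockingQueue<>(Math.max(1, resources.size()), false, resources);
    }

    /**
     * Waits at most the given time for a free resource.
     * Returns an empty Optional on timeout instead of blocking forever.
     */
    Optional<T> checkOut(long timeout, TimeUnit unit) {
        try {
            // Timed form: poll(timeout, unit) returns null when time runs out.
            // The untimed take() would park this thread until a resource appears.
            return Optional.ofNullable(idle.poll(timeout, unit));
        } catch (InterruptedException e) {
            // Preserve the interrupt flag and report "nothing available".
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    /** Returns a resource to the pool. */
    void checkIn(T resource) {
        idle.offer(resource);
    }
}
```

A caller that gets an empty `Optional` back can fail fast or report an error to its own caller, rather than tying up a thread while the pool is exhausted.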
- apply Timeout to Integration Points, Blocked Threads, and Slow Response to avert Cascading Failures
- apply Timeout as a way to recover from unexpected failures. Sometimes you can't know the precise cause of the failure but you need to give up and move on.
- consider delayed retries. Immediate retries are likely to fail and end up delaying the layer calling you. Queuing up the work and trying again later is usually a better alternative.
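The delayed-retry idea in the last point might look something like the sketch below. Everything in it is an assumption made for illustration: the `DelayedRetryQueue` class, its `Outcome` enum, and the convention that a unit of work returns `true` on success are all hypothetical. The caller gets an answer immediately, either "done" or "queued for retry", and failed work is only attempted again once its delay has elapsed. Time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BooleanSupplier;

/** Hypothetical store-and-forward helper; all names here are illustrative. */
class DelayedRetryQueue {
    enum Outcome { DONE, QUEUED_FOR_RETRY }

    /** A failed unit of work plus the earliest time it may run again. */
    private static final class Pending {
        final BooleanSupplier work;
        final long notBeforeMillis;

        Pending(BooleanSupplier work, long notBeforeMillis) {
            this.work = work;
            this.notBeforeMillis = notBeforeMillis;
        }
    }

    private final Deque<Pending> pending = new ArrayDeque<>();
    private final long retryDelayMillis;

    DelayedRetryQueue(long retryDelayMillis) {
        this.retryDelayMillis = retryDelayMillis;
    }

    /** Tries once and returns right away; the caller never waits for a retry. */
    synchronized Outcome submit(BooleanSupplier work, long nowMillis) {
        if (attempt(work)) {
            return Outcome.DONE;
        }
        pending.addLast(new Pending(work, nowMillis + retryDelayMillis));
        return Outcome.QUEUED_FOR_RETRY;
    }

    /** Re-runs work whose delay has elapsed; returns how many items completed. */
    synchronized int retryDue(long nowMillis) {
        int completed = 0;
        int toCheck = pending.size();
        for (int i = 0; i < toCheck; i++) {
            Pending p = pending.pollFirst();
            if (p.notBeforeMillis > nowMillis) {
                pending.addLast(p); // not due yet, keep waiting
            } else if (attempt(p.work)) {
                completed++;
            } else {
                // Still failing: push it back with a fresh delay.
                pending.addLast(new Pending(p.work, nowMillis + retryDelayMillis));
            }
        }
        return completed;
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    // Assumes the work applies its own timeout; an exception counts as a failure.
    private static boolean attempt(BooleanSupplier work) {
        try {
            return work.getAsBoolean();
        } catch (RuntimeException e) {
            return false;
        }
    }
}
```

In a running system, something like a `ScheduledExecutorService` task could call `retryDue(System.currentTimeMillis())` periodically. A real implementation would probably also cap the number of attempts and persist the queue, both of which this sketch leaves out.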