Sunday, April 25, 2010

Release It! - Chapter 10.2 Use Caching Carefully

Caching can be a powerful response to a performance problem. It can reduce the load on the database server and cut response times to a fraction of what they would be without caching. When misused, however, caching can create new problems. The maximum memory usage of all application-level caches should be configurable. No matter what memory size you set on the cache, you need to monitor hit rates for the cached items to see whether most items are being used from cache. If hit rates are very low, then the cache is not buying any performance gains and might actually be slower than not using the cache. It’s also wise to avoid caching things that are cheap to generate.

In Java, caches should be built using SoftReference objects to hold the cached item itself. In extreme cases, it might be necessary to move to a multilevel caching approach. In this approach, you keep the most frequently accessed data in memory but use disk storage for a secondary cache. Precomputing results can reduce or eliminate the need for caching.

Finally, any cache presents a risk of stale data. Every cache should have an invalidation strategy to remove items from cache when their source data changes. The strategy you choose can have a major impact on your system’s capacity.
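To make the SoftReference and hit-rate advice concrete, here is a minimal sketch of my own, not code from the book. The class name MonitoredSoftCache and its methods are hypothetical, and it assumes the loader never returns null.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical illustration: a cache that holds values through SoftReference
// objects and counts hits and misses so the hit rate can be monitored.
class MonitoredSoftCache<K, V> {
    private final ConcurrentHashMap<K, SoftReference<V>> entries = new ConcurrentHashMap<>();
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    // Returns the cached value, or calls the loader and caches the result on a miss.
    // A value reclaimed by the garbage collector also counts as a miss.
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value != null) {
            hits.incrementAndGet();
            return value;
        }
        misses.incrementAndGet();
        value = loader.apply(key);
        entries.put(key, new SoftReference<>(value));
        return value;
    }

    // Invalidation hook: call this when the source data for the key changes.
    void invalidate(K key) {
        entries.remove(key);
    }

    // Fraction of lookups served from the cache; 0.0 before any lookup.
    double hitRate() {
        long h = hits.get();
        long total = h + misses.get();
        return total == 0 ? 0.0 : (double) h / total;
    }
}
```

Because the values are only softly reachable, the JVM may reclaim them under memory pressure. If hitRate() stays low in production, that is the signal the chapter describes: the cache is probably not earning its keep.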
  • Limit cache sizes - Unbounded caches consume memory that is better spent handling requests. Holding every object you’ve ever loaded in memory doesn’t do the users any good.
  • Build a flush mechanism - Whether it’s based on the clock, the calendar, or an event on the network, every cache needs to be flushed sooner or later. A cache flush can be expensive, though, so consider limiting how often a cache flush can be triggered, or you just might end up with attacks of self-denial.
  • Don’t cache trivial objects - Not every domain object and HTML fragment is worth caching. Seldom-used, tiny, or inexpensive objects aren’t worth caching: the cost of bookkeeping and reduced free memory outweighs the performance gain.
  • Compare access and change frequency - Don’t cache things that are likely to change before they get used again.
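The first two bullets can be sketched in a few lines of Java. This is only an illustration under my own assumptions: the BoundedCache name, the least-recently-used eviction policy, and the caller-supplied clock value are not from the book.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a size-limited LRU cache whose flush can be
// triggered at most once per configured interval.
class BoundedCache<K, V> {
    private final Map<K, V> entries;
    private final long minFlushIntervalMillis;
    private long lastFlushMillis;
    private boolean flushedBefore = false;

    BoundedCache(final int maxEntries, long minFlushIntervalMillis) {
        this.minFlushIntervalMillis = minFlushIntervalMillis;
        // accessOrder=true keeps the least recently used entry first in line
        // for eviction once the size limit is exceeded.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized V get(K key) { return entries.get(key); }

    synchronized void put(K key, V value) { entries.put(key, value); }

    synchronized int size() { return entries.size(); }

    // Clears the cache unless a flush already ran within the minimum interval.
    // Returns true if the flush actually happened.
    synchronized boolean flush(long nowMillis) {
        if (flushedBefore && nowMillis - lastFlushMillis < minFlushIntervalMillis) {
            return false;
        }
        entries.clear();
        lastFlushMillis = nowMillis;
        flushedBefore = true;
        return true;
    }
}
```

A caller would pass System.currentTimeMillis() to flush(). Rejecting a flush trades freshness for stability, since stale entries survive until the next allowed flush, so the interval should be short enough for the data involved.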
The advice to monitor cache hits is excellent. I've worked on systems where a developer swore that caching XYZ would save so much time but could never prove it, because the homegrown caching solution didn't provide visibility into the cache. I think an established caching solution, such as Ehcache, is a good place to start, since such libraries often provide the features the book suggests.
