Thursday, April 15, 2010

Release It! - Chapter 5.4 Steady State

Every time a human touches a server, there is an opportunity for a mistake, so keep people off the production system and strive to make it run without human intervention.  A primary reason people log into a system is to purge accumulated resources, such as log files or history tables, so automate the purge process.  For every mechanism that accumulates a resource, some other mechanism must recycle that resource. Common types of sludge that can build up:
  • Old Data - cleaning obsolete data out of database tables is a good thing, but it requires careful attention to make sure that integrity is maintained.
  • Log Files - can fill up disks and are mostly useless once they pile up.  Don't leave log files on production servers; copy them somewhere else for analysis.  Use a RollingFile appender and rotate the logs by size.  Find a way to purge logs or they are sure to be the cause of a support call.
  • In-Memory Caching - make sure to use some form of cache invalidation. A cache that grows without bound behaves like a memory leak, and memory leaks lead to crashes.
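As a rough illustration of the caching point, here is a minimal sketch of a cache that both invalidates stale entries and refuses to grow without bound. It is written in Python rather than taken from the book, and the class name `BoundedCache`, its parameters, and the default limits are all assumptions made for the example. It caps the number of entries as a stand-in for capping RAM, which only holds if the cached values are of roughly similar size.

```python
import time
from collections import OrderedDict


class BoundedCache:
    """LRU cache with a hard entry cap and a time-to-live per entry.

    Hypothetical helper for illustration; not from the book.
    """

    def __init__(self, max_entries=1000, ttl_seconds=300.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Invalidate stale entries on read instead of serving them.
            del self._items[key]
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        # Evict least recently used entries once the cap is exceeded.
        while len(self._items) > self._max_entries:
            self._items.popitem(last=False)

    def __len__(self):
        return len(self._items)
```

The point of the design is that the cache recycles its own resource: every `put` that would exceed the cap pushes something else out, so nobody has to log in and restart the process to reclaim memory.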
Tips:

  • avoid fiddling - human intervention leads to problems, so eliminate the need for recurring human intervention.
  • purge data with application logic - letting a DBA write your purge scripts puts the app at risk because the DBA may not know your ORM tool or your application logic.  It is usually better to do it yourself.
  • limit caching - cap the amount of RAM a cache can consume.
  • roll the logs - keep a limited number of logs and rotate them based on size.  Any logs that need to be retained should be copied off the server.
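To make the "roll the logs" tip concrete, here is a small sketch using Python's standard `logging.handlers.RotatingFileHandler`, which plays roughly the same role as the RollingFile appender mentioned above. The function name `build_rolling_logger`, the logger name, and the default sizes are assumptions chosen for illustration, not recommendations from the book.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_rolling_logger(path, max_bytes=10 * 1024 * 1024, backup_count=5,
                         name="app.steady_state"):
    """Return a logger that rotates `path` by size and keeps a fixed number of backups.

    Hypothetical helper: `name` and the default limits are illustrative only.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger's handlers
    # Once the file would reach max_bytes it is renamed to path.1, path.2, ...
    # and the oldest backup beyond backup_count is deleted.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With these settings the disk space used by logs is bounded at roughly `max_bytes * (backup_count + 1)`, no matter how long the process runs. Anything that has to be retained longer still needs to be copied off the server by a separate job.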
