Every time a human touches a server, there is an opportunity for a mistake. Keep people off the production system by making it run without human intervention. A primary reason people log into a system is to purge unwanted resources, such as log files or history tables, so automate the purge process. For every mechanism that accumulates a resource, some other mechanism must recycle that resource. Common types of sludge that can build up:
- Old Data - cleaning obsolete data out of database tables is a good thing, but it requires careful attention to make sure that integrity is maintained.
- Log Files - can fill up disks and are mostly useless. Don't leave log files on production; copy them somewhere else for analysis. Use a RollingFile appender and rotate the logs by size. Find a way to purge logs, or they are sure to be the cause of a support call.
- In-Memory Caching - make sure to use some form of cache invalidation. A cache that only ever grows behaves like a memory leak, and memory leaks eventually lead to crashes.
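As a rough illustration of the caching point, here is a minimal Python sketch of a cache that both invalidates stale entries and cannot grow without bound. The class name `BoundedCache` and its parameters are made up for this example. It caps the number of entries rather than bytes of RAM, which is only a proxy for memory use and assumes the cached values are of roughly similar size.

```python
import time
from collections import OrderedDict


class BoundedCache:
    """LRU cache with a hard entry cap and a per-entry time-to-live.

    Hypothetical example class: the entry cap stands in for a RAM cap.
    """

    def __init__(self, max_entries=1000, ttl_seconds=300.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised in tests
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Stale entry: invalidate it instead of serving it.
            del self._items[key]
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        # Evict least recently used entries until we are back under the cap.
        while len(self._items) > self._max_entries:
            self._items.popitem(last=False)

    def __len__(self):
        return len(self._items)
```

The point of the design is that `put` is the only place the cache grows and it is also the place that enforces the cap, so no separate cleanup job (and no human) is needed to keep it in check.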
Tips:
- avoid fiddling - human intervention leads to problems so eliminate the need for recurring human intervention.
- purge data with application logic - letting a DBA write your purge scripts puts the app at risk, because the DBA doesn't know your ORM tool or your application logic. It is usually better to do it yourself.
- limit caching - cap the amount of RAM a cache can consume.
- roll the logs - keep a limited number of logs and rotate them based on size. Any logs that need to be retained should be copied off the server.
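The "roll the logs" tip can be sketched in Python with the standard library's `RotatingFileHandler`, which plays the same role as the RollingFile appender mentioned above. This is a swapped-in analog, not the appender itself. The function name `build_rotating_logger`, the logger name, and the default sizes are assumptions chosen for illustration.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_rotating_logger(path, max_bytes=10 * 1024 * 1024, backup_count=5,
                          name="app.rotating"):
    """Return a logger whose file is rotated by size.

    Hypothetical helper: disk use is bounded at roughly
    (backup_count + 1) * max_bytes, because the oldest backup is
    deleted on each rollover.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep records out of the root logger's handlers
    return logger
```

With the defaults above, the log directory tops out at about 60 MB no matter how long the process runs. Anything that has to be kept longer still needs a separate job that copies the rotated files off the server before they are overwritten.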