Implementation
Use log4j2/logback/name-your-modern-logging-framework
As soon as you can, move to a newer logging framework. Modern API/frameworks have higher performance, have a better API and more useful features.I have settled for log4j2 although I use it via slf4j - for historic reasons. But others may do well too (namely logback!).
Use logOnce technique
When a LOG statement is in a loop and (a) that loop is infinite or (b) has the potential to be running many times, there's a risk to spam the log messages with the same entry on and on. This could make the file grow indefinitely making it hard to read/analyse; it may also roll over causing loss of important logged information. The logOnce technique prevents these nasty side effect and it works as following:- define a special logger L that logs a log entry.
- log a message within a loop using the logger L, remember the last logged message
- next time you need to log on the same logger, the new message with the last logged message.
- if matches skip the statement log; if not go back to step 2.
- match new log statements using regex to allow for differences in some dynamic data
- log every N entry: even if the new line matches the last logged line, log every N occurrences
- record some metrics - like number of occurrences logged/skipped etc
- scheduled threads like Heartbeats
- Log statements occurring in a for/while loop or recursive functions. especially when logs are on high log levels like ERROR or WARN
Other implementation details
- Logging statements must be side effect free. Besides appending the statement to the configured appender, a LOG statement MUSTN'T change the internal application state.
- Make sure that a log actually happens: either log BEFORE the action Or try { action } finally { LOG after } to make sure the LOG after always happens.
- all nodes share the same wall clock - use NTP to sync
- all applications share the same configuration; or at least the same message pattern layout.
Log levels
Logs *are* a user interface. And, as such, they deserve the right attention to details... (Yeah, right!)... At a basic level I personally group INFO and above as forming Administration/Support user interfaces to the system runtime logs.- ERROR (aliases severe/fatal): for messages that need to be looked at immediately because a transaction or part of the system is in an un-recoverable state (albeit temporary).
- WARN : for messages that need to be looked at immediately because a transaction or part of the system is went wrong but managed to recover.
- INFO: descriptive messages about a meaningful system transaction happened. Or for long transactions start/stop.
- DEBUG (alias fine): follow system transactions' execution with the intention of reporting the runtime flow.
- TRACE (alias finer/finest): uber-details - to report application state or transmitted data, etc.
Markers
Experiment with Markers to provide a better partition of logs depending on application specific contexts.Design
For transactions spawning across several threads in the same process OR different processes in the same host OR different apps in different hosts it's important that- Each log entry, in each file reports some ID to trace back the transaction
- For example an operation starting from a browser client and going via a web services layer, consuming various microservices and/or middleware components should ideally report a single transaction id in all systems.
- Speeds up diagnostics because it won't require trawling multiple files, dispersed in various locations
- If you're logging latency of code blocks or entry/exit of code blocks, don't clutter the block, rather create a wrapper and use the wrapper (or use AOP)
- traffic data should go on a file,
- application logs on another,
- performance and metrics on another.
Antipattens
Antipattern 1 - Have to log caller class and line numberIf you find the need to log the caller's file name and line number, something's wrong with your log OR with your pattern layout. Logging caller file and line number is very inefficient. Antipattern 2 - Need to write a custom log file parser
If you find the need to write a parser to parse the logs, somethings wrong with your logs or with the way you use logs. Possible causes include:
- You're logging too much and - for a given transaction - you cant work out what's going on
- Your logs are too complex to read (Log statement don't have a constant and predictable format or are split in multiple lines - which will make writing the parser hard anyway)
- You are using log files where you should instead use a proper "inter process communication" mechanism, or a proper auditing mechanism, etc
Your logs suck. The system performances will be degraded and you'll spend your days staring at your screen trying to figure out what's going on... the classic needle in the haystack.
No comments:
Post a Comment