
Signal AND Noise: The Best of All Worlds for Logging

One of the absolute classic pieces of advice that you'll hear when it comes to logging is what I think of as the iconic Goldilocks logging advice.  It goes something like this.

When it comes to logging, you don’t want to miss anything important because logging helps you understand your application’s behavior.  But you also don’t want to log too much.  If you log too much, the log becomes useless.  You want to log just the right amount.

Sage advice, to be sure.  Right?

Or, maybe, when you stop and think about it, this is actually borderline tautological.  It’s sort of like telling people looking at the prospect of a diet, “well, you don’t want to eat too little because you will become malnourished and die, but you also don’t want to eat too much because you’ll become morbidly obese.”  Well, yeah, thanks.  But maybe we could get a little more specific.

Back in the realm of logging, the Goldilocks logging advice suffers not only from vagueness, but from the broader problem that the sweet-spot, "just right" amount of logging will necessarily vary per application, and quite widely at that.  So, I propose to attack the subject from a different angle.

In a recent post, I talked about the idea of signal and noise and the signal-to-noise ratio.  The Goldilocks logging advice steers you away from too little signal on one end and too much noise on the other.  The only thing that can really vary is the metaphorical volume.  But what if we got better at picking the signal out from the noise?  Then we could crank up the volume as loud as we wanted without worrying about lost signal and without trying to walk a very fine, very subjective line.

Before I proceed, however, let me be clear about one point.  The superior solution is to get the amount of logging just right.  Audit your logs, look for useless stuff, and eliminate it, by all means.  But, if we assume that getting this just right is actually pretty hard, the advice that I’ll offer should be helpful.  So let’s start with the assumption that you’re going to err on the side of too much logging, and learn how to make the logs useful anyway.

Obvious Slicing Strategies

There are a couple of pretty standard tactics that almost everyone uses with their log files, but I’d be remiss if I didn’t mention them, at least briefly.  When you go to investigate a log file, one of the most straightforward ways to find the information that you want is to conceive of a way to filter it.  This could be as simple as a “find” in a text editor or loading the thing in Excel and using Excel’s filter functionality.

Toward that end, when recording the log entries, you should be adding information with a mind toward filtering.  And two of the most drop-dead obvious, standard ways to do that are with timestamp information and with logging levels.  These let you record each entry by time and by (more or less) importance, respectively.  The value here is that you can easily find what you're looking for if you know when it happened and/or what level it would likely have been logged at.

12-23-15 08:41:23 [DEBUG] – Fetching customer list from database

12-23-15 08:41:26 [DEBUG] – Customer list contains 12 records

12-23-15 08:41:44 [ERROR] – Current customer not found in list
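As a minimal sketch of that kind of filtering (assuming Python, the timestamp/level layout shown above, and a hypothetical file named app.log; none of these are prescribed here), pulling out every entry at a given level might look like this:

# filter_by_level.py - a sketch, not a prescribed tool; assumes the
# "MM-DD-YY HH:MM:SS [LEVEL] - message" layout shown above and a
# hypothetical log file named app.log
import re

LINE_PATTERN = re.compile(
    r'^(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\w+)\] [–-] (.*)$'
)

def entries_at_level(path, level):
    """Yield (timestamp, message) pairs for entries logged at the given level."""
    with open(path) as log_file:
        for line in log_file:
            match = LINE_PATTERN.match(line.strip())
            if match and match.group(2) == level:
                yield match.group(1), match.group(3)

if __name__ == '__main__':
    for timestamp, message in entries_at_level('app.log', 'ERROR'):
        print(timestamp, message)

The same idea scales down to a plain "find" on [ERROR] in a text editor and up to a spreadsheet filter; the point is that the timestamp and level give you something unambiguous to filter on.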

Less Obvious Slicing Strategies

Carrying that standard pattern forward, you can look for less obvious ways to slice your application.  These may be more dependent on the nature of your application.  Is it heavily layered?  If so, perhaps you want to record the layer at which each message is occurring.

12-23-15 08:41:23 [DEBUG] [Data Access] – Fetching customer list from database

12-23-15 08:41:26 [DEBUG] [Data Access] – Customer list contains 12 records

12-23-15 08:41:44 [ERROR] [Service] – Current customer not found in list

On the other hand, if your application is sliced vertically, by, say, bounded contexts, you may want to record that, wherein the layer designations would be replaced by the contexts.  I could go on with additional items, but this is truly going to depend on the nature of your application.  You will have to figure out how it makes sense to differentiate among the concerns in your application that tend to log.
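As a rough sketch of how the layer (or bounded context) designation might get stamped onto each entry automatically, here is one way to do it with Python's standard logging module; the layer names and the output format here are illustrative, not prescribed:

# layer_logging.py - a sketch of layer-aware logging; the layer names
# ("Data Access", "Service") are purely illustrative
import logging

logging.basicConfig(
    format='%(asctime)s [%(levelname)s] [%(layer)s] - %(message)s',
    datefmt='%m-%d-%y %H:%M:%S',
    level=logging.DEBUG,
)

def layer_logger(layer_name):
    """Return a logger that stamps every entry with the given layer name."""
    return logging.LoggerAdapter(logging.getLogger(__name__), {'layer': layer_name})

data_access_log = layer_logger('Data Access')
service_log = layer_logger('Service')

data_access_log.debug('Fetching customer list from database')
service_log.error('Current customer not found in list')

The nice part of centralizing this in one small helper is that each component asks for a logger once and never has to remember to include its own slice designation in every message.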

Consider a Tagging Scheme

One thing that you’ll notice above is that the more slicing you add, the more unwieldy each entry can get.  This also creates a historical hiccup if you decide at some point to add or remove one of the slicing data points, because the parsing will no longer be consistent.

A means for addressing this is to implement the concept of “tagging” with which you may be familiar from Stack Overflow or an ALM/project management tool.  In short, this is a way to apply the slices we’ve been talking about dynamically.  Stalwarts such as timestamp and log level will probably remain, but consider the following scheme.

[12-23-15 08:41:23] [DEBUG] [Tags: Data Access, Customer, Fetch] – Fetching customer list from database

[12-23-15 08:41:26] [DEBUG] [Tags: Data Access, Customer] – Customer list contains 12 records

[12-23-15 08:41:44] [ERROR] [Tags: Service, Exception, Logout] – Current customer not found in list

Now, rather than having a lot of slices, we’re back to three.  There is the timestamp, the log level, and then a single “tags” slice, within which 0 to N tags may be applied.  While not quite as straightforward, parsing here is still relatively simple.  Look for the 3rd set of brackets, and then gather the tokens after the colon.  For only a bit of complexity overhead, this gives you a whole lot more power for categorizing the entries in your log file.
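For what it's worth, here is a minimal sketch of that parsing logic in Python, assuming the bracketed timestamp/level/tags layout shown above; it is one way to do it, not the only way:

# parse_tags.py - sketch of parsing the "[timestamp] [LEVEL] [Tags: a, b] - message" layout
import re

ENTRY_PATTERN = re.compile(
    r'^\[(?P<timestamp>[^\]]+)\] '
    r'\[(?P<level>\w+)\] '
    r'\[Tags: (?P<tags>[^\]]*)\] [–-] (?P<message>.*)$'
)

def parse_entry(line):
    """Return a dict of timestamp, level, tag list, and message, or None if the line doesn't match."""
    match = ENTRY_PATTERN.match(line.strip())
    if not match:
        return None
    entry = match.groupdict()
    entry['tags'] = [tag.strip() for tag in entry['tags'].split(',') if tag.strip()]
    return entry

sample = '[12-23-15 08:41:44] [ERROR] [Tags: Service, Exception, Logout] – Current customer not found in list'
print(parse_entry(sample))  # timestamp, level, a list of three tags, and the message

With entries parsed into dictionaries like this, "show me every ERROR tagged Logout" becomes a one-line filter rather than an eyeball scan.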

Write Your Entries To Be Parsed

We’ve been building to this section, but it’s worth specifically calling this out.  You should design the entries in the log file specifically with the idea that they need to be parsed and mined for information by algorithms and not just by humans.  The tagging scheme was a preview of that.

By all means, have anecdotal and human-readable information in each entry, but structure the entries in such a way that the data can be parsed.  Let's take a look, for instance, at how we could improve the working example.

[12-23-15 08:41:23] [DEBUG] [Tags: Data Access, Customer, Fetch] – Fetching customer list from database {"listId": 3341}

[12-23-15 08:41:26] [DEBUG] [Tags: Data Access, Customer] – Customer list contains {"recordCount": 12}

[12-23-15 08:41:44] [ERROR] [Tags: Service, Exception, Logout] – Current customer not found in list

Now what you've done is separate the human-readable description from the relevant application variables.  With the previous scheme, looking for all of the customer list retrieval counts would have required some sort of wildcard search or regular expression.  Now you can just parse the JSON at the end of each entry, looking for that variable.
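A sketch of that extraction, again in Python and again assuming the layout above (the variable names listId and recordCount are simply the ones from the example entries), might look like this:

# parse_payload.py - sketch of pulling the trailing JSON payload off an entry
import json
import re

PAYLOAD_PATTERN = re.compile(r'(\{.*\})\s*$')

def payload_of(line):
    """Return the JSON object at the end of the entry, or an empty dict if there isn't one."""
    match = PAYLOAD_PATTERN.search(line)
    return json.loads(match.group(1)) if match else {}

entry = '[12-23-15 08:41:26] [DEBUG] [Tags: Data Access, Customer] – Customer list contains {"recordCount": 12}'
print(payload_of(entry).get('recordCount'))  # 12

Gathering every recordCount over a day of log data is now a simple loop, with no guesswork about where the number sits in the sentence.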

Up Your Parsing Game

And, speaking of parsing, the other side of the log file is just as important.  Up until now, everything we've talked about has been a strategy for making the log file more parse-able.  But the flip side of that is making sure you're actually good at parsing it.  You can look for common utilities, you can roll your own, and you can use a service like the one offered by Logentries to help you sift through all of this information.  But, whatever you do, make sure you're automating this!
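To make "automate it" a little more concrete, here is a small sketch of the sort of summary pass you might run over a log file; the file name app.log is hypothetical, and the layout is the tagged one from the earlier examples:

# summarize_log.py - sketch of an automated pass that counts entries per level and per tag
import re
from collections import Counter

ENTRY_PATTERN = re.compile(r'^\[[^\]]+\] \[(?P<level>\w+)\] \[Tags: (?P<tags>[^\]]*)\]')

level_counts = Counter()
tag_counts = Counter()

with open('app.log') as log_file:  # app.log is a hypothetical file name
    for line in log_file:
        match = ENTRY_PATTERN.match(line)
        if not match:
            continue
        level_counts[match.group('level')] += 1
        tag_counts.update(tag.strip() for tag in match.group('tags').split(',') if tag.strip())

print('By level:', dict(level_counts))
print('By tag:', dict(tag_counts))

Whether you run something like this yourself or lean on a hosted service to do the heavy lifting, the point is the same: the log gets read by a machine first, and by a human only after the machine has pointed at something interesting.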

Best of All Worlds

It’s worth repeating.  If you’re clever enough to hit the Goldilocks zone of logging, then you should absolutely do that.  But that’s hard, and falling a bit short absolutely does not prevent you from extracting tons of value from your logs.  Write your log files to be clear, to be amenable to filtering, and to be machine parse-able, and you’re in good shape.  Take advantage of all of the tools at your disposal to help with the parsing, and you’ve got your cake and can eat it too.


Start capturing and analyzing all of your log data today with a free Logentries account.