New Automated Log Parsing

The Logentries product is always improving and advancing. There are some exciting new features available today, and another great feature coming this month!

Included in today’s announcement: 

  1. Automated Log Parsing
  2. Nested JSON Support

Automated Log Parsing

If you’ve ever sent data to Logentries, you may have noticed keys being automatically highlighted and made available for grouping in the Querybuilder dropdown menus.

In February, we added automatic parsing for new log formats: Combined/Common Log Format and Syslog Tags. This makes it easy to search and analyze data using keys.

Apache and Nginx logs

We’ll now parse implied keys out of your Nginx and Apache logs.

So if you send us some data like this:

192.0.2.1 - - [07/Mar/2004:16:43:54 -0800] "GET /unencrypted_password_list HTTP/1.1" 418 9001 "http://passwords.hackz0r" "Mozilla/4.08 [en] (Win95)"

We know that the format of Apache access logs is:

*addr* - *user* *timestamp* "*method* *path* *version*" *status* *bytes* *referer* *agent*        

And you’ll be able to use those implied keys immediately in groupby queries and calculations. So from the example above:

Implied Key    Value
addr           192.0.2.1
agent          "Mozilla/4.08 [en] (Win95)"
bytes          9001
method         GET
path           /unencrypted_password_list
referer        "http://passwords.hackz0r"
status         418 I’m a teapot
timestamp      07/Mar/2004:16:43:54 -0800
user
version        HTTP/1.1
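
If you're curious how implied keys like these can be pulled out of an access log line, here is a minimal Python sketch. The regular expression and the parse_access_line helper are our own illustration, not the parser Logentries runs; the group names simply mirror the implied keys above.

```python
import re

# Hypothetical pattern for the Combined Log Format. The group names mirror
# the implied keys in the post; this is an illustration, not Logentries' parser.
COMBINED = re.compile(
    r'(?P<addr>\S+) \S+ (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_access_line(line):
    """Return a dict of implied keys, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

line = ('192.0.2.1 - - [07/Mar/2004:16:43:54 -0800] '
        '"GET /unencrypted_password_list HTTP/1.1" 418 9001 '
        '"http://passwords.hackz0r" "Mozilla/4.08 [en] (Win95)"')

fields = parse_access_line(line)
# The raw log uses "-" for a missing user, so fields["user"] is "-" here.
print(fields["addr"], fields["method"], fields["path"], fields["status"])
```

Every value comes back as a string, so a field such as bytes would need converting to a number before you do any arithmetic on it.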

Using this data, you can quickly see which URLs are hit most often with groupby(path) calculate(count) sort(desc).

You can see the average number of bytes sent with calculate(average:bytes).

And you can see which addresses hit you most often with calculate(count:addr) sort(desc).
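
To make it concrete what these three queries compute, here is a rough Python equivalent over a handful of made-up entries. The sample records and variable names are assumptions for illustration only; Logentries runs the real queries for you on the server side.

```python
from collections import Counter
from statistics import mean

# Made-up records standing in for already-parsed access-log entries.
entries = [
    {"addr": "192.0.2.1", "path": "/index.html", "bytes": 1200},
    {"addr": "192.0.2.1", "path": "/login", "bytes": 300},
    {"addr": "198.51.100.7", "path": "/index.html", "bytes": 1500},
]

# Roughly groupby(path) calculate(count) sort(desc): hits per URL, busiest first.
hits_by_path = Counter(e["path"] for e in entries).most_common()

# Roughly calculate(average:bytes): mean response size across all entries.
average_bytes = mean(e["bytes"] for e in entries)

# Hits per client address, most frequent first.
hits_by_addr = Counter(e["addr"] for e in entries).most_common()

print(hits_by_path)
print(average_bytes)
print(hits_by_addr)
```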

You can see this new feature in action with an Apache Access log by checking out this video, here.

Syslog Tags

We now parse the implied keys from Syslog Tags!

So if you send us some data like this:

<165>1 Feb 22 17:16:34 test-VirtualBox kernel: [292] Accidentally deleted folder=system32

We know that the format of syslog is:

*pri* *version* *timestamp* *hostname* *appname* *procid*

And you’ll be able to use those implied keys immediately in groupby queries and calculations. So from the example above:

Implied Key    Value
appname        kernel
hostname       test-VirtualBox
pri            165
procid         292
timestamp      Feb 22 17:16:34
version        1
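
Here is a small Python sketch of how a line shaped like this one could be split into the implied keys. It assumes the priority arrives in angle brackets directly before the version (the usual syslog framing) and that the process id appears in square brackets after the tag, as in the example. The pattern and names are hypothetical, not Logentries' implementation.

```python
import re

# Hypothetical pattern for the example line's shape; real syslog varies a lot.
SYSLOG = re.compile(
    r'<(?P<pri>\d{1,3})>(?P<version>\d+) '
    r'(?P<timestamp>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) '
    r'(?P<hostname>\S+) (?P<appname>[^:\s]+): '
    r'\[(?P<procid>\d+)\] (?P<message>.*)'
)

line = ('<165>1 Feb 22 17:16:34 test-VirtualBox kernel: '
        '[292] Accidentally deleted folder=system32')

match = SYSLOG.match(line)
implied = match.groupdict() if match else {}

# Ordinary key=value pairs in the message body remain available as normal keys.
normal_keys = dict(re.findall(r'(\w+)=(\S+)', implied.get("message", "")))

print(implied["appname"], implied["hostname"], implied["pri"])
print(normal_keys)
```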

And you’ll still be able to query against folder as a normal key.

This means you can quickly narrow down your syslog log entries by appname with a quick appname=kernel query.

Or you could see how your log events break down by priority with groupby(pri) calculate(count).
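
As a side note on reading pri values: in the standard syslog encoding, the priority number is the facility multiplied by 8 plus the severity. A quick Python sketch shows how the 165 from the example breaks down. The decode_pri helper is our own name, and whether Logentries exposes facility and severity as separate keys is not something this post covers.

```python
# Severity names in standard syslog order (0 = most severe).
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]

def decode_pri(pri):
    """Split a syslog priority into (facility, severity) numbers."""
    facility, severity = divmod(pri, 8)
    return facility, severity

facility, severity = decode_pri(165)
# Facility 20 is local4 in the standard numbering; severity 5 is "notice".
print(facility, severity, SEVERITIES[severity])
```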

These updates make it easier to search, group and analyze semi-structured data or complex data objects without needing to use Regular Expressions!


Nested JSON

This month we will start parsing nested JSON objects in full, which will let you query them using dot notation (parent.child querying).

You’ll then be able to use the JSON hierarchy in queries and alerts.

Say you log your music player stats as JSON:

{
    "volume": "blaring",
    "current" : {
        "band": "rednex",
        "song": "Cotton Eye Joe"
    },
    "next" : {
        "band": "The Dubliners",
        "song": "The Sick Note"
    }
}

You can list only the loud songs with where(volume="blaring"), but you’ll also be able to find the currently playing song with current.song, and the next band with next.band!

You could even set up an alert if someone tries to stick some Nickelback on with a where("next.band"="nickelback") → fire email.
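
If dot notation is new to you, this short Python sketch shows the idea: each dot-separated part of the key walks one level deeper into the JSON object. The lookup helper is hypothetical and only illustrates the concept; it is not how Logentries resolves keys internally.

```python
import json

def lookup(event, dotted_key, default=None):
    """Follow a dotted key such as "current.song" through nested dicts."""
    node = event
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default  # key path does not exist in this event
        node = node[part]
    return node

stats = json.loads("""
{
    "volume": "blaring",
    "current": {"band": "rednex", "song": "Cotton Eye Joe"},
    "next": {"band": "The Dubliners", "song": "The Sick Note"}
}
""")

print(lookup(stats, "current.song"))
print(lookup(stats, "next.band"))

# The alert condition from above, expressed as a plain comparison.
should_alert = lookup(stats, "next.band") == "nickelback"
```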


Want to try these new features out for yourself? Create a free Logentries account today!
