Catching Inactivity Before It Catches You

What does not occur is just as important as what does. Unfortunately, because IT Operations is usually one step behind the activity of production, we are often forced to consider only what happened, and we neglect or miss what didn’t. A missed script run or system update has a ripple effect and is hard to catch. This is why inactivity monitoring is so important.

Typical event monitoring only has value when a log entry is created, because only then is there something to analyze. However, the absence of a log entry can also tell you something incredibly valuable for maintaining ongoing availability and performance.

For example, you have a cron job that runs a regular update of the tax rates that apply to your clients, but for some reason a computer is rebooted and the scheduled process doesn’t run. This isn’t actually an error, so you will not see an error in the log, because nothing had a chance to happen. And although it’s not an error, you do want to know about the lack of activity! Another example is waiting for confirmation that an order payment was processed correctly or incorrectly. Usually within a few minutes you get a Fail or OK notification, but what happens if that verification message just never comes? This is a scenario you must be ready to manage, because you have to be prepared not only for handling errors; you also have to be prepared for when… nothing happens, or important activity stops.
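The payment scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `overdue_confirmations`, the order IDs, and the five-minute timeout are all hypothetical, and a real system would read submissions and confirmations from its own data store.

```python
from datetime import datetime, timedelta


def overdue_confirmations(submitted, confirmed, now, timeout=timedelta(minutes=5)):
    """Return IDs of orders that were submitted more than `timeout` ago
    and have received neither a Fail nor an OK notification.

    submitted: dict mapping order ID -> datetime the payment was sent
    confirmed: dict mapping order ID -> "OK" or "Fail"
    """
    return sorted(
        order_id
        for order_id, sent_at in submitted.items()
        if order_id not in confirmed and now - sent_at > timeout
    )


# Hypothetical sample data, for illustration only.
now = datetime(2024, 1, 1, 12, 0)
submitted = {
    "A100": datetime(2024, 1, 1, 11, 50),  # confirmed below
    "A101": datetime(2024, 1, 1, 11, 52),  # 8 minutes old, still silent
    "A102": datetime(2024, 1, 1, 11, 58),  # only 2 minutes old
}
confirmed = {"A100": "OK"}

print(overdue_confirmations(submitted, confirmed, now))  # ['A101']
```

The point of the sketch is that silence is treated as a result in its own right: an order with no reply after the deadline is reported just as a failed one would be.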

These are cases that affect business performance, with possible financial impact and, in some cases, more serious results. Consider sensor networks: logged sensor data can report near real-time readings of earthquake locations and magnitudes, and wave heights measured by buoys and sea-floor sensors can be used to forecast tsunamis. These systems improve warnings in close-in regions, where people may only have minutes to react. However, sensor networks and their delivery systems can break down due to link or node failures, intrusion attacks, or issues with the measuring instruments. Missing data can cause inaccurate predictions or problems in the continuous event-handling process.

Real-time Inactivity Alerting

This is where functionality like Logentries Inactivity Monitoring can help. While most DevOps teams are alerted when events happen, inactivity alerting offers a way to know in real time when a service stops logging events of a specific type, which is a probable symptom of impending issues. You can set the period after which a pattern, an entire log, or a group of logs is considered inactive, and alerting can be used to notify you on your mobile device. The level of expertise required is low: the inactivity alert has a simple, intuitive interface that lets you navigate the procedure with ease and get things done with minimal effort and time.
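To make the idea of a pattern going inactive over a period concrete, here is a minimal Python sketch. It is not the Logentries API or implementation; the function `is_inactive`, the log messages, and the 24-hour window are assumptions made up for illustration.

```python
import re
from datetime import datetime, timedelta


def is_inactive(entries, pattern, now, window):
    """Return True if no entry matching `pattern` was logged
    in the `window` leading up to `now`.

    entries: iterable of (timestamp, message) tuples
    pattern: regular expression searched for in each message
    """
    regex = re.compile(pattern)
    cutoff = now - window
    return not any(
        cutoff <= ts <= now and regex.search(message)
        for ts, message in entries
    )


# Hypothetical log entries, for illustration only.
entries = [
    (datetime(2024, 1, 1, 2, 0), "tax-rates update finished"),
    (datetime(2024, 1, 2, 9, 15), "user login ok"),
]
now = datetime(2024, 1, 2, 12, 0)

# The nightly job last logged 34 hours ago, so a 24-hour window flags it.
print(is_inactive(entries, r"tax-rates update", now, timedelta(hours=24)))  # True
# Logins were seen this morning, so that pattern is still active.
print(is_inactive(entries, r"login", now, timedelta(hours=24)))  # False
```

Widening or narrowing either the pattern or the window changes how sensitive the check is, which is the same trade-off you make when configuring an inactivity alert.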

The pattern recognition capability offers the flexibility needed to widen or narrow the search scope enough to catch as many real cases as possible. Receiving the alerts is a cornerstone of a successful response to these instances, so third-party integrations such as iOS notifications, HipChat, PagerDuty, webhooks, and Campfire are important. Messages are sent to individuals, groups, or the entire team, so you can sleep soundly knowing that if something that should have happened does not, all the appropriate people will be warned.

In addition to being dangerous to the health of production applications, missed activity is hard to track down, and its effects compound over time. Low-risk missed events, like service pack updates, may not have a direct impact on production right now, but could over time, and a series of such events could be the eventual cause of a major outage. Because these issues are invisible, they are hard to find unless someone specifically remembers the expected activities and notices when one is missing. That depends on individual knowledge, which is hard to transfer to the rest of the team.

It’s very easy to pay attention to what happened, but having the foresight to think about what didn’t could be the difference between a happy production environment one day and 404 pages for the rest of the week.

Check it out – Try Logentries Inactivity Alerting free for the next 30 days! 
