Last week the team here at Logentries had a very busy week. We were invited to participate in the Dublin Web Summit, Europe's largest tech conference, with over 10,000 people streaming through the doors over the three days. We participated in a panel discussion on the cloud stage, "Cloud Intelligence and Big Data":
And shared our insights with a presentation on log data and the cloud, “Big Insights from Little Data: The Intersection of the Cloud and Machine Generated Log Data”:
If you missed them, check them out – it's definitely worth it.
Our First Research Spotlight
We were also very excited to release our first research spotlight, Big Insights from Little Data: A Spotlight on Unlocking Insights from the Log Data That Matters, led by Trevor and Benoit, working with our research team.
We’re obviously very big fans of DevOps. We’re not only living the DevOps life within our teams by continually updating our service with new capabilities while working to improve our operations and reliability simultaneously, but also helping many of our DevOps customers who are running their applications. As a side note, I’m personally fascinated by the challenge of people in a DevOps role given the inherent tension between continually building and improving applications, while also running the applications and ensuring stability and high levels of performance.
This is partly why our research team decided to focus on the use case of DevOps running an application on the Heroku platform…We're not only fans of DevOps but we're also big fans of Heroku and PaaS. As our research highlighted, someone in a DevOps role is usually pretty interested in Heroku error codes and application exceptions, which help them understand the performance and reliability of their application. These error codes are important not only for troubleshooting issues but also for uncovering potential issues before they affect the application.
As we did the research, we quickly realized that not only were we dealing with the modern, cloud equivalent of the needle in the haystack, but the needles were hidden much better than we would have originally thought. In this case, the Heroku error codes (~39,600,000 of them) are the needles, and the haystack is the log stream (over 22B events sampled in our research from over 6,000 applications). The breakdown of those error events highlights the valuable information that could be found: 5% fatal events, 29% critical events, 17% exceptions, and 49% warnings.
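To make "looking for the needles" concrete, here is a minimal Python sketch that scans Heroku-router-style log lines for platform error codes (the H, R, and L code families that Heroku emits, e.g. H12 for a request timeout). The helper function and sample lines are illustrative assumptions for this post, not Logentries' actual pipeline or real customer data.

```python
import re

# Heroku platform error codes appear in router/dyno log lines as "code=H12",
# "code=R14", etc. This pattern matches the H/R/L code families.
ERROR_CODE = re.compile(r"\bcode=(H\d{2}|R\d{2}|L\d{2})\b")

def find_needles(log_lines):
    """Yield (code, line) pairs for lines carrying a Heroku error code."""
    for line in log_lines:
        match = ERROR_CODE.search(line)
        if match:
            yield match.group(1), line

# Hypothetical sample stream: two needles buried among ordinary events.
sample_stream = [
    '2013-11-04T12:00:01 heroku router at=info method=GET path=/ status=200',
    '2013-11-04T12:00:02 heroku router at=error code=H12 desc="Request timeout"',
    '2013-11-04T12:00:03 app web.1 Completed 200 OK in 54ms',
    '2013-11-04T12:00:04 heroku web.1 at=error code=R14 desc="Memory quota exceeded"',
]

needles = list(find_needles(sample_stream))
for code, line in needles:
    print(code, "->", line)
```

In a real stream, of course, the ratio is far more lopsided than two needles in four lines – which is exactly the point of the research.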
It's worth noting that in our research we defined a particular use case for someone playing a DevOps role – i.e. someone looking for a specific set of error codes and application exceptions, which would be typical of a performance/reliability troubleshooting scenario. That same person may dive into the other ~99% of data for other use cases. When they are looking for the error codes we defined, however, they're going to have to dig through a lot of other data – which we expect is not that uncommon – and this is what we were highlighting as a real-world example of how challenging it can be to look for the needle in the haystack.
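For concreteness, the headline ratio is simple arithmetic on the rounded figures from the research (~39.6M error codes in a ~22B-event sample):

```python
# Quick sanity check on the "needle in the haystack" numbers.
needles = 39_600_000         # approximate Heroku error codes observed
haystack = 22_000_000_000    # approximate log events sampled
share = needles / haystack

print(f"{share:.2%} of events are needles")
print(f"{1 - share:.2%} is everything else")
```

That remaining ~99.82% is the "other data" a DevOps user has to dig through for this particular use case.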
Another important note on the research, and on Logentries in general: we take our user data and customer privacy very seriously and continually work hard to earn and maintain the trust our customers have placed in us. The research was focused on Heroku error codes. We did not use any customer-specific information and were very careful to anonymize and aggregate any info we did use.
In Defense of the 99.82%
The research was not only intended to highlight a specific use case but also to stimulate a broader conversation about the value of log data. As most people who love logs would likely say, the remaining ~99% of the stream is still extremely valuable and you don't just discard it. Rather, it's important to first understand what role a person has in an organization and what information they might find valuable. Logs are an incredibly powerful data source: DevOps is probably most interested in application health and performance, marketing may be interested in user metrics and business performance, and InfoSec would be most interested in audit and security events. In fact, the same log events that represent noise for one user may contain the info that lets another user take action.
The great part about log data is that it’s a universal data source. Logs contain tons of information, known and unknown, for multiple different use cases. The problem is that for most folks logs have been traditionally very difficult to access and use. This is where I’d normally put in a plug for Logentries, but I think the bigger point is most organizations aren’t aware of what’s available in log data, lack the resources to discover it themselves, and would benefit from the expertise of a third party service provider to help them dig through the noise and find the important bits that matter most. By the way, pretty much all the panelists from last week agreed on this point. Logs are underleveraged and we as an industry and as service providers can be doing a better job of helping users obtain actionable insights from log data.
Thus, our team's mission is pretty simple: to make the value of log data accessible to anyone. Logs are data, after all, and as a community we should be able to make better use of our logs as data for more and more use cases going forward. We want to help you get there!