Charting Stacks

DevOps and OpenSource Log Aggregation Tools – LogStash has Competition.


 

In the world of DevOps you are nothing without logs. As we move to more complex architectures, developers can, and do, add various types of advanced telemetry and instrumentation – which of course opens up the debate as to what to log versus what to instrument, as noted by developers such as Peter Bourgon. Here at RedMonk we are fascinated by all things logging and telemetry related, recognising the value held in such data.

For many organisations though, particularly those that are just starting on their DevOps journeys (which is far more companies than one might think), just getting the aggregation and analysis of logs in place is a great starting place. As the quote which is often attributed to statistician Karl Pearson says:

“That which is measured improves. That which is measured and reported improves exponentially.”

The starting point with logs in any reasonable scale system is aggregation. Among purely open source solutions LogStash has the undoubted lead, propelled by its close integration with Elasticsearch and Kibana – the ubiquitous ELK stack.

In our discussions around log aggregation we frequently hear about two other solutions – fluentd, which is used with Kubernetes, and Graylog2 – both of which garner far less attention than LogStash on forums such as Stack Overflow.

allthelogs-stackoverflow

However, when we look on GitHub we can see a growth in interest in all three, with LogStash and fluentd following reasonably similar trajectories.

LogStash v GrayLog v fluentd Github Stars

 

Issues and Commits

Now GitHub stars are an interesting proxy for interest, but when it comes to actual usage, looking at the issues logged, and more importantly whether they are coming from the community, can be a far more interesting data point. We can see that almost 70% of the issues reported against both LogStash and fluentd come from the community, while Graylog2 is far closer to a 50/50 split.

logstash-graylog-fluentd-issues-barchart

The trends are pretty consistent over the life of the projects as well.

issues-logstash-c-v-c

issues-fluentd-c-v-c

issues-graylog-c-v-c

Similarly for commits: with both LogStash and fluentd the community accounts for approximately 70% of the activity, although for Graylog2 direct company contributions account for almost all commits.

logstash-graylog-fluentd-commits-barchart

The trends over the lifetime of the projects are also similar for commits.

commits-logstash-c-v-c

commits-fluentd-c-v-c

commits-graylog-c-v-c

 

By far the most interesting part of all this though is the rise in overall interest since early 2014. As people have come to realise just how much value is in their log data, we see more and more log aggregation solutions being deployed. The first part of a journey, but a very important part nonetheless.

There are some caveats with this research to note:

  1. We have endeavoured to match major committers to their companies, but that is not always possible.
  2. Plugins are dealt with differently in each product, so we have limited the scope of our analysis to the main repo.
  3. Issues can, and do, come from commercial support contracts. The actual issue logged will generally come from the main company contributing to the project.
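
For readers who want to try a similar company versus community split on their own data, here is a minimal Python sketch. It is not the tooling used for this research: the author names, the `COMPANY_AUTHORS` mapping and the sample list are all hypothetical, and it assumes you have already exported the author of each issue or commit from the main repo.

```python
from collections import Counter

# Hypothetical mapping of usernames to the sponsoring company.
# In practice this has to be curated by hand and will be incomplete (caveat 1).
COMPANY_AUTHORS = {"alice": "ExampleCorp", "bob": "ExampleCorp"}


def classify(author, company_authors=COMPANY_AUTHORS):
    """Label an author as 'company' if they appear in the mapping, else 'community'."""
    return "company" if author in company_authors else "community"


def community_share(authors, company_authors=COMPANY_AUTHORS):
    """Return the fraction of items (issues or commits) authored by the community."""
    counts = Counter(classify(a, company_authors) for a in authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts["community"] / total


# Hypothetical sample: one entry per issue (or commit), holding its author.
sample = ["alice", "carol", "dave", "bob", "erin",
          "carol", "frank", "grace", "heidi", "ivan"]

print(f"community share: {community_share(sample):.0%}")
```

Anyone missing from the mapping is counted as community, so an incomplete mapping will overstate the community share.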

Disclaimer: Treasure Data (major contributors to fluentd) are a current RedMonk client.
