In this conversation, Rachel Stephens, the research director at RedMonk, discusses the evolution of observability in the tech industry. She highlights:
– the shift from traditional monitoring to a broader understanding of data collection and analysis
– the importance of OpenTelemetry as a standard for data ingestion
– the challenges organizations face in managing and analyzing vast amounts of data
– how AI is enhancing observability, particularly in helping users query data more effectively and derive insights without needing extensive technical knowledge
This is a RedMonk video, sponsored by Dynatrace.
Transcript
I am Rachel Stephens. I am the research director at RedMonk. At RedMonk, we follow technology adoption trends, particularly from the view of the developer and the practitioner.
So some of the trends that RedMonk is seeing in the observability market: one, everything is observability now. You may have noticed that we don’t call anything monitoring anymore. The whole market has shifted to this wider view, with more context and data coming together in a way where people can better separate the signal from the noise. Part of that is that we are seeing the commoditization of data collection.
So if you think of the evolution of the industry, think back to the APM days, where people had very proprietary agents and all the data lived in a siloed stack. The observability data, or at the time monitoring data, all lived in one place, and only a magical select few could actually access it and use it: you had to know the right query language, you had to be able to do all of those things. We started to see that ecosystem open up early on; the ELK Stack was one example, and then things came through like Prometheus and Grafana. The big shift in how this ecosystem came together was when OpenCensus and OpenTracing merged to form the OTel project, OpenTelemetry. The ecosystem has really rallied around that project as a standard, as a way for people to have their data ingested in common formats, so that your data doesn’t have to be instrumented one particular way for one particular vendor; instead, your data now has a way to go into whatever system you need it to go into.
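The "instrument once, export anywhere" idea described above can be sketched in plain Python. This is a hypothetical miniature, not the actual OpenTelemetry API: the `Tracer`, `ConsoleExporter`, and `VendorExporter` names are invented for illustration. The real OTel SDKs follow the same pattern of one common span format feeding any number of pluggable backends.

```python
# Hypothetical sketch of vendor-neutral telemetry: one common span format,
# many pluggable backends. Real OTel SDKs follow the same general pattern.
import time

class ConsoleExporter:
    """Stand-in for one backend (e.g., stdout for local debugging)."""
    def __init__(self):
        self.received = []
    def export(self, span):
        self.received.append(span)
        print(f"[console] {span['name']} took {span['duration_ms']:.1f}ms")

class VendorExporter:
    """Stand-in for a commercial backend; consumes the SAME span format."""
    def __init__(self):
        self.received = []
    def export(self, span):
        self.received.append(span)

class Tracer:
    def __init__(self, exporters):
        self.exporters = exporters
    def span(self, name, fn, **attributes):
        start = time.monotonic()
        result = fn()
        span = {
            "name": name,
            "duration_ms": (time.monotonic() - start) * 1000,
            "attributes": attributes,
        }
        for exporter in self.exporters:  # instrument once, export anywhere
            exporter.export(span)
        return result

console, vendor = ConsoleExporter(), VendorExporter()
tracer = Tracer([console, vendor])
tracer.span("process_order", lambda: sum(range(1000)), order_id="1234")
```

Swapping vendors here means changing the exporter list, not re-instrumenting the application, which is the portability promise the transcript describes.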
If I’m thinking about the role of the open source community in conjunction with the vendor community, all of them have roles to play, and we find people coming up with solutions all across that spectrum, from fully open to fully vendorized, with a whole bunch of flavors in between. When we look at people who have decided to do it fully themselves, sometimes it’s because they want control and have concerns on that front. Sometimes it’s because they have cost concerns. Sometimes they’re just bootstrapping something. Usually, when we see large-scale organizations with production-grade observability, they have brought a vendor in, sometimes many vendors, actually. So it just depends. But what we are seeing is that people want at least the promise of portability of their data, and to be able to use the value-add of the vendor and the vendor’s system on top of the data they’re ingesting, sometimes using proprietary agents, but now more often using OTel instrumentation.
Right now, the volume of data that people are trying to collect and sift through is, I think, the perennial thing. When people talk about logs, it’s: how am I supposed to actually find the needle in the haystack in a whole bunch of logs? I think we’re seeing that to an even greater degree right now, not just in logs but in everything. We have so much data that we could possibly be observing, especially as you start to look into worlds where we are bringing in new technologies like LLMs. How are we supposed to understand how these technologies are performing across a whole bunch of different axes? One of the challenges for people is trying to figure out: what data am I supposed to keep? What data has value? How does the value of this high-density data compare to the cost I pay to keep it? That is what people are trying to figure out: what do I store? Where does this go? So one of the things we’re seeing people play with is: can some of this data live in a generic data lake and be rehydrated as needed? Do we need to sample? Do we need to figure out how to ingest only certain things? Are some things going to be converted from logs to metrics? All of these things are in play all of the time, and people are perpetually honing what they’re doing so that their data is not absolutely exploding and their costs are not exploding.
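Two of the cost-control tactics mentioned above, sampling and converting logs to metrics, can be sketched concretely. This is a hedged, minimal illustration with hypothetical function names and a made-up `"LEVEL message"` log format; real pipelines (e.g., an OpenTelemetry Collector) implement the same ideas with far more sophistication.

```python
# Hypothetical sketch of two cost-control tactics: head-based sampling
# (keep 1 in N log lines) and rolling logs up into a cheap counter metric.
from collections import Counter

def sample(logs, keep_every=10):
    """Deterministic head sampling: keep every Nth line, drop the rest."""
    return [line for i, line in enumerate(logs) if i % keep_every == 0]

def logs_to_metric(logs):
    """Keep per-level counts instead of storing raw text."""
    counts = Counter()
    for line in logs:
        level = line.split(" ", 1)[0]  # assumes "LEVEL message" format
        counts[level] += 1
    return counts

logs = ["INFO ok"] * 95 + ["ERROR db timeout"] * 5
kept = sample(logs, keep_every=10)   # 100 raw lines -> 10 lines stored
metrics = logs_to_metric(logs)       # full fidelity on counts, tiny cost
print(len(kept), dict(metrics))
```

The trade-off is exactly the one the transcript describes: sampling shrinks storage but loses individual events, while logs-to-metrics keeps aggregate fidelity at a fraction of the cost.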
I think the quality of the data is absolutely in question for a lot of people. It’s like: do I need to instrument better? Do I even have the capability to instrument better, or can the tool help me do that? Or, looking further ahead, can LLMs help me understand what my data is without that heavy lift on the front end? All of that is fair, but it’s also a really expensive way to solve problems most of the time. Much better if you can get to a point where you have found an observability vendor that can help you sift through the data, separate the signal from the noise, and proactively find and surface issues for you. If your solution is just, “I’m going to throw all my raw logs into the LLM and the LLM will take…”, that’s not a good solution. People have solved this problem already. There are tons of vendors on the show floor who have figured out at least some of those basics. You don’t have to reinvent the wheel, I guess is what I’m saying. I think there’s a lot of promise in some of these augmentation tools that are coming via AI. Can we help people detect possible root causes faster? Can we help them correlate events so things are less noisy? Can we make it so that people are able to try multiple paths in parallel, is it this or is it that, and explore both at the same time rather than having to do everything serially? All of those are really great potential use cases for applying AI to your observability.
But it’s also still really early when we start to say that AI is going to fix some of these problems, because so much of it still requires human judgment, understanding of your system, and making sure that you have that context. It’s not the LLM’s context; it’s the human, organizational context, and the socio-technical systems have to come together so that the AI is supportive. There are a couple of ways we can think about how observability data is opening up. One of the things that is amazing about the LLM-based technology shift we have seen is that it really has democratized people’s ability to query data: people can ask a question of their data without having had to learn something like PromQL. Instead of having to know a query language, I can go into these tools, ask a natural-language question, and start to get some degree of answers. So it means that we don’t have to keep quite as tight a hold on who has access to the system and who’s able to derive value from it, because the UX is easier for people to understand.
I think the other thing we have started to see is that there are a lot of really strong overlapping use cases for this data. For example, observability data and security data are often the same raw data, just used in different ways. Finding out how we can get more value from the data we have, and present it to the different groups that could possibly use it, is a huge win. As we expand the number of people trying to derive value from observability data, particularly if you’re asking developers to use this data, they’re going to want tools that they understand, and tools that don’t only show what’s happening at the infrastructure level but give them insight into the application level. Helping developers understand how their applications are running in production is a huge value for any observability tool.
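As a toy illustration of that overlap (the log fields and values here are invented, not from the talk): the same raw access-log stream can feed an operations view that cares about latency and a security view that cares about failed logins.

```python
# Hypothetical sketch: one raw log stream, two consumers.
# Ops cares about latency; security cares about failed logins.
raw_logs = [
    {"path": "/login",    "status": 401, "ms": 12,  "user": "mallory"},
    {"path": "/login",    "status": 200, "ms": 15,  "user": "alice"},
    {"path": "/checkout", "status": 200, "ms": 230, "user": "alice"},
    {"path": "/login",    "status": 401, "ms": 11,  "user": "mallory"},
]

def ops_view(logs):
    """Observability angle: average latency per path."""
    totals = {}
    for entry in logs:
        totals.setdefault(entry["path"], []).append(entry["ms"])
    return {path: sum(ms) / len(ms) for path, ms in totals.items()}

def security_view(logs):
    """Security angle: failed-login counts per user, from the SAME data."""
    failures = {}
    for entry in logs:
        if entry["path"] == "/login" and entry["status"] == 401:
            failures[entry["user"]] = failures.get(entry["user"], 0) + 1
    return failures

latency = ops_view(raw_logs)       # e.g., is /checkout slow?
suspects = security_view(raw_logs) # e.g., who keeps failing to log in?
```

Nothing is collected twice; the win the transcript points at is that one ingestion pipeline serves both audiences.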
So when we think about vendors who have started to put some of these tools together, building that context so it’s not siloed data of logs, metrics, and traces, but holistic views of our data, enriched with some of the AI tools, companies like Dynatrace are doing a great job of trying to make it easier for people to actually find and derive that value. It’s not just you having to build the context yourself; the tool is helping you do that.