In this RedMonk conversation, Kelly Fitzpatrick, senior analyst at RedMonk, talks with Tom Johnson, co-founder & CTO of Multiplayer about the evolution of observability into what some folks have termed “Observability 2.0”. They explore how Observability 2.0 is better positioned to address the specific needs of developers, particularly in debugging and system transparency. The discussion highlights the importance of open standards like OpenTelemetry, the need for better developer tools, and the impact of improved observability on developer experience and productivity. Tom shares insights on how these advancements can lead to more efficient workflows and ultimately foster innovation within development teams.
This RedMonk conversation is sponsored by Multiplayer.
Links
- LinkedIn: Thomas Johnson
- Twitter/X: @tomjohnson3
- Multiplayer.app
- Twitter/X: @trymultiplayer
- Charity Majors on Observability 2.0
- Geeking Out Podcast on Demystifying Observability 2.0
Transcript
Kelly Fitzpatrick (00:12)
Hello and welcome to this RedMonk conversation on observability 2.0 and what debugging looks like in this day and age. My name is Kelly Fitzpatrick and with me today is Tom Johnson, CTO of Multiplayer, who has guested on the MonkCast twice previously to talk about various aspects of developer tooling. Tom, welcome back again.
Tom Johnson (00:32)
Thanks for having me. Again.
Kelly Fitzpatrick (00:35)
And then again, for folks who may have missed our earlier conversations, what should people know about you?
Tom Johnson (00:43)
Long-time backend software developer. I got my start in speech recognition and robotics, segued to telecom, and then building, you know, internet apps, large distributed systems for small companies and large.
Kelly Fitzpatrick (00:56)
And current CTO of Multiplayer. So to kind of kick things off, what does observability have to do with your work at Multiplayer?
Tom Johnson (00:59)
Yes. That’s a great question. So, you know, observability, the way I think about it, is the ability to see into your system and watch stuff happening instead of treating it like a black box, like what magic happens inside. So observability gives you a level of transparency. So for us, we’re using observability technologies like OpenTelemetry for the purposes of auto-documenting your system, your distributed system, and debugging. So these are new applications of observability to solve real problems that teams have.
Kelly Fitzpatrick (01:41)
And for folks out there who are listening who may not be aware, only a few short years ago, we were debating the definition of observability. And now the term has been taken up by APM vendors and marketing organizations of all kinds to the point that observability as a term almost seems rather ubiquitous. Partly in reaction to this, we have heard folks, including thought leaders in the space like Charity Majors at Honeycomb, take up the term observability 2.0.
What do you make of all of this?
Tom Johnson (02:13)
Well, that makes some good points. I mean, observability has become this term that is applied so widely. What does it mean? And that’s why I think of it in terms of just being able to see into your system. And I think there’s an evolution that’s happening. So if you think about what observability, say, 1.0 is, it’s basically APM providers that provide just a bucket, a big bucket for all of your logs, metrics, and traces. And maybe they have proprietary agents to help collect this data. So it’s a good starting point. It’s necessary with complex distributed systems, but it’s not necessarily solving the day-to-day problems that developers have. If you’re debugging an issue, the developers I know rarely go into the APM tool because it’s so hard to find the needles in the haystack that give the evidence of the bug that’s happening. So it’s like you’re chasing stuff around.
So I think observability 2.0 is more focused on solving problems and also built on open standards. So OpenTelemetry is a common standard for instrumenting your code, collecting logs, metrics, and traces. But then, with that data, doing things that directly help development teams. And for us, that means auto-documenting your system, discovering it, tracking it over time to take the manual work out of doing that. And manual work diagramming or updating API documents. Or maybe you don’t have any API documentation. Just getting it together, like having a source of truth about your distributed system, so that you don’t have to bug your developers to constantly update these things. It just happens for them. That’s one thing.
And then making it easy to debug issues where, for us in our platform debugger, you start, you click a button, it’s recording your session, and all the related logs, metrics, and traces for that session are in one place. You don’t have to go searching for it at all. So I think that’s, you know, for us, the observability 2.0 is about solving problems.
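As a rough illustration of the "everything for one session in one place" idea described above, the sketch below buckets mixed telemetry records by a session identifier. It is not Multiplayer's implementation and does not use the OpenTelemetry SDK. The record shape, the `session_id` and `kind` fields, and the function name `group_by_session` are all hypothetical.

```python
from collections import defaultdict


def group_by_session(records):
    """Bucket telemetry records by session id, then by kind.

    Each record is a plain dict with hypothetical keys:
    "session_id", "kind" ("log", "metric", or "trace"), and "body".
    """
    sessions = defaultdict(lambda: {"log": [], "metric": [], "trace": []})
    for record in records:
        sessions[record["session_id"]][record["kind"]].append(record["body"])
    return dict(sessions)


# Hypothetical records from two interleaved user sessions.
records = [
    {"session_id": "s1", "kind": "log", "body": "order received"},
    {"session_id": "s2", "kind": "log", "body": "login ok"},
    {"session_id": "s1", "kind": "trace", "body": "POST /orders -> kitchen"},
    {"session_id": "s1", "kind": "metric", "body": "latency_ms=120"},
    {"session_id": "s1", "kind": "log", "body": "wrong item plated"},
]

by_session = group_by_session(records)

# Everything recorded for session "s1", without searching the whole bucket.
for kind, bodies in by_session["s1"].items():
    print(kind, bodies)
```

The point of the sketch is the lookup at the end: once records carry a shared session key, debugging one session no longer requires scanning every record from every user.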
Kelly Fitzpatrick (04:31)
And there’s something about the way you articulated observability 1.0, we’ll just call it, which, having been around for the defining of observability in those days, we had the three-pillar model, X, Y, and Z: metrics, logs, and traces. But one thing that you noted is that the tools that fall under that observability 1.0 category are things that developers don’t often use or want to use and are often just kind of jumping into just to solve a problem. And that very much fits with the folks that we talked to, where the developers are like, I don’t use this tool very often at all. I wouldn’t know how to actually set it up or leverage it fully. There’s somebody else in the organization who tells me how to do that. And it strikes me that your articulation of observability 2.0 is much more developer-focused, on things that developers can actually use themselves and that are built for them.
Tom Johnson (05:26)
Yeah. And I think that is a great way to put it. Like, observability 1.0 is for the ops people who need to know whether things are running or on fire and potentially see some trends across all things, across all systems, across all metrics and traces, to try to suss out, like, is my system doing okay or not? It’s not for developers.
So observability 2.0 is for developers, to help them with specific problems. And two of the biggest problems they have are knowing about the system that they’re working on, things like dependencies. If I change this, what is affected? That is not an easy question to answer today. Or debugging, and that’s a constant thing. Everybody has bugs, and you’re constantly trying to track things down and replicate the bug in order to fix the bug. And then once you fix the bug, to be able to verify that it’s fixed, it’s very, very hard to do that in a distributed system. It’s really, for the developer, kind of a black box still. And with observability 2.0 and the stuff we’re working on at Multiplayer, we’re looking to solve those problems.
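The "if I change this, what is affected?" question above can be sketched as a small graph problem over trace data. This is a minimal, hypothetical example, not how Multiplayer or any particular tool works. It assumes each span has been flattened to a dict with `span_id`, `parent_span_id`, and `service` keys (a simplification of real trace spans), and the function names are made up for illustration.

```python
from collections import defaultdict


def build_dependents(spans):
    """Map each service to the set of services that call it directly.

    A parent span in service A with a child span in service B is read
    as "A calls B", so A depends on B.
    """
    service_of = {span["span_id"]: span["service"] for span in spans}
    dependents = defaultdict(set)
    for span in spans:
        parent = span.get("parent_span_id")
        if parent is None or parent not in service_of:
            continue  # root span, or parent not captured
        caller, callee = service_of[parent], span["service"]
        if caller != callee:
            dependents[callee].add(caller)
    return dependents


def affected_by_change(service, dependents):
    """Return every service that directly or transitively calls `service`."""
    affected, stack = set(), [service]
    while stack:
        current = stack.pop()
        for caller in dependents.get(current, ()):
            if caller not in affected:
                affected.add(caller)
                stack.append(caller)
    affected.discard(service)  # a service in a call cycle is not "affected" by itself
    return affected


# Hypothetical spans from one traced request: web -> orders -> inventory, web -> auth.
spans = [
    {"span_id": "a", "parent_span_id": None, "service": "web"},
    {"span_id": "b", "parent_span_id": "a", "service": "orders"},
    {"span_id": "c", "parent_span_id": "b", "service": "inventory"},
    {"span_id": "d", "parent_span_id": "a", "service": "auth"},
]

dependents = build_dependents(spans)
print(sorted(affected_by_change("inventory", dependents)))
```

A graph derived this way only reflects the calls that were actually traced, so it is a lower bound on the real dependencies rather than a complete map.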
Kelly Fitzpatrick (06:50)
And can you speak a little bit more about what observability 2.0 means for the average developer, especially around debugging? Because to your point, this is something developers care about a lot, but it’s not something that we talk about all that often when we’re talking about observability.
Tom Johnson (07:06)
Sure. So what it means for the average developer, it means more visibility into their system, less manual work. And for debugging, I’ve been trying to come up with a good metaphor for this. Let me test this out here. So this is what it’s like for a developer today to debug an issue. Picture yourself at a restaurant.
Kelly Fitzpatrick (07:23)
We’re here for testing out your metaphors.
Tom Johnson (07:34)
Okay. You go to a restaurant, place an order for a cheeseburger. Okay. The, you know, waiter takes the order, goes in the back, goes in the kitchen, and out comes salmon. Okay. That’s an issue. You’re asked to debug that issue. Okay, you are unable to go into the kitchen to try to fix the problem. So that’s observability 1.0. Or maybe you’ve got all the receipts from the day, you know, and that’s the big bucket of stuff. And you’re supposed to figure out from that what happened and then fix that problem. Okay.
Now observability 2.0 is: okay, I go to the restaurant, I place an order. I walk into the kitchen with the waiter and I’m able to go from station to station to station, following the process until the salmon comes out, and I say, this person here, you know, we need to change that process. Bug fixed. You know, and you’re done. So it takes the backend and doesn’t treat it as a black box anymore. Transparency. Plus you’re able to follow your session, your order, all the way through, deep into the backend and all the way back. So that’s a problem that we have actually solved in Multiplayer with our platform debugger, and that’s a big deal, to no longer treat your system as a black box and to be able to really go in at a session level and see what’s really happening.
Kelly Fitzpatrick (09:18)
I think I like, I know it’s an experimental metaphor, but I like the restaurant metaphor. I don’t think I ever want to be in the kitchen, but I get it. I get it as a metaphor. And I see the usefulness of, you know, don’t send out the salmon when a hamburger is what was ordered. I think it might oversimplify what this means if the thing that leaves the kitchen is in fact like a breaking change.
Tom Johnson (09:25)
Yeah. Yeah, yeah, so maybe I’ll adjust the metaphor. Maybe it’s I get ketchup instead of mustard on my hamburger. Okay, little change, and, you know, I have to fix it without being able to see the process on the inside. So, okay, I’ll work on that. It’s good to workshop it. Thanks for letting me do that live. Okay, good, good.
Kelly Fitzpatrick (09:58)
Yeah, but… Yeah, but I like it!
Like it’s a start, like this is totally a start, right?
So in terms of, and again, this was a really kind of good overview of observability 2.0 and where we are today, where does debugging kind of fit in all of this? Where do you think observability and debugging go from here? Because I feel like in 2024, things are changing like every day. We get like a new way of doing X, Y, and Z. Where do we go from here?
Tom Johnson (10:32)
Yeah. So that’s a great question. You know, so one of the things I think that people need to embrace is the fact that software is increasingly complex. You know, you’ve got these distributed systems, you’ve got, you know, it’s not getting easier. It’s getting more complex. First thing I think to do is really adopt open standards. OpenTelemetry is a great one. It’s evolving as a project, but build your observability base on open standards. So you’re going to get more and more features that can plug into that same data over time. So the starting point is, like, adopt that.
And then for us, we’re solving the auto-documentation and debugging problems. And I think that those tools have to be part of the tool belt for teams going forward, so that, you know, you move away from the "just throw all this stuff in a big bucket, and then that’s enough." You have to do more. You have to give developers better tools, because right now there’s a lot of grunt work. There’s a lot of manual work. Developer experience is low and can be much higher just by removing that grunt work, taking it off the table. So take off the manual work for documenting; let the system auto-doc on top of OpenTelemetry. Give them better debugging tools so you don’t have to manually go to the APM providers and try to find stuff, evidence of bugs. Don’t say, you know, QA reporting a bug, or a customer, like, here are the steps to reproduce. Don’t do that. Manually reproducing a bug is terrible. Take that off the table. Have a better approach. So I think that that’s what’s coming. That’s what’s here now. And I think embracing that is really important.
Kelly Fitzpatrick (12:34)
Yeah, and we’ve been following OpenTelemetry for quite a while. And I think what strikes me here is it’s: let us take OpenTelemetry. We have found a way to leverage this for, like, auto-documentation. And then we have found a way to leverage that to help with debugging. So it’s almost like taking something that is in the realm of the possible and making this nicely packaged, precise way of leveraging it that developers can really benefit from.
Tom Johnson (13:01)
Absolutely, yeah.
Kelly Fitzpatrick (13:02)
So moving on from that, what do you think this means for developer tooling and developer experience in general in the future?
Tom Johnson (13:10)
So it’s a better developer experience. It’s easier onboarding. It’s less risk for an organization when you’re able to get the knowledge out of the heads of people. And it’s out there for people to share. And you can have better communication. You can have better productivity. It’s faster. Developers can be focused on the stuff they should be doing, which is developing features, fixing bugs quicker, more productivity, faster delivery of features, more innovation, shorter time periods. And on top of that, because this stuff is really connected to a lot of grunt work that nobody likes to do, boy, you know, just for that alone, I would say just do it. But, you know, that’s a huge benefit for the daily happiness level of developers who can focus on the things they like to do. I don’t know how many hours, weeks, months of my life I’ve spent grepping through logs to find the issue related to the bug that I need to fix. I’d like all that time back. And if I could use tools that would never require me to do that again, I absolutely would.
Kelly Fitzpatrick (14:27)
And there’s definitely space in my head that was taken up with the process of digging through logs, knowing how to do that. Because once you learn how something is being recorded in a given tool, it’s easier to do. But I want that brain space for other things.
Tom Johnson (14:33)
Yeah. Absolutely. Yeah, and there’s just so much to know about and keep in your head anyway as systems get more complex. This should not be one of those things.
Kelly Fitzpatrick (14:52)
Yeah, absolutely. I think kind of connecting developer experience to not only happier, but more, say, productive teams, I think there’s an argument there. You and I have talked in a previous episode about developer experience research. And there was an Atlassian report that came out this past summer where they talk about developer experience. And one of their findings is that developers are losing eight hours a week due to inefficiencies.
And if you do the math on that, you get to like $13,000 per developer per year or something like that. And you can add that up across all the different developers in an organization. There’s like a financial argument for all of this as well.
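The back-of-the-envelope math mentioned above can be sketched as follows. Only the eight hours per week and the rough $13,000 figure come from the conversation. The hourly cost and the 52 working weeks are assumptions chosen here so the arithmetic lands on that figure; they are not numbers taken from the Atlassian report.

```python
def annual_cost_of_lost_time(hours_lost_per_week, hourly_cost, weeks_per_year=52):
    """Yearly cost of time lost to inefficiencies, for one developer."""
    return hours_lost_per_week * weeks_per_year * hourly_cost


# Assumed inputs (hypothetical): a cost of $31.25 per hour, 52 weeks per year.
per_developer = annual_cost_of_lost_time(hours_lost_per_week=8, hourly_cost=31.25)

# Scaling up to a hypothetical organization of 10 developers.
team_size = 10
per_team = per_developer * team_size

print(f"per developer: ${per_developer:,.0f}")  # per developer: $13,000
print(f"team of {team_size}: ${per_team:,.0f}")  # team of 10: $130,000
```

Changing either assumed input moves the result proportionally, so the function is more useful for plugging in an organization's own numbers than for the specific dollar amount.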
Tom Johnson (15:34)
Absolutely. Yeah. We think that up to 20% of a developer’s time can be spent on things like grunt work related to keeping documentation up to date or finding stuff, dealing with out-of-date documentation, debugging, finding all the traces involved in the customer problem that they said yesterday, around, I don’t know when it was, but I got an error. Can you fix the problem? You know, like, that is real. So if you have a team of 10, you know, that’s like one or two developers. It’s like having a couple of developers basically sitting on their hands, not doing work on the core product and advancing it. It’s all, like, wasted time.
So I think that the tooling that’s coming out, that we’re developing, and the trend for tools to help developers do their jobs a little easier, it’s going to equate to better bang for your buck as a company. What are you paying for? Are you paying for developers to do new things, new features, and not sit on their hands? So it’s an imperative, I think. And as systems get more complex, it’s just more time without a radical change in how you do things. And yeah.
Kelly Fitzpatrick (17:03)
So more time and more brain space for innovation.
Tom Johnson (17:06)
Yes.
Kelly Fitzpatrick (17:07)
Which is a good goal. And Tom, with that, we are at time, but I have to ask, where can folks find you on socials? Where can they go if they want to learn more about Multiplayer?
Tom Johnson (17:17)
Sure. On X or Threads, I’m @tomjohnson3. That’s the number three at the end. And on LinkedIn, I’m Thomas Johnson. And you can find out more about Multiplayer at Multiplayer.app, and we’re @trymultiplayer on the social platforms.
Kelly Fitzpatrick (17:34)
Many thanks again to Tom for yet another great conversation. My name is Kelly Fitzpatrick with RedMonk. If you enjoyed this conversation, please like, subscribe, and review the MonkCast on your podcast platform of choice. If you’re watching us on YouTube, please like, subscribe, and engage with us in the comments.