A RedMonk Conversation with Derek Collison, “the Forrest Gump of Messaging”

In this RedMonk conversation, Derek Collison, founder and CEO of Synadia and creator of NATS.io, discusses the evolution and importance of messaging systems with Kate Holterhoff, senior analyst at RedMonk. In this wide-ranging interview, Derek shares insights from his time building Rendezvous at TIBCO, co-founding the AJAX APIs group at Google, co-founding Cloud Foundry at VMware, and now maintaining NATS at Synadia. Derek not only talks about the history of the tech industry and how it has evolved since the 90s, but also shares his current views on AI, streaming, open source foundations, edge computing, distributed systems, and the rise of enterprise application integration (EAI).

Synadia is not currently a RedMonk client.

Transcript

Kate Holterhoff (00:12)
Hello and welcome to this RedMonk conversation. My name is Kate Holterhoff, Senior Analyst at RedMonk and with me today is Derek Collison, Founder and CEO at Synadia and creator of NATS. Your friends call you the Forrest Gump of messaging, so I can’t wait to hear more about that moniker. Derek, thanks so much for joining me on the MonkCast.

Derek Collison (00:29)
Kate, thanks for having me. I appreciate it.

Kate Holterhoff (00:32)
All right, so I’m excited to have Derek chat with me as part of my ongoing research into the history and future of messaging. Historically, I’ve been talking a lot about message queues here. Of course, today we’re gonna be talking a lot about NATS and TIBCO and all these sorts of things that Derek’s history has touched on. But before we get into your career history, Derek, I wanna kick off this conversation by addressing why this topic still matters so much. So at a high level, how do you define messaging? And when you talk to folks today, you know, in 2025, how do you explain its importance in our era of distributed systems?

Derek Collison (01:06)
That’s a great question. I think in the 90s when I joined a company called Teknekron, which became TIBCO, the CEO had a very clear vision and understanding of what was about to happen in financial trading systems. We were moving from humans doing everything to programmatic trading. For the most part, information distribution was done, to his point, I’m using his metaphors here,

like a phone call, right? And so if I called you first and then I call, you know, four other people, humans, you know, our brains switch at about 160 to 200 milliseconds or so, right? That’s the golden rule inside of Google: the response has to go back out in less than 200 milliseconds, based on that. However, once programmatic trading was going to come into effect, that last person was at a major disadvantage. So.

Vivek, who was the CEO of Teknekron and then TIBCO, his vision was simply to change telephone calls into a radio station that you could tune into. And that was kind of the birth of the initial PubSub type stuff. When you talk about modern systems, I think PubSub is kind of a dated term, right? However, I do believe deeply in what I call intelligent connectivity. So when I started my career, I got very lucky.

But I was not feeling that way when it happened, which is I graduated university out of Maryland and went to work for a program at the Applied Physics Lab. And I got selected by the second-best physicist at the lab. And what that simply means to the audience is that I had orders of magnitude less time on the supercomputer. This was an age when everything was vertically scaled, right? In the late eighties, I mean,

I did university courses in concurrent programming, but I’m like, this is kind of goofy. No one’s ever really going to do that. And so the physicists kind of looked at me and they said, hey, we’ve got these 12 SPARC pizza boxes down in the basement, right, of the lab. Can’t you make them do what the Cray does or the Connection Machine does? And these were incredibly brilliant physicists and mathematicians. And I just kind of looked at them. I’m like, no, you can’t do that. Fast forward to today.

everything is scaled horizontally. Everything is a distributed system. And I think what we’ve seen with kind of this massive transition, in my opinion, from cloud to edge is that not only are the systems continuing to grow in complexity, the number of moving parts, but they’re being pulled apart. Right. And so a lot of times you’re not trying to vertically scale something. You’re just simply trying to get it closer to wherever it’s being consumed, whether it’s the service or data.

Right. And so you could say, wow, I have the fastest computer in the world, but it’s in Europe. And let’s say I’m on the East Coast, you might be on the West Coast, you’re in Europe, wherever. It’s more of a latency thing. Right. So these systems are getting spread apart. And so when folks ask me about like, Synadia’s tech or NATS, especially in terms of the connectivity piece, I say there’s a couple of superpowers that NATS has, in my opinion.

One is location independence. Everything we do, all distributed systems, is based on location dependence. I need to know where you are, right? And of course, people can argue in the audience and say, no, that’s not true. We’ve got blah, blah. But I would counter that you’re doing unnatural acts to get that. So in other words, when you do HTTP, you go through DNS. There’s DNS tricks like with Anycast.

Then you go to GSLBs, right? These global load balancers. Then you go to load balancers. I mean, there’s so many things going on to get away from the fact that the base construct of distributed systems that have been built with legacy tech is everything is one to one. Everything is request reply and everything is synchronous. And anytime it’s not, you’re doing unnatural acts. So you’ve got more moving pieces that either cloud providers provide XYZ. And so even though I don’t say PubSub very much anymore,

this notion of location independence. If you’re listening on a certain, I call them subjects because that’s what we called them at Teknekron and TIBCO, but topics, channels, whatever those are, it doesn’t matter to me where you are. Moreover, it doesn’t matter if there’s one Kate or there’s a thousand. I just need to know I’m talking to a Kate and I need to know that it’s the best one, which is usually a latency sensitivity thing, meaning pick me the closest one. And so what NATS tried to do from all of the lessons learned from

Teknekron and TIBCO with TIBCO Rendezvous, and I was a co-founder of that group, behind what was called the CI Server. Then I architected the modern version of Rendezvous, and then I went on to create EMS, which was the Enterprise Message Service. We worked with Sun and others creating the JMS spec and then the WS-* stuff, which was kind of a debacle. And so NATS kind of was bred out of all of that,

learnings and frustrations, and I call them scars on my back. It’s not an IQ thing, it’s just a number of scars. But the fact that it’s location independent by default, it’s M-to-N by default. M-to-N is kind of, like, the audience might say, what do you mean? HTTP is one to one. I know I’m talking to a single instance. I resolve an IP address. And again, if that’s not happening, there’s unnatural acts going on with load balancers and tricks to undo that. NATS has that natively and everything is M-to-N, meaning

me as one can be talking to N Kates, but I only want an answer from one of them, the best one. So now you start to see that there’s no need for load balancers or GSLBs or even DNS, right? And I still have a T-shirt somewhere that says, always blame DNS. The other thing that NATS did, which now makes a lot of sense, but back when we were doing TIBCO, people were like, what? The protocol itself is always async.

HTTP started synchronous and of course they’ve added asynchronous behaviors into it now with HTTP/2 or HTTP/3 and QUIC and stuff, and then multiple async streams going on over the same physical connection. You’re starting to see that. And so for me, those two things are probably the most important part. And then the second part, which again goes from, if you believe that these distributed systems are

gaining complexity, more moving pieces, being stretched apart, XYZ, you can start to see why location independence and M-to-N as the default makes sense. The corollary though that needs to happen is that these servers are not like a broker, right? They’re extremely lightweight. Now, I don’t know if it’s relevant anymore, I’m dating myself, but they’re more like cattle, not pets, right? And you can Lego-brick these things into any topology

that can cross cloud providers, cross geo boundaries, all the way out to edge. And for us, we call it near, far, and beyond. Near is, and this might be controversial, but near is the cloud providers trying to say, we do that too. I really think they’re the new mainframe, meaning they’re not going anywhere. But the interaction models are already starting to shift.

Far is kind of the traditional and some of the newer entries into edge. So the Akamais, Cloudflares, you know, Fastlys of the world, but also the Vercels, Deno Deploys, Netlifys, you know. And I think they will have their unifying event kind of like Kubernetes did for cloud. But where we start to really shine is what we call beyond, which is your factory, your store, your electric vehicle, your connected car, your satellite, your train, your medical device, your toothbrush.

All of those are areas where NATS already runs and powers those types of interactions. The last thing, and then I’ll pause, because I know I’ve been talking a little bit, is that when we talk to customers, I’m not trying to change the what. So let’s say you and I want to design a new system. We would be drawing the same blocks and circles and microservices and key values and RDBs and XYZ. I think what NATS and the Synadia Technology Platform does is it just

radically changes the how, right? So you’re still going, hey, I need to invoke this microservice, right? But let’s say I’m inside of a vehicle, an electric vehicle. I don’t know where that microservice is running and I shouldn’t care where it’s running. And I don’t care how many of them are running. They could be running in the vehicle. They could be running on the cell towers. They could be running in geo cloud pinning. Doesn’t matter. And in our world, there is no additional cognitive load. Batteries are included. And so let’s say

the requester is the Kate app, right, running in the vehicle. It’s simply asking the question and all of a sudden it just gets a very, very fast answer because it’s running literally right next to you on the same ECU. So hopefully that makes a little sense. So location independence, M-to-N as the default, and then these servers as very lightweight constructs that you can Lego-brick together to form any type of topology that you need.
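The location-independent, M-to-N request pattern Derek describes can be illustrated with a toy in-memory sketch. This is not the NATS client API, and `Bus`, `queue_subscribe`, and `request` are invented names for illustration only: a requester addresses a subject, not a host, and exactly one of N interchangeable responders in a queue group answers.

```python
import random

class Bus:
    """Toy in-memory sketch of subject-based messaging (not the real NATS API)."""
    def __init__(self):
        # subject -> {queue group name -> list of handlers}
        self.groups = {}

    def queue_subscribe(self, subject, group, handler):
        self.groups.setdefault(subject, {}).setdefault(group, []).append(handler)

    def request(self, subject, payload):
        # One member per queue group receives the request; a real system would
        # prefer the lowest-latency member, here we just pick randomly.
        for members in self.groups.get(subject, {}).values():
            return random.choice(members)(payload)
        raise LookupError(f"no responders on {subject}")

bus = Bus()
# Three "Kate" services, anywhere in the world; the requester never knows which one answers.
for i in range(3):
    bus.queue_subscribe("kate.greet", "kates", lambda name, i=i: f"hello {name} from kate-{i}")

print(bus.request("kate.greet", "derek"))  # one of the three answers
```

The point of the sketch is that the requester's code never changes as responders move, scale from one to a thousand, or disappear; only the subject matters.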

Kate Holterhoff (09:38)
Yeah, that’s great. And you’re right, there’s a lot jammed in there. So I’m hoping we can kind of take a step back and parse some of the stuff that you’ve been mentioning here. And also I was interested when you mentioned the asynchronous quality, because I think one of my biggest takeaways from talking to folks like Andy Stanford-Clark is that creating these asynchronous connections was really sort of the end goal, that that’s really where the innovation happened back

decades ago, right? Especially in the financial industry and in the airline industry as well. Yeah, so all of that’s really resonating with some of the conversations I’ve been having around this subject. And I appreciate you building on that and helping me to flesh out that story. So you mentioned the physics lab. And so, just for folks who aren’t familiar, this is the Johns Hopkins Applied Physics Lab, which I only know because I’ve been stalking you on your LinkedIn account.

And you have a, of course, a really exciting list of companies that you’ve been affiliated with, projects that you’ve worked on. And so I was super interested in that. And I was curious if you had gone to school there in some capacity, how did you end up at the physics lab and what did you study in college to end up there?

Derek Collison (10:40)
Yeah, so I studied computer science in university and graduated from Maryland in 1990. Yeah, I think so. That’s correct. But I had summer internships and since I was in Maryland, you know, Washington, DC was a big hotbed and the Applied Physics Lab. I knew some people there that knew of me and liked me enough to kind of recruit me before I graduated to join and they had a program inside of the APL.

around an organization that at the time wasn’t known publicly, called the NRO, and so I did some interesting work on some very cool projects there for them.

Kate Holterhoff (11:19)
Yeah, did you like working in an academic setting? I mean, I can see that. I used to be an academic for a decade, so I know the sort of ins and outs of what that is like. And what was your impression of working with academics?

Derek Collison (11:31)
Well, I mean, it’s capital-A Applied Physics Lab. The physicists and the engineers like myself were rewarded for delivering things, not necessarily papers. Now we did do a bunch of research, but it was always in support of a project, right? And so we had a very specific project that we were all working on. And so everything that I was working on with the

Kate Holterhoff (11:35)
Okay.

Amazing, yeah.

Derek Collison (11:58)
multiple physicists and the software and the engineering was all in support of that project, which was a very high priority for that organization at the time. Still is, but I mean, back then it was a very big deal. It was very exciting. I liked the applied part of it versus the pure academic part of it, for sure. People who know me say GSD, get stuff done,

Kate Holterhoff (12:15)
Okay. Sure.

Derek Collison (12:22)
but stuff maybe said differently type stuff. But no, it was a massive laboratory of results versus the academic. Now, Johns Hopkins, of course, has more of the academic flavor and being tied to them is just awesome, right? But I think we only did a couple of papers and, if I recall correctly, yeah, those never saw the light of day because they were top secret. So we couldn’t show what we came up with.

Kate Holterhoff (12:25)
Mm-hmm.

How exciting.

Derek Collison (12:48)
I think that you can now. I think it was declassified a while back.

Kate Holterhoff (12:53)
Okay, fair enough. All right, so that’s your sort of academic background. Can we go further back in time? Like, were you part of like a robotics team in high school or like what was, you know, how long were you interested in computers? You obviously majored in it.

Derek Collison (13:05)
Yeah, so yeah, I think I was born at just an amazing time, right? And I got extremely lucky. We grew up never wanting for anything, but we were not well off. But again, I had an amazing childhood, but I had a friend who in middle school got a TRS-80 and I was like, what’s this thing? And so I would just be at his house every day trying to figure out how to do this.

For Christmas, I think that year, my maternal grandmother, we called her little grandma, she just said, I’m gonna get him a computer. My parents were like, we can’t afford that. And she’s like, I’m getting one. So she got me a Commodore 64 and then she got my sister one, and my sister couldn’t care less about computers, right? And so I stole hers for parts after a while. But that was the land of, there were no robotics classes. There was…

Nobody even knew what a computer was. We barely had to use the TI calculators. But I loved it. There were no manuals really, so I just stayed up all night. It was connected to a cathode-ray, you know, black and white little TV that was about 11 inches big. I would cover my head up at night so that my mom didn’t know I was staying up all night on it. And so I originally thought I wanted to be a doctor and that probably just wasn’t going to be able to happen. And again, life

Life can be very interesting that way because it was a blessing in disguise. So when I actually finally did move to California, one of the first jobs I had was with a healthcare startup. And again, I got thrown into distributed systems again. So I’m a little slow sometimes, but eventually kept hitting me on the head of maybe this is what I was supposed to be doing. But for whatever reason, the startup paired with a company out of Carlsbad, California called Puritan Bennett.

They don’t exist anymore, but they were in the medical devices field and they had created the first intra-arterial blood gas device. And for whatever reason, they thought it was a good idea to let this 22 year old at the time run all the federal trials for it. So I did all the data collection out of Stanford hospital and I had to do a bunch of surgical cases. So I think it was two cardiothoracic, two neuro, two GIs.

I had to do two morbidity cases, if I recall correctly, so much time in the SICU, NICU, you know, all that stuff. So I got my fill of it and it was exciting, but also very depressing for me. And I just realized at that point how lucky I got. Cause even to this day, you know, I’m getting up there. I’m 56 now. I’m still amazed by how fast technology is moving.

I studied neural networks in university because it had just kind of been invented, in the 80s, the backprop stuff, right, in ’86, ’87, I think. And then we hit the largest AI winter, right? So I’ve been following it even though, you know, it didn’t go anywhere. About two and a half, three years ago, when ChatGPT came out, I was like, uh-oh. And so all the

people around me started going, he’s geeking out. I’m like, I’m telling you, this time it’s going to stick, right? We’re going to get there this time. And everyone kind of laughed at me. But I think they kind of see the path now, right? It’s definitely something very, very different than what’s happened in the past. But I got extremely lucky. And the fact that edge computing and distributed systems and the way these things connect and communicate and move data around is still so relevant.

I’m extremely grateful that I got lucky. And yeah, I’m trying to keep up like everyone else on the advancements in AI. And I mean, it’s just bonkers to me, but it’s exciting. I’d rather lean in than lean back, if that makes sense.

Kate Holterhoff (16:34)
Yeah, for sure. I know there’s so much cool stuff coming out in the news every day, it seems like. I’m a big podcast buff, you know, obviously. And so I tend to keep up in that respect. And yeah, no, it’s truly a wonderful time to be following.

Derek Collison (16:46)
So you probably love NotebookLM. I think it took me one time playing with it and doing the auto-generated podcast to realize that’s how I do all my first-step research now. I just load the PDFs, load the YouTube videos and say, generate a podcast. I then download it on my phone and go for a run or a walk. And that’s how I first absorb stuff. But I mean, the other thing, and you know this, but you know, our brains are about 50,000 year old wetware, right? It’s like having a 50,000 year old MacBook. We don’t do good with exponential stuff.

Kate Holterhoff (16:54)
Yeah.

Really?

Yeah.

Derek Collison (17:16)
And it has nothing to do with how smart someone is. It’s just our brains aren’t wired that way. But if you look back, it’s easier for us to understand. And if you look back at just the state of the art a year ago, and then two years ago, it should boggle your mind how fast things are going. And again, I don’t know where it’s going to land. I mean, I have opinions like everyone else, but I know that I’d rather lean in than ignore it, right? So.

Kate Holterhoff (17:41)
Right, right, right. Yeah, well,

yeah, this morning I was listening to Simon Willison and swyx actually having a conversation about everything that happened in 2024. And it was taking me back. I’d forgotten about a lot of the AI, what are they, the tools like the Rabbit or whatever, the wearables. I had completely forgotten about all of that. And that was like the big thing. Yeah, exactly. Yeah, so I know it’s happening. I try not to attend to the hype cycles too much. Obviously, here we are talking about messaging.

Derek Collison (17:56)
Mm-hmm. Yeah. Came and

Kate Holterhoff (18:07)
These are foundational important parts of computing and yet the AI has thrown me into having to attend to things that happened for two weeks and then sort of flip away. Yeah, so it is.

Derek Collison (18:18)
Well, I mean, you can imagine a world, right, where you look at, how do we get all the training data, you know, where we don’t have to have these remote locations worrying about getting the training data to this model, right? We’ve had those use cases now for three years, four years with our customers. And then about two years ago, I shifted and I said, everything we talk about in terms of edge computing and…

location independence and M-to-N, and data being able to be both synchronously replicated and asynchronously replicated, digital twins, downsampled mirrors, sources. You know, we call them muxes, demuxes, sources. If you look at inference, right? Inference is going to again try to get closer and closer to you because people are just built like that. We want things faster and faster and faster. It’s just, you know, and people go, explain to me what you mean. I said, imagine paying in a restaurant with a credit card, you know, and it takes an hour to get processed.

And they look at me like I’ve lost my mind. It’s like, yeah, but I used to have to go into a bank to get money. The world keeps changing. But if you look at inference, there’s a lot of complexity in there. But you can also kind of simplify it, which I try to do so I can understand things. There’s prompt augmentation, which is you give me the prompt and I’m going to augment it with stuff. And I might need to access data sources through either RAG or, in our world, just like, hey, Kate, just tell me whenever something changes and I’ll hold onto it so I don’t have to keep pounding on you

when people are pounding on me asking questions. And so you can imagine that location independent, M-to-N, no load balancers, no tricks to just figure out where things are. Then of course there’s the, where’s the model loaded? What’s the most available GPU? What’s closest to me? This notion of kind of almost like an arbitrage of getting to the model. But more importantly now, 2025 will be the agentic AI type of thing, which just means you’re gonna be walking through multiple agents and multiple models.

And so you could imagine a world again where I’m in a vehicle and I’m asking, you know, an LLM, you know, OpenAI or Gemini or Anthropic’s models, for a plan. And that’s running somewhere in the cloud on big iron. But then all of a sudden it’s a geo model, and it says you need to talk to this, and that might be running on a cell tower. Then it says, talk to the vision model, and that’s running in the vehicle, right? So now you start to see where, I’m not saying it’s impossible, but it becomes a very

complex and brittle thing, which usually means security suffers, when all of these moving parts have to fit together. And again, once people build these systems, it’s kind of like the old databases. It’s like, don’t touch it. It’s like, hey, can I access the database? No, you know what I mean? They become very brittle. And I don’t think the world of these systems wants that. They want to be extremely dynamic and ebb and flow, both from where the services are, moving themselves around, to where data is moving itself around, just to adapt, right? These aren’t static systems. And so we have a bunch of very, very large AI customers already. It’s very quiet; people don’t know that we’re powering so much of this stuff on a global stage, but we’re learning a lot and it’s exciting. And we can tell that the initial constructs of what we believe messaging brings, again, just changing the how, not necessarily the what, are extremely applicable today, both with edge and then

AI inference is kind of a big one. Manufacturing as well. Manufacturing is going through a massive renaissance. And you were talking about Andy; I think about three years ago or so, we saw a lot of manufacturing customers coming on board, and this whole, you know, upper level, what I call information flow, but then they wanted the low-level stuff as well. And so we built direct MQTT connectivity into NATS. So you can just take an MQTT

app that’s running on a factory floor and just plug it straight in. So now you have all of the low-level factory data plus the high-level processing and decision-making data flowing in the same system. And most of our users and customers, after they’ve drank the Kool-Aid, if that’s the right word, kind of say, you’re like the central nervous system, right? It’s not like, oh, there’s the broker and I have to go to it, kind of like Kafka is today. Kafka is extremely powerful, but it’s kind of like old databases. There’s some big thing over there that needs a perimeter security model and it needs eight people to run it type stuff. We envision the world slightly differently. We do streaming for sure, but we envision it more as Lego-bricking all these servers together into topologies, and then it’s just transparent. Wherever you are in the world, it just works.
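The MQTT bridging Derek mentions rests on translating MQTT topics into NATS subjects. A small sketch of the simple cases of that documented mapping: the `/` level separator becomes `.`, the `+` single-level wildcard becomes `*`, and the `#` multi-level wildcard becomes `>`. (The real NATS server also encodes awkward characters such as `.` inside a topic level, which this sketch deliberately ignores.)

```python
def mqtt_topic_to_nats_subject(topic: str) -> str:
    """Translate the simple cases of an MQTT topic filter to a NATS subject:
    '/' -> '.', '+' -> '*', '#' -> '>'. Special-character encoding is omitted."""
    out = []
    for level in topic.split("/"):
        if level == "+":
            out.append("*")   # single-level wildcard
        elif level == "#":
            out.append(">")   # multi-level wildcard (must be last in MQTT)
        else:
            out.append(level)
    return ".".join(out)

print(mqtt_topic_to_nats_subject("factory/line1/+/temp"))  # factory.line1.*.temp
print(mqtt_topic_to_nats_subject("factory/#"))             # factory.>
```

This is why an MQTT app on the factory floor can "just plug straight in": its topics land in the same subject space the rest of the system already uses.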

Kate Holterhoff (22:42)
Right. It uses JetStream, is that correct for streaming?

Derek Collison (22:46)
JetStream is the persistence layer that we built in, yeah. Yeah, so NATS started out as a fire-and-forget, at-most-once system, which people always go, wow, that’s not very useful. But at least for me, I try to remind people that some of the most resilient systems I’ve ever built in my life, that are still running 30-some years later, were built on just at-most-once. You get yourself in trouble when you try to save and replay stuff. But that being said, that’s kind of a natural consequence of what

Kate Holterhoff (22:48)
I see, okay, all right.

You

Fair enough.

Derek Collison (23:11)
people want, right? But where JetStream took off for us was that we could think very differently about it because it was based on that connectivity layer. We had that for free, right? That location independence, M-to-N, you know, XYZ, no load balancers at all. And so we did three things with JetStream that I think were compelling.

One is synchronous and asynchronous replication are built in by default. So you can scale up an asset, you know what I mean? But you can also say, ooh, create a digital twin in a totally different cloud provider or in my factory, and have them auto-heal and sync, and they’re always up to date. And then the last, and by the way, the fact that the connectivity layer is a given for us, a demo we’ll give is, we’ll set up a KV

somewhere in Europe. Our dev rel guy, who is awesome, he still is awesome, would run a little program that would be putting keys into and getting keys out of this KV. And the KV, by the way, is materialized views we do over top of the core persistence technology. We do streaming like Kafka, but we also do object stores, so massive assets that are all flow-controlled M-to-N, you can put them in and pull them out, and then key values. And key values took off like gangbusters, which,

I’m really good friends with Salvatore, who wrote Redis, you know, back in the day. Matter of fact, with VMware, I brought Salvatore to the United States for the first time and I took him up to Twitter. So we got Twitter off of memcache to Redis on that trip. And I said, no one would want to replace Redis. And I misunderstood what the ecosystem was struggling with. And so in this demo, right, we put it as an R1, one replica, somewhere in Europe. And so it’s like 200 milliseconds

to get a value right out of it. And it’s just printing it out and it’s running on his phone, right? And he goes, I didn’t mean to put it in Amazon. I wanted to put it in Google. And in our world, that’s literally one command and it instantaneously moves over. So it essentially stretches itself over and then it collapses itself down. And the whole time the app’s just running and you can see it. And it’s like, I didn’t want it to be an R1. I want it to be an R3. And so we just scale up to an R3. Then we go into the asynchronous replication and go, we need to have a mirror, digital twin, on the West Coast.

And so we say, create a mirror in Amazon, even though the origin’s in Google in Europe, and all of a sudden his app just drops down to like 15 milliseconds or so response time, right? And it’s still running. Then he goes, now I want to create one in my house on a Raspberry Pi. So he creates another mirror, another digital twin, inside of his house. Next thing you know, it drops down to microseconds. And then we reverse it and we tear all of it down. And so the app never got restarted, never got reconfigured. And so that notion of

the world keeps changing and things are dynamic and you want to keep optimizing for reducing latency when you access a service or data, that’s kind of a core tenet of what we do. But when we show that demo, by the way, the demo end-to-end is only about four minutes. And so when people see that they’re like, holy smokes, you know. And so that time to value and things like that are a big thing for our customers. But again, you notice that we don’t say PubSub, we don’t say topics, channels, it’s kind of

what it enables, not necessarily trying to change your brain to think about the way you would architect the system, right? Which is, I need to talk to the Kate microservice and the Derek microservice, and we need this KV and this relational database or this object store type stuff. Those are kind of the fundamental building blocks still of these systems, and we don’t want to change that.
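The "materialized views over top of the core persistence technology" idea can be sketched in a few lines. This is a toy, not JetStream itself, and `Stream` and `materialize_kv` are invented names: a key-value store is just a last-write-wins view over an append-only log, so any mirror that replays the same log reconstructs the identical KV, which is what lets replicas appear and disappear without the app changing.

```python
class Stream:
    """Toy append-only log standing in for the persistence layer."""
    def __init__(self):
        self.msgs = []

    def publish(self, subject, data):
        self.msgs.append((subject, data))

def materialize_kv(stream, prefix="kv."):
    """Build a key-value view over the log: last write per key wins.
    A 'mirror' elsewhere is just another replay of the same log."""
    kv = {}
    for subject, data in stream.msgs:
        if subject.startswith(prefix):
            kv[subject[len(prefix):]] = data
    return kv

s = Stream()
s.publish("kv.color", "red")
s.publish("kv.color", "blue")   # overwrite: last write wins
s.publish("kv.size", "large")

print(materialize_kv(s))  # {'color': 'blue', 'size': 'large'}
```

Because the view is derived from the log rather than being the source of truth, moving or re-replicating it is cheap, which is the property the four-minute demo leans on.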

Kate Holterhoff (26:34)
All right, and that makes a lot of sense. And I love how our conversation is just sort of organically branching out in ways that I couldn’t even have anticipated. Although I knew we were gonna talk about AI, I assumed it would be towards the end of our conversation. I love that we got it in so early. So, you know, thank you for that. You anticipated me. But I do wanna turn back the dial of time a little bit and talk about how you sort of ended up here. So can we talk about TIBCO?

To begin with, I’m so glad that you talked about some of the acquisitions, I guess, that happened, because I was kind of unclear about the timeline. My understanding was TIBCO was founded in 1997, but on Twitter, you make it clear that you worked on TIBCO RV in the early 90s. So I was trying to figure that out, but it sounds like maybe you started working on it with your former employer that was then acquired. Is that accurate? Am I understanding that correctly?

Derek Collison (27:27)
So there was a famous investor out of Incline Village that funded Vivek, who was the CEO of TIBCO. And one of his rules was that every one of the companies he funded originally had to be called Teknekron. So the company that I joined originally was called Teknekron Software Systems. And they had a system called CI Server. And it was just barely starting to get into Wall Street. And they realized that they had something.

Kate Holterhoff (27:33)
Okay.

Derek Collison (27:51)
but that they wanted to kind of do a redo, a modern version of it. And so when I got originally hired into Teknekron, it was to port the CI Server to the NeXTSTEP box. I was a big NeXTSTEP person, Steve Jobs’ thing when he got kicked out of Apple. So when I did that, they go, ooh, you did that pretty quick. Maybe you want to sit with this team who are trying to think of the modern version. As we progressed on, what happened was Teknekron became TIBCO. So we didn’t get bought or whatever.

But then Vivek did a series of really interesting moves where we split the company and TIBCO Finance went to Reuters. Reuters bought that for all of the market feed data and market sheet and some of our projects there. And then TIBCO Software, which is the company that went public, maintained the low-level messaging stuff: Rendezvous. EMS hadn't been created yet; I designed that, I think, post the IPO in '98, '97 maybe. And so

Believe it or not, it wasn't that TIBCO bought Teknekron. Teknekron eventually morphed into it, and TIBCO just stands for The Information Bus Company. That's how we came up with it, and I did the original logo, believe it or not, which was kind of crazy. Well, the web was so new back then, right? I mean, the web started in '94, '95. There was an app that would show market data, it wasn't ours, that became really popular right before the internet took off, and we were powering that.

We've powered a bunch of really interesting stuff, but TIBCO came out of Teknekron and then it split. So he sold half of it to Reuters and then we took the other half public. And then of course, after years, came the big crash in 2000, which, you know, we were not spared; it took us out pretty bad. We'd gone public in '97, I think. Yeah. And so eventually it was sold, you know, went private. And then of course, now I think

They’re bundled up with, is it Citrix?

They're bundled with some company or whatever like that, the Silver Lake folks or the private equity folks and all. And what's another interesting story about TIBCO, at least in my opinion, was that we knew we had a great thing. And hindsight's always 20/20. I was an engineer, right? I didn't have meaningful opinions on pricing and go-to-market strategies. I just knew, ooh, we can solve that problem. So a lot of times they would bring me in on these big deals, like the FedEx deal, which was one of our largest. And I remember

Kate Holterhoff (29:50)
Okay.

Derek Collison (30:11)
living in Memphis for six weeks designing that system to scan all tracking data in FedEx. But I think TIBCO for a while got to a point where we should have adjusted to pricing pressure and we didn’t. And we saw Wall Street react. So they reacted in two different ways. One was the AMQP protocol. So that was specifically designed to hedge against TIBCO.

because we were all closed source, right? This was before open source was really a big deal. And then of course, JMS. And it sounds like you're thinking, why would you even put those two in the same bucket? But they were doing the exact same thing from a market perspective, just with very, very different approaches. JMS said the API is consistent and the implementations can all be different, and EMS passed all of the test suite stuff. And AMQP said the protocol is consistent, and then you can build any type of client, like Rabbit or whatever, on top of it.

But yeah, so AMQP exists because of TIBCO’s, I believe, lack of awareness around, we might want to lower prices. We just kept holding on to these crazy high prices, which markets change and we probably should have adjusted. But again, I was just an engineer, so they might come out of the woodwork and say, no, that’s not what happened. But I’m pretty sure that’s why the Wall Street people got together and created AMQP.

Kate Holterhoff (31:25)
Fair enough, yeah, and you’re again anticipating me, when we talk about your time at VMware and working on Cloud Foundry, I certainly want to talk about AMQP and Rabbit and all those things. But before we move on to that, can you talk to me at all about the rise of EAI? I mean, TIBCO was absolutely core to that movement. Where did this need arise from? I’m interested in both your perspective as a technologist, an engineer working there, but also from, an aspiring business perspective, right? I mean, now you’re in that position where you’re making these sort of pricing and go-to-market decisions.

Derek Collison (31:57)
Yeah, I think we had two things going on. One is that information was being spread out across different systems, and people were the things that connected the systems. And people are fallible and they're expensive. And so you started seeing this notion of, we should be able to automate getting data from SAP over to this system, blah, blah. And I think TIBCO kind of built

a lot of their early business on financial services, data dissemination, right? Stocks, you know, then we started doing settlements, then we did, you know, transactions, X, Y, Z. But as we got to that point, we also realized we were kind of building a little closed system. You could write code to integrate other stuff, but we said they shouldn’t have to write all this code. We should start integrating with all the other systems that we know are at Goldman and Lehman at the time before it went under.

And so, you know, hats off to the leadership at TIBCO looking at that and going, hey, we need to double down. And they did start doing a lot of acquisitions then, pulling in technologies that could allow this EAI type of explosion. And now it's kind of table stakes. But even with Synadia, you know, we've gotten to the point where internally I go, they're table stakes, but people don't value them. They value them, but not at a very high level, right? So in other words, they're not going to pay

$100,000 for a connector to S3, right? But they want a connector to S3 and they want a connector to BigQuery and they want a connector to XYZ. And so once we felt we had MVP from a Synadia standpoint, now we’ve started that process about a year ago. And so we have a bunch of stuff that’s about to come out there. And that’s what I consider the transition from Greenfield to Brownfield opportunities with users and customers, right? So we can meet them closer to where they’re at versus them having to, even though not change the what,

they would have to rewrite a lot of stuff to change the how, right? To get everything running and flowing on the Synadia tech stack or the NATS code base.

Kate Holterhoff (33:57)
And so you were at TIBCO for 10 years, no, 12, my goodness. Okay. You know, so your LinkedIn is informing me you left in 2003, and when you left your job title was SVP and Chief Architect. So you moved your way up. I'm just curious, what was it like working at TIBCO in the 90s?

Derek Collison (34:01)
12, you know, it was an amazing time. I was very young and naive from a non-engineering standpoint. And I believed in what we were doing, so I never sold a single share. So on paper, I went from looking really, really good to looking really, really bad. And at the time it wasn't just that you looked bad on paper; it was real, because they had weird tax rules, and it was bad right in 2000. But the IPO was amazing.

Being in New York so much, I was on the first flight back into New York after 9/11, because I needed to get back there. The energy level of the financial systems through the 90s was just bonkers. And it became a global thing. So we were Wall Street, but then obviously we did the pit, the Chicago Mercantile. But then, of course, it was London and Tokyo. And the next thing you know, it was a global phenomenon. And so we went from

T plus three settlement to T plus one while I was there. Now everything is pretty much real time. You know, everything's always running. And I think for me, it foreshadowed

how other parts of society were going to adapt, which is we're going to digitize everything, and the more data we can get, the better. And so believe it or not, even though financial services is its own little bubble for sure, there are so many lessons learned coming out of that. The other one, too, is that what Synadia does, and other companies do this as well, but what Synadia does is critical infrastructure. If our stuff isn't working, you're not going to go take a coffee break and maybe come back online in an hour

type stuff. It's you're paging me, you know? I mean, I get paged on every single production problem. And we learned that right at TIBCO. I remember one time Vivek walked into my office and he had a suit. And I'm looking at it, I'm like, why do you have a suit? He goes, because you're going to New York. I'm like, you know, I live real close to the airport. He goes, no time. And I'm like, it's that bad? He goes, it's that bad. And so literally he handed me the suit, I grabbed my

computer, and at the time with laptops you had to take like 18 batteries because they only lasted five minutes. Grabbed the CD with all the source code, because no one had the source code, right? There was no GitHub, there was no anything like that. So I grabbed the CD, burned a new one with whatever was at the top of our tree, we were using RCS at the time, barely using any version control, and then literally went straight to SFO and went. So that energy level, you know, the stakes being so high, and the fact that if things go down it's really bad, I think kind of shaped what drives me. So I love the fact that with Synadia's stuff, all of our customers, the ones that are public, but even more the ones that we can't talk about yet, we are critical; if we stop working, their whole business stops. And that's both exciting and terrifying at the same time.

Kate Holterhoff (36:53)
Wow, man, I love that story. That’s incredible. It’s hard, yeah, for someone who came up in the, I think 2018 was when I was a front-end engineer. That just certainly wasn’t my experience at all. So I love this. Can you talk to me about how the creative energy was as well? I mean, creating this messaging system was,

really a huge innovation. So I'm wondering, now we joke about having ping pong tables at tech firms, right? And making sure that the engineers feel like they're able to have the sort of work-life balance that they've come to expect, right? Was that part of it? Was it like Bell Labs? How did they spark your creativity and make sure that you were creating really innovative things that needed to be built?

Derek Collison (37:28)
Yep.

Yeah, I mean, I've reflected on that quite a bit. Obviously, I mean, I worked at Google, so I got the whole Google experience with Chef Charlie and the free meals and stuff. But TIBCO, we had an office, you know, on University Avenue in Palo Alto to start with. So that's kind of where I got introduced to California, just walking up and down University. And I still love it. I still get nostalgic when I go back there. I'll stay right in town and just walk back and forth, just because.

Kate Holterhoff (37:43)
yes, yes, yes.

Derek Collison (38:01)
But then we moved to Porter Avenue and we were right next to another startup called VMware, which was really funny at the time. But I think it was, we want people to innovate. You can fail, but fail fast and be accountable. And the people that I worked with inspired me so much, but it was also that they put so much trust in me.

That allowed me, you know, 'cause again, I was only 22 or so at the time, and I'm getting yelled at by the Lehman CTO on the trading floor. And then I'm up at the NASDAQ in Connecticut, you know, going to FedEx and designing their system. So I think it was the trust that they put in me, and then the people; it was a very small group that was doing the messaging stuff early on.

It got to be pretty big. And at one point I had 600-some people reporting to me or whatever. I'm not a good manager, by the way, so that didn't work out well. And I was the first one to admit it. I told them, I'm like, I'm not good at this. Give me a clean sheet of paper. Let me build something. Give me a small group of people. But at the time, it was four of us starting Rendezvous. It was led by Denny Page, who's still in my mind. He taught me so much.

But we ate lunch together every day. We'd go out somewhere. And as a group, I mean, we were just tight. And this was when you couldn't work at home, right? There was barely dial-up. You know what I mean? And so you had to be in the office, and you had to be in that kind of space. But it was also, even the new buildings at Porter were more of an old style where you had an office and you could close your door. And so as I progressed through my career, I started to try to figure out,

Kate Holterhoff (39:14)
Yeah.

Derek Collison (39:34)
when I was designing office space for Apcera, how do people work? And so I believe in in-person, shared-whiteboard collaboration when we're trying to do the design and we're still at a white sheet of paper. It's still hard to replace that. I mean, COVID led to an explosion of technologies that make it easier to do this remotely, like Zoom and all that stuff, shared whiteboards, Excalidraw, which a lot of our team uses.

But I still believe in getting in a room and co-designing. But then once you say, I know what we're going to do, then it's like, leave me alone, I'm going to go, and I want to shut my door or I want to put my head down. And I think the TIBCO space, maybe not by design, just kind of worked that way. Google did something very similar, but they were very methodical about it. And I picked up on it as soon as I joined, which is we had our own offices, but only the CEO of Google,

Eric, and the guy that created AltaVista had offices by themselves. Everyone else had to share, including Larry and Sergey. They shared an office, but all of us faced away from each other. So if we were heads down and we put our headphones on or whatever, you could work. But then as you were moving throughout the campus, especially on your floor, all the kitchenettes and all the bathrooms were centrally located. So you were always crossing paths with other people. And I also noticed they moved us like every six weeks,

Google did. And finally, I was like, what are you guys doing? And they're like, we do that on purpose, because you get around different people, it makes you think differently. The fastest way to change the way you think is to change your environment, right? Go for a walk, move somewhere or whatever. And so they would literally move us every six weeks. And so the next thing you know, I was next to, you know, the guys that created CockroachDB. And then one time I was next to the guy that created Google Docs, Writely, you know, we acquired his company.

And that was actually pretty smart, right? You think differently depending on who's around you and who you're interacting with. And so I believe deeply in that. But again, I think the ability to fail and the trust that they put in someone so young, that was a pretty big deal. And by the way, we did have, yeah, we had one outage that took me over two weeks to solve. I could not figure it out. And after about three days, they knew I was crispy already. So I would literally

Kate Holterhoff (41:33)
Wow.

Derek Collison (41:38)
walk in, get a cup of coffee, go to my office, shut the door, and then at nine o'clock at night, I would just leave and go home. I wouldn't do lunch. I mean, I was just so beside myself because they had put all that trust in me, and there was a bug, and it would only show up every once in a while. And it just took me almost two weeks to find. But I remember I probably sounded like a young schoolgirl screaming out when I finally found it. And so I ran up to Denny, my boss at the time, at his office. I said, I found it.

And he goes, show me. You know what I mean? And so I was showing him, and I remember that day. But he would check in, he knew I was working on it, he knew that's all I was working on, and he's just like, how are you doing? Do you need anything from me? You know what I mean? It wasn't like, what the hell are you doing, get this thing fixed, you know, X, Y, Z. And so that confidence and trust in me, I think, was the biggest dynamic.

Kate Holterhoff (42:29)
That’s amazing. And that’s precisely the sort of insight that I was looking for.

So I do think we’ve talked a little bit about Google. I would love to have you back on the show, maybe to talk about this, because again, I was a web developer, you were involved in some amazing projects there, but for the sake of trying to stick to our theme, why don’t we skip ahead to VMware? We’ll leave those five years for our next conversation.

Derek Collison (42:41)
yeah.

Kate Holterhoff (42:49)
Talk to me about your time at VMware. So, it was 2009 when you became the CTO and Chief Architect for cloud application platforms at VMware, where you co-founded the cloud computing group and designed and architected Cloud Foundry. Hit me with how this ties into your time at TIBCO, the messaging initiatives you were involved in, and what that was like.

Derek Collison (43:07)
Yeah, so every system I had designed was based on a messaging construct. So like the FedEx system, the Wall Street systems, transportation, logistics, the Intel fabs that we ran at TIBCO. They were all based on this. And the only time I really got away from that was at Google. Google had their own version of papering over all the things. So instead of DNS, they had BNS, which would give you both a host and a port. And they had something called Stubby,

which became gRPC way back in the day, and the original Protobufs type stuff. But the stuff that we did there, believe it or not, relatively speaking, was the second most popular service for a while. Now, of course, we had just bought YouTube, and so YouTube obviously would kill it, and Google Search was way up here, but we were doing between 60 and 100,000 queries a second.

But it was a very small team. It was only really three of us and then maybe a PM; our original PM was Bret Taylor, which worked out really well for us. He was a young, fresh-out-of-Stanford graduate that Marissa said, hey, you need to meet this kid, bring him on. And I could quickly tell that he was one of the smartest people I ever met. Anyway, we quickly realized, and I don't know how long I was at Google, seven, eight years, something like that, that we weren't

going to be able to easily get out from underneath of this project because it was so popular. It was the beginning of developer APIs; we called it the AJAX APIs. Then I created the JavaScript CDN, which I think they just shut down or something, but for a while that was really popular. And so Paul Maritz, who in inner circles is mostly known as the one that really created the Windows dominance at Microsoft. At Google, I worked with Mark Lucovsky, who worked for him at Microsoft, and

Paul was asked to take over for Diane Greene at VMware. So Mark came to me one day and he goes, you know, we're not going to be able to get out from underneath of this. My buddy just took over at VMware. We should go and talk to him. And I said, why would I ever go to VMware? It's like a virtualization company. I don't want to do any of that. But we got to sit down with Paul and Tod Nielsen, the COO, who was also an ex-Microsoft guy. And I said, well, what would you want me to do? And they go, something cool that's not

about hypervisors or virtualization. Just do something cool. And while at Google, I was noticing the birth of, like, Ruby on Rails and all this stuff, where it was so easy to kind of play around and do these things on your laptop. But when you had to deploy to production, it was extremely painful. And so even at Google, I was starting to become a really big fan of Heroku, which was, you know, the early thing that was kind of the PaaS, right? Platform as a service. And so Paul was like, join us, just do something cool.

So I had just a clean sheet of paper when I joined, but after about a week, I proposed to him and Tod and Steve Herrod, the CTO. They were the trifecta that I always talked with. I said, I want to do Heroku for the enterprise. That was it. That was my pitch. And they go, great, how long do you need? And I said, I think I can have something workable in three months. And they go, great, knock yourself out. And so that's how it started. I just wanted to do Heroku for the enterprise. And at one point in time, we actually tried to buy Heroku, because they said,

Great, it's you and another guy from Google, Vadim, who's awesome, Vadim Spivak, one of the best engineers I've ever met, and he and I tag-teamed a lot at Google; he came over to VMware with myself and Mark as well. And Paul and Tod go, yeah, but it's only, like, you two, plus we've got four other people around you now; how do we go faster? And I go, if you want to go faster, buy Heroku. And we had just signed a deal with Salesforce called VMforce, which was Cloud Foundry powering this PaaS for Salesforce.

And we did the launch announcement and we did all kinds of stuff, and the next thing you know, we're trying to potentially buy Heroku and it stalls, you know. They had asked me, what would you pay? And I told them, and then we were well past that already. And then they went quiet. And then about a week later, they came out and said, Salesforce just bought us. And then we got the call that VMforce was dead, you know. But I think PCF, Pivotal Cloud Foundry, is still a big moneymaker for VMware. At least that's what I heard.

Kate Holterhoff (47:01)
Ha

Derek Collison (47:10)
But that's where NATS was created, around that. 'Cause I said, ooh, I have to figure out how to build a system like I know how to build. I can't use TIBCO stuff. So I looked at the open source stuff, and it felt like RabbitMQ was the thing. And so I said, great, I'll just use RabbitMQ. And so I brought Alexis Richardson in; we bought his company. Alexis was kind of the head of the company that was driving Rabbit. But at a certain point,

I could tell that Rabbit was an enterprise messaging system, more like a pet than cattle. You have to care for it and feed it. And if you didn’t use it right, it would keep trying to do what you asked it to do, and then would just lock up or whatever. And so about two weeks out from launch, we had a couple of issues. And it could have been me.

I like simple things, and because of the AMQP thing there's a lot of complexity that kind of percolates out into AMQP clients. So it could have been me, but I also realized I wanted something that was more like a utility that protected itself at all costs. What I mean by that is, let's say you and I live in the same building. I shouldn't be able to plug in a really bad blender and not only blow out all the electric in my unit, but take you out with it as well.

And so a lot of these applications, and I put Rabbit in there, it's not necessarily a fault. But when you look at these massively distributed systems with dynamic workloads, I need something that says, no, no, you're trying to do something wrong. I'll penalize you, but I'm not going to penalize Kate, right? And so I kept saying that in my head, and I'm like, it seems that if I ask it to do something, it's going to try no matter what to do it, even to the detriment of everything else that's trying to use it at the time. And so, you know,

for the audience, what I'm talking about in these systems, again, it's not necessarily changing the what, just the how, but it's telemetry data, eventing, command and control. It's the normal stuff, right? We just use a messaging system to do all that versus, you know, disparate different types of technologies. And so we had a lockup, you know, and I believe it was a Friday, it was a Friday morning. And it was like the third one. And again, it could have been

me, you know what I mean? But I said, enough's enough. And I just stormed out and I left, and Vadim's like, where are you going? I said, I'll be back on Monday. I said, I'm tired of this, you know, blah, blah, blah. I was a lot younger then, you know what I mean? It takes a lot to get me upset, but I was bugged. And so I'm like, I know what I want the system to do, the messaging system. I know what I don't want it to do. And I know that I want it to protect itself at all costs, meaning I can't do something silly and take it out to the detriment of you.

And so we were building Cloud Foundry in Ruby, remember the Ruby on Rails thing. And so I said, I think I can build this in a weekend. And everyone says that, and you usually can't, but I did. So I built it in a weekend, came in on Monday, ripped RabbitMQ out, put NATS in. That freaked out a lot of people. So I got called to the principal's office for that one. I also made them make it MIT. So it was the first open source, Cloud Foundry was also going to be the first open source project

from VMware, that was a big, huge deal. But technically, NATS was first. And I made them open source that one too with MIT. And they fought me. They didn’t think it was a good decision. And Alexis was upset at me, but we’re very close now, again. But yeah, and so it took off. then when I left VMware, I wanted to create a better version of Cloud Foundry, one that had more security, governance, could stretch more, do some other things.

Kate Holterhoff (50:11)
I didn’t realize.

Derek Collison (50:33)
And then the decision-making process for that company was, do we want to go with something other than Ruby? Although I love Ruby and I love NATS, you know, Ruby is just not a good deployment language in my opinion. Do we want to go with Node.js, or do we want to go with Go, this brand new language coming out of Google? And so we picked Go. And so the modern version of NATS is all Go.

Kate Holterhoff (50:56)
That is exactly the history that I was interested in hearing about. I didn’t realize that you had built that first version of NATS and it was in one weekend. I mean, that’s incredible. Yeah.

Derek Collison (51:05)
Well, I went home and I was really ticked off. And so I sat down, and I've done these before, obviously. So the first thing I did was put the connection logic together. Then I did the subject router, right? You know, how you route things and things like that. And in Ruby, I was pretty proficient; we were doing a lot of code in it at the time. Traditionally, I'm a C programmer. I don't think I can program in C anymore. I'm too old. I don't have enough time left.

But yeah, it was not as hard as you'd think. And again, the only thing it did was route messages. It did what's called distributed queuing, which is that load balancer replacement. So there can be 100 Kates, and I'm going to randomly select one of you. The modern version of NATS is also geo-aware. So the very first version of NATS at VMware was a single server. I didn't design any clustering or security or any of that stuff. But the modern version, because you can

Lego-brick these servers together in any topology you want, the distributed queuing, which says, hey, I want to ask a question of Kate, and there's a thousand of them, just find the best one for me, is geo-aware in the modern version. So if there's one that's very close to me, I'll prefer that one. And if there's not, I just keep watching, and I know internally how far away all of the other servers are; your app doesn't have to worry about this, the NATS system and Synadia's tech handle all of it. And so it had that, and it had a circuit breaker pattern. So the other thing that I cared deeply about was

asking a question of an unknown set of respondents when I only want the best answer. And you can qualify what the best answer is, but in this case, it was who was the first to respond. So you can imagine a world with traditional messaging technologies where you would run this thing and send out a request and you'd get a response back, but you're like, why did my CPU spike? Why did my memory spike? And it's because all 10,000 responses were coming at you and you just had to throw them away, right? And so the other

thing that the original version of NATS had was, I want to ask a question and I only want one response, and it would circuit-break all through the system. So as soon as one response was flowing towards me, it chopped all the other ones off. And so my client application didn't have to process and throw away all of these responses. And that's all it had, the original one: it was just fire-and-forget, obviously subject-based routing, the normal wildcard type stuff, and slow consumers. If you're a slow consumer, it's kind of like plugging the blender into the electric socket.

We just chop you off. We just cut you off. And we're like, nope, you're not a good citizen. And then the DQ. And that was it. That's all that existed. And we could build all of Cloud Foundry with that, including a really interesting scheduling algorithm that wasn't a centralized command-and-control thing. It was any number of people asking for help to run jobs in these DEAs, droplet execution agents is what we called them. Me and Vadim named those at like 2 AM. Never let engineers name stuff. It's so bad.
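The two primitives Derek describes, distributed queuing (many identical responders, one gets picked) and the first-response-wins request with a circuit breaker, can be sketched in a few lines. This is a toy illustration of the semantics only, not how NATS implements them; the names `request_one` and `queue_pick` are invented for this sketch.

```python
import queue
import random
import threading

def queue_pick(responders):
    """Distributed queuing in miniature: many identical workers,
    one chosen at random to handle the request (the load balancer
    replacement Derek mentions)."""
    return random.choice(responders)

def request_one(responders, msg, timeout=2.0):
    """First-response-wins: fan a request out to every responder,
    deliver only the earliest reply. The caller never has to receive
    and discard the other 9,999 responses."""
    replies = queue.Queue()
    for respond in responders:
        # Each responder answers concurrently; only the first reply is read.
        threading.Thread(target=lambda r=respond: replies.put(r(msg)),
                         daemon=True).start()
    return replies.get(timeout=timeout)  # first reply wins; rest are ignored
```

In real NATS the "ignore the rest" part happens inside the server fabric, so surplus replies never even reach the client; here they simply sit unread in the queue, which is the part a server-side circuit breaker improves on.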

Kate Holterhoff (53:45)
haha

Derek Collison (53:47)
And so yeah, that's where it is. Now, of course, the modern version of NATS has a lot of stuff. And of course, JetStream is a massive component that draws a lot of attention. But yeah, that's how it all started. And it's still on GitHub. So the very first commits are still in the same history buffer, so you can go back and look at the Ruby one. I used regular expressions for the subject router, which is goofball because it's slower than Christmas to do it that way. But I had to have something working.
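The subject router Derek describes matches dot-separated subjects against patterns with two wildcards: `*` for exactly one token and `>` for one or more trailing tokens. A minimal token-by-token sketch of that matching rule (my illustration, not the NATS server's code, and without the regex shortcut the first Ruby version used):

```python
def subject_matches(pattern, subject):
    """NATS-style subject matching sketch: subjects are dot-separated
    tokens; '*' matches exactly one token, '>' matches one or more
    trailing tokens."""
    pt, st = pattern.split("."), subject.split(".")
    for i, tok in enumerate(pt):
        if tok == ">":
            return len(st) > i          # '>' needs at least one more token
        if i >= len(st):
            return False                # subject ran out of tokens
        if tok != "*" and tok != st[i]:
            return False                # literal token mismatch
    return len(pt) == len(st)           # no leftover subject tokens
```

So `subject_matches("orders.*", "orders.new")` holds, while `"orders.*"` does not match the bare subject `"orders"`, and `"orders.>"` matches any subject at least one token deeper.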

Kate Holterhoff (53:54)
Yeah.

That's tremendous. I mean, speaking of naming, so you walk into the office on Monday, you've got your GitHub repo of NATS. Was it named NATS from the beginning? And on the record, what does NATS stand for?

Derek Collison (54:25)
Yeah, so the marketing team has one version of it, but I think it's already been put out there. I think it was on Hacker News that someone posted it. It stood for “Not Another TIBCO Server.” Yeah.

Kate Holterhoff (54:28)
Ahhhh. wonderful. Okay, yes, I alluded to it in my post (“Why Message Queues Endure: A History“), but I didn’t know where you stood on this. So thank you for clarifying.

Derek Collison (54:44)
No, I'm being honest. That's exactly what it stood for, but the marketing team, I remember, going, crap, we can't say that. Let's call it a “Neural Autonomic Transport System” or something like that. So you'll see that. But if I'm being honest, on my way home, I'm driving home and I'm agitated, and I'm like, I don't want to build another TIBCO server. I don't want to build another TIBCO server. And so by the time I got home, I already had the name. Naming and cache invalidation are the hardest things in computer science. So I already had the name.

Kate Holterhoff (54:49)
That’s it, yep.

I love that. So the name came first. OK, so let’s dig into NATS a little bit. Because I’m interested in the open source story as well. So what is your position on open source? You mentioned the MIT license. Open protocols are such a big part of messaging from the beginning. What is your philosophy around open source?

Derek Collison (55:28)
Wow, we probably need another two hours, you know. Yeah.

Kate Holterhoff (55:30)
I know, I know. If you can’t condense it, we can put a pin in it, but I am curious.

Derek Collison (55:36)
I have a lot of opinions on open source. I think in general, it's a massive net positive. And I went from closed source to open source, then through all of the licenses in open source, from GPL and LGPL to MIT at the other end of the spectrum and everything in between. I've been through foundations; probably an unpopular opinion, but I don't know if the story's been

Kate Holterhoff (55:41)
Okay.

Derek Collison (56:00)
fully written on whether foundations are a net positive or not. That's my personal opinion. I believe that OSS is at a very critical time right now. And what I mean by that is that a lot of it, not all, but a lot of innovation still comes from very small groups, either inside larger companies or

Kate Holterhoff (56:05)
Okay.

Derek Collison (56:22)
startups that are trying to do something with open source and trying to make a business out of it. And I think that the ecosystem is not aligned with that. I think what the ecosystem is aligned with is massive initiatives by the hyperscalers that drive a consumer bias to zero. And if you drive the consumer bias to zero, meaning that the consumer thinks that they’re entitled to it being free, that they never have to pay, that causes a lot of pain. Charity, by the way, is not a business model,

in my opinion. And so what we’re seeing is that when we started Synadia, I said, well, Edge is going to be a thing. And I made a tweet years ago that I thought it would dwarf cloud by orders of magnitude very, very quickly. And I still believe that. I also believed that since we are so critical, any company that was running us in production would want to have a commercial relationship with us and have support.

What I didn’t foresee is, and again, if you’re like a little startup and you don’t have enough money to pay, but you’re trying to give back in other ways, I’m fine with all that, right? But in terms of Synadia as a commercially viable entity, you know, I’m like, hey, if you use it and you value it, you know, you should have a commercial agreement with us. And I remember starting the company, what, seven years ago, almost eight now, right, the end of December of 2017, and saying, yeah, we’re so critical, they’ll have to have, you know, a relationship with us.

And what I’ve seen over the last five years is that trend is reversing. So Fortune 100 companies go, if it’s open source, we’re not paying you squat. And what happens there, and this is where I call it the big incentive misalignment, right, is they will call us if there’s a problem.

On our side, we’re trying to make software that doesn’t have problems and trying to make perfect docs and reference architectures. So our incentive is to do this, but at the detriment of us as a commercial entity, right? Because these big companies, and you’ll know all these companies’ names, you know, they just say, nope, we’re not paying you. But if we made it harder, right, to deploy our stuff, or we had bugs, or there were tricks where you needed to talk to us, or, you know, X, Y, Z, or the docs weren’t clear enough about how to do something.

That’s an incentive misalignment where people now will pay you to solve that problem. And that’s what I think is correct. And so in a nutshell, I do believe it’s a net positive across the board. I think not always, but mostly it’s hyperscalers that don’t need to make money off of this that use it for other reasons, like Kubernetes greasing the skids to allow workload transportability into Google Cloud from Amazon or Azure.

They had a side effect of driving consumer biases here. I mean, to date, yeah, we’ve probably invested

40 plus million dollars into the NATS technology. And we have people on GitHub that, they’re not trying to do it, but you can tell they’re so entitled. They’re like, you need to do this for me. And it’s like, what do you mean I need to do this for you? And so now I just say, we love our ecosystem, and we do, and we want to help out. Are you a Synadia customer? And they always go, no, why? And I go, because we prioritize people that pay us and want to ensure our survival,

meaning the project’s survival and the tech’s survival. And so I think it’s kind of not at a crossroads yet, but I bet you it’s becoming pretty close, and you’re seeing a lot more companies switch licenses or do other things to try to say, hey, you should pay us. Even if it’s only for a certain period of time and then everything reverts back to totally open source. And so I don’t know

where the path is going, but I know that I don’t feel good saying if we make crappier software, it’ll help us as a business entity. That doesn’t feel good. So we’ve got to figure that one out. And there’s a lot of companies that, you know, are just awesome. You know, they’re like, absolutely, we’re going to have a commercial agreement with you. But there’s an alarmingly increasing number of Fortune, you know, 500, Global 1000 companies that go, oh, yeah, our policy is we just don’t pay if it is open source.

Kate Holterhoff (1:00:17)
Right. And I think maybe it would be worth stating explicitly, what is the relationship between NATS and Synadia?

Derek Collison (1:00:24)
So Synadia houses almost all the maintainers, right? And so.

I would say 99% of the server code base is all from the people that were at VMware, but mostly me. I think I’m 98% of everything, from VMware, then Apcera, then Synadia. So we’re the main maintainers, right? And we wanted to make a commercially viable, you know, business out of intelligent data flow and intelligent connectivity for these modern distributed systems that cross geos and cloud providers out to the edge. That was kind of our initial game plan seven or eight years ago when we started.

Kate Holterhoff (1:01:01)
All right, and so, in what year, would that have been 2009, when you created NATS? Was it that first year? 2010, okay, so the year after that.

Derek Collison (1:01:09)
2010, I think, is when I first created it. I think.

I’d have to look up the GitHub, but it’s on there, it’s timestamped on there. It’s October, I think, 2010.

Kate Holterhoff (1:01:14)
That’s fine.

Okay, and so for that one you knew that you wanted an MIT license. Is it still an MIT license?

Derek Collison (1:01:23)
It’s Apache 2 because it’s part of the CNCF. One of the rules of becoming part of the CNCF is that it’s Apache 2, which is still very liberal. It has a little bit of protection, mostly for the project maintainers, which is good. So yeah, it’s Apache 2 right now.

Kate Holterhoff (1:01:24)
very good.

Okay. And so did you consider creating a NATS foundation? How did you get hooked up with the CNCF?

Derek Collison (1:01:42)
I was part of the original founding governing board for the CNCF, the original kickoff meeting at the Las Vegas data center. Yeah, I was there, and I believed in what they were trying to do. But again, it was very early. The only other foundation was kind of the Apache one, and, you know, the Linux Foundation was there, and they were trying to create this cloud native compute foundation, obviously. And I like this notion of

Kate Holterhoff (1:01:46)
Wow.

Derek Collison (1:02:07)
two things. So at TIBCO, we had to do a lot of awareness and a lot of education. So we spent a lot of time educating customers on what problems they had, and the marketing stuff. And so when we started Synadia, I was like, I don’t want to have to swim upstream. And so our bet was that Edge was going to force people into going, crap, that stuff that worked in the cloud, it’s not working over here, we have to look for alternative ways to accomplish what we’re doing.

And then the CNCF, it was, this could help us with our awareness issue. So we don’t have to spend all these marketing dollars to try to get the awareness out. And early on, it did that quite well. Where I think the CNCF has gone, I don’t, and again, this is just my opinion, but I don’t know if there’s that crystal clear, you know, clear as a bell

advantage as a user looking at a CNCF landscape. I don’t think you look at that and go, I know exactly what to do, here’s the problem I have. Yeah, it’s like an eye chart. You’re just looking at this thing. I’m like, how’s the CNCF serving the end users, and then how’s it serving the projects? And within the project realm, I think it serves projects that have lots of big companies participating well. I don’t know if the match is as well suited

yet for companies that are trying to make a business out of the open source project, right, and they have most of the maintainers, right? Because the CNCF has slowly morphed into, you have to have lots of people contributing to a project for it to be viable, to be graduated. And, you know, I don’t think that’s applicable for all projects. I’m not saying it’s not applicable to some, but I don’t think it’s a blanket rule. And I remember being challenged a couple of years ago within one of the meetings, and they said, well, you know,

you have to take our word for it, this is better. I’m like, that’s totally fair. But I said, remember, I’m kind of an old dude. And I said, I have projects now that are still running: one over 30 years, there’s one over 20, and there’s two over 10. So I think I kind of know how to create something that lasts a long time. Because they’re like, hey, how do we know that it’s going to last so that the users can bet on it? And I totally get that. That’s totally legit. But I think the criteria, the way they’re judging it as a blanket rule, might not be the correct way for all projects, if that makes sense.

Kate Holterhoff (1:04:22)
Yeah, absolutely, fair enough. So was this the first foundation that you were involved with?

Derek Collison (1:04:28)
Well, JMS kind of had its own bunch of companies that were ratifying it, and then Microsoft pulled me into that WS-* debacle back in the day, which also had some sort of, like, makeshift, I mean, early version, so not as much like the Apache stuff, but it did have, like, lots of companies, and, you know, the more the merrier type stuff. But officially, yeah, I’ve seen some things. Yeah, yeah. And again, it might be, you know,

Kate Holterhoff (1:04:48)
Right, right. So you’ve seen some things. Yeah, okay.

Derek Collison (1:04:55)
most people in the audience might disagree, but I don’t know if we really understand if foundations drive value into the ecosystem, right? And again, I’m not saying it doesn’t, I’m just saying for me personally, I don’t know the answer to that yet. It’s still kind of a work in progress to me.

Kate Holterhoff (1:05:14)
Well, I think it’s such an important conversation to have not only just for the tech industry writ large, but also when we talk about messaging because open protocols are such an important part of that story, especially when you look at the financial industry and how everybody was trying to make sure that they could communicate across these different systems and everyone kind of had to play nice together.

And so I think that there’s a story to tell about collaboration and making sure that, you know, sure, we’re competing, we’re all trying to, you know, make sure we get the biggest slice of this pie, but at the same time, it’s about communication. That’s what messaging is, right? We gotta communicate.

Derek Collison (1:05:39)
Yeah.

It definitely is, although again, I’m not sure that all the protocol stuff actually bears out what people think it does. Now, HTTP does, and I agree with communication, but you can imagine a world where, let’s say there’s a NATS system, right? And by the way, I’ve modeled the protocol based on Redis.

Kate Holterhoff (1:05:55)
Ooh.

Derek Collison (1:06:06)
I had already decided to pull Redis into Cloud Foundry as one of the things it was gonna have, and XYZ. And so I was talking to Salvatore Sanfilippo a lot, and I was like, I’ve only done binary protocols, and he’s like, yeah, the text stuff. And I’m like, I can probably make it super fast. And I said, why did you do it that way? And he said, so that it would be easy for people to write clients in any language they wanted to. I was like, that’s smart. He goes, otherwise we have to write all of them. But if you make the protocol really simple, XYZ. And obviously Redis’s protocol is not formalized, but

Almost every WAF and firewall understands it. It’s incredibly popular. And so if you have a world where it’s easily accessible, but then the last mile endpoints can be HTTP, MQTT. I probably won’t do AMQP, just for historical reasons. It’s a very complicated protocol, unnecessarily so in my opinion. I think that

it could help, but I have no desire to try to go to OASIS or something like that and try to standardize the NATS protocol. We’ve standardized the ports, so they’re on listings and stuff, and we care deeply that the machinery that a lot of security groups use is aware of our protocol. But it’s a text-based protocol, so it’s literally like an hour’s worth of work. I mean, it’s drop dead simple to build a protocol-aware version of a security device that understands NATS. So it’s actually even,

it’s as simple, I think, as HTTP is. It’s actually simpler than HTTP as a protocol. And it’s just text-based. So you can actually telnet to a server and go back and forth, and you can see how the whole thing works. And so one of our early demos was, if people who cared wanted to see how it works underneath the covers, like watch the protocol, you can actually just telnet and pipe to a server and see what it’s doing. Yeah. And I did go through OASIS a couple of times.
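For readers curious about what that telnet demo looks like, the published NATS client protocol really is a handful of line-oriented text verbs (PING, PONG, SUB, PUB, and so on). The sketch below is illustrative only, not Synadia code, and covers just a sliver of the real grammar; it is meant to show why writing a protocol-aware tool is, as Derek says, about an hour of work.

```python
# The NATS wire protocol is line-oriented text. A raw session against a
# server (e.g. the public demo server demo.nats.io on port 4222) can be
# driven by hand over telnet:
#   SUB greetings 1
#   PUB greetings 5
#   hello
#
# A toy parser for a few client control lines (payloads not handled):
def parse_op(line: bytes):
    """Split one NATS control line into (verb, args)."""
    parts = line.strip().split()
    if not parts:
        return None
    verb = parts[0].upper().decode()
    if verb in ("PING", "PONG"):   # heartbeat verbs take no arguments
        return (verb, [])
    if verb == "SUB":              # SUB <subject> [queue group] <sid>
        return (verb, [p.decode() for p in parts[1:]])
    if verb == "PUB":              # PUB <subject> [reply-to] <#bytes>
        return (verb, [p.decode() for p in parts[1:]])
    raise ValueError(f"unrecognized verb: {verb}")

print(parse_op(b"SUB greetings 1\r\n"))  # ('SUB', ['greetings', '1'])
```

Because every operation is a delimited text line like this, a firewall or proxy can be made NATS-aware with very little code, which is the point Derek is making.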

Kate Holterhoff (1:07:27)
Huh. Okay. Yeah.

All right. Yeah, no, I like it. These.

Derek Collison (1:07:49)
The W3C, and yeah, they’re painful. They definitely have their place, don’t get me wrong. But I don’t think us not being standardized on something hurts us. And people said, why didn’t you standardize on HTTP? I’m like, we were trying to do everything different than HTTP, when HTTP was just one-to-one synchronous request reply. Because everyone’s like, gRPC is based on HTTP and HTTP/3. And I’m like, yeah, we predated all of that stuff.

Kate Holterhoff (1:08:15)
All right, now this is great. I love when these conversations become a little spicy. It’s good, the more opinions the better. You know, we’re not here to be wishy washy. This is it.

Derek Collison (1:08:21)
Well, opinions

are interesting, right? Because they’re free, meaning you can throw it away. You don’t have to take what I say and you can just discard it, which is good.

Kate Holterhoff (1:08:28)
Yeah, that’s amazing. Yeah, no, thank you. Okay, so I got one last question that I think we will wrap up here. So I think arguably we’re in what many would consider the Kafka era of streaming, right? So what is your take on the future of streaming? Do you feel like we truly are in the Kafka era? How is this sort of molding where messaging is gonna be going in the next five to 10 years?

Derek Collison (1:08:54)
I think streaming and event sourcing are incredibly powerful constructs, right? And I think Kafka is the de facto 800-pound gorilla. Where I think a lot of our customers come to us is, it’s this big thing, like the old databases, like the Oracle databases. It sits in some data center in one region, and it takes a whole bunch of people and a whole bunch of money to run it. And also I think that event sourcing and streaming are becoming

a more critical component to architectures, but they’re not the only one. And so what I’ve experienced in my career is, we go through waves. And I think I’m on my fourth one. And the wave goes something like this. When it’s on its upswing, it’s, I want more options, I want more tools in the toolbox, give me more stuff, more stuff, more stuff. And then we reach a saturation point, right, as an ecosystem. And then we immediately go into a deep dive of,

this is too complex, there’s too many moving parts, I want to reduce the surface area. We’re definitely in a downswing. And so a lot of customers are coming to us and they’re like, we’ve got HTTP, we’ve got service mesh and API gateways and gRPC, and then we’ve got Kafka, and then we’ve got Redis, and then we’ve got a relational database, let’s say Cockroach or whatever. And they’re like, how do we reduce this complexity and be able to take this thing and then move it so it runs inside of a vehicle, or runs in a cafe, or runs in a distribution center or a factory?

That’s kind of where we hit our stride. So we definitely do streaming. I don’t know if we’re going to be doing exabytes like Kafka does today, although we have plans for that. But for very high speed, real time streaming, eventing, plus work queues, plus key value stores, plus object stores, plus normal request reply, and you shouldn’t be storing stuff in memory or on disk for request reply. It’s kind of like when Kafka

did a push and said we’re good for microservices. And I said, that’s like Google writing all their log lines before they return your search result. That just will never happen. I think that the world is a lot bigger. And if there’s a tech stack that doesn’t fulfill everything, but kind of gives you that reduction in surface area, and you can check off a lot of things with one tech stack, I think that’s kind of the future. I think Kafka will continue to

probably push into adjacent markets, because they’re Confluent, at least, because they’re a public company. And by the way, I hope I have their level of success, don’t get me wrong. But I think they’re going to struggle as they keep trying different adjacent markets where they’re not built for that. They’re purposely built for one very, very specific use case. And they do it really, really well, but modern systems have, you know, lots of different moving parts. And so what we’re trying to capture is the edge, the far edge.

The majority of interactions will be out there. The rules are very different than in the cloud. Perimeter-based security models don’t really exist out there, and so we don’t depend on those. We don’t depend on DNS. You can run these things, just say, run it. You don’t have to have, like, a PhD in how to run a Kafka broker cluster type stuff. So we like our positioning, but we also work really well with Kafka. So we’ll do, like, a lot of the edge locations and funnel in, XYZ.

We do have some customers that, after a couple years, they get so comfortable with our tech that they replace everything with us. But we still have some customers that say, no, we want you to interoperate with this big Kafka cluster running in Oregon or something like that. And of course, that’s what we’ll do.

Kate Holterhoff (1:12:12)
Wonderful, and I think that’s a really good final topic for us to deal with, because yeah, I think that the world is bigger than Kafka. They take up a lot of oxygen in the room, but there’s still a lot out there, and the story of messaging is still evolving and being written. Before, absolutely. So before we go, for folks who are interested in keeping up with more of your thoughts on all of these subjects, we’ve mentioned that you participate on Twitter, or X if you prefer.

Derek Collison (1:12:27)
It is. It is.

Kate Holterhoff (1:12:40)
Are there other social channels that you participate in?

Derek Collison (1:12:43)
Just LinkedIn, and, you know, again, I’m old enough that I just use my name, so it’s just Derek Collison. I don’t have a fancy name, so it’s not hard to find me. If you want to find me, it’s really easy. I try to be accessible. We have an open Slack for NATS, which is, I think, I don’t know how many, 10,000 or more people, and then you could just email me: Derek@NATS.io or Derek@synadia.com, and I try to respond to everybody that emails, so.

Be happy to hear from anybody, especially with questions and stuff like that. And Kate, thank you for having me on. I really appreciate it. I don’t know if I lived up to any of the Forrest Gump type of stuff, but I think a lot of us in this ecosystem have a lot of really interesting stories, including your previous guests. It’s just some amazing stuff, you know, going through. And there’s even more stories over a cup of coffee or a glass of wine or something type deals

Kate Holterhoff (1:13:17)
No, absolutely.

yeah.

Derek Collison (1:13:36)
that I’m sure they have and I have that are just bonkers. But yeah, really, really cool.

Kate Holterhoff (1:13:40)
Yeah, yeah, I’ve heard a few. But yeah, I appreciate that. And thank you so much for coming on. Again, my name is Kate Holterhoff, senior analyst at RedMonk. If you enjoyed this conversation, please like, subscribe, and review the MonkCast on your podcast platform of choice. If you’re watching us on RedMonk’s YouTube channel, please like, subscribe, and engage with us in the comments.
