The term “platform engineering” is all the rage right now, but for many organizations the road to building their platform capabilities is years in the making. Join Rachel Stephens as she interviews Johan Marais, Senior Platform Services Manager at Discovery Limited. Hear about Discovery’s path from a traditional IT org, to building their own platform, to finding their way to VMware Tanzu. The discussion of both the tooling and the culture changes required is informative for anyone trying to build out platform capacity in their company.
This was a RedMonk video, sponsored by VMware Tanzu by Broadcom.
Transcript
Rachel Stephens: Hi everyone, it’s Rachel Stephens at RedMonk with another RedMonk Conversation. I am really excited about this one. This is a customer of VMware’s, Discovery Limited. And the story behind me getting into this is very exciting. So the person we have with us today is Johan Marais, and he runs the platform team at Discovery Limited. And I got to hear him at VMware Explore. He was on stage in the keynote with Betty Junod, who’s one of my very favorite people. And then I got to have a lunch and learn session with Johan later, which was also wonderful, but it was like, lunch and you can’t really take notes during lunch itself and it’s never a fun thing. And so I’m now here to ask even more questions of Johan, learn more about what they’re doing because their story is really cool. So let’s go ahead and have you introduce yourself Johan, tell us about what you’re doing at Discovery, a little bit about who Discovery is, and then we’ll dive into questions.
Johan Marais: Thanks, Rachel. And firstly, thank you very much for having me tonight. So yes, I’m Johan Marais. I’m a Senior Platform Services Manager at Discovery. A little bit about Discovery. Discovery is a financial services organization based out of South Africa that started just over 30 years ago in the financial healthcare space and with one core thing in mind: to make people healthier and enhance and protect their lives. And 30 years later, the company has grown into multiple financial organizations from short-term insurance, life insurance, investment. More recently they opened a bank. So I suppose we cover the whole portfolio from there on.
Rachel: Very cool, very cool. And tell me a little bit about the team that you run.
Johan: So I represent central IT or central technology services, depending on how you want to refer to it. I come from a traditional background where we provide the siloed functions as a central governance and I suppose IT technology stack to the business. We have 15 different businesses globally and we do all of the support essentially from South Africa. The team that I represent looks after all the cloud services and platform services for the group globally. That includes all the public cloud providers we deal with. We’re in four of them today. The on-prem vSphere estate, containerization, infrastructure DevOps, and a few other things that go with it. But I suppose it’s a natural evolution as the team has grown from when IT started as a traditional VM shop to where we are today.
Rachel: Gotcha. And I think I remember you saying, so I think it was 14 CTOs that you’re trying to all align into your platform. Is that right when you’re talking about your federated team?
Johan: Spot on. I think with having these different companies across the globe, every one of these companies is its own entity. They have their own CIOs, they have their own budgets and governance structures, and we provide the central IT aspect to them. So although there’s standardization across the group, in some respects they’re all unique, but we operate in a federated manner to obviously get the best for the group out of the whole offering.
Rachel: Very cool. Okay, so I get excited. I’m going all out. So first I’m gonna do the tell people what we’re gonna tell them thing. So I wanted to kind of do a three-part story arc on how you guys have approached building your platform at Discovery. So I wanna talk about just how you started to build this centralized team in the first place, which we kind of started to get into. And then I wanna talk about, because I know that you also attempted to build your own platform, which I think is a very interesting part of your story. And then I wanna talk about where you are now. So let’s talk about the start of the journey and federating these 14 teams and having like this central IT group kind of turn into kind of a platform group. Can you tell me how, just like the origin story of where you all came from?
Johan: Yeah, a hundred percent. And I think if you consider traditional IT, the way I refer to it is where there’s all these silos, networking teams, storage teams, security teams, even operating system teams. And it was traditional on-prem capabilities and competencies. And the team that I started was looking after virtualization. And virtualization for me was one of those touch points where it touches the network and the storage and the data center and everybody else. So it felt like the natural progression for the team to evolve into cloud space and then eventually into platform services. To us, everything runs on virtualization platforms, whether it’s on-prem or in the cloud. And we were well positioned to be that, I suppose, foundational team that could enable business in that way. We’re probably the best positioned team and the closest in alignment between the application developer teams and traditional IT, filling that middle ground. And there’s a need for us as a central team to evolve and become closely aligned with developer teams, and we were well positioned for that. So I suppose it was a natural progression for me, how we started out. And as time went on, we just added more and more services, and we do that today still.
Rachel: I love this. I feel like a lot of times the DevOps team ends up having fancy roles and titles and I love that it came out of, we were the virtualization team and this is what made sense. Talk to me about the journey from virtualization to containerization because sometimes that can be a tricky part to try to navigate.
Johan: Yeah, I think Discovery as a whole, we still sit with a fair amount of legacy on-prem dependencies and legacy monolithic applications and so forth. But some teams, because we have these 14 different entities, some of these teams are in a better position to reinvent themselves or make that leap from the old stack to the new stack. And we found that teams were doing it on their own without involving the central capability.
They were running Docker on general VMs, and that was a start of sorts. So we decided it’s better for us to get into the game and set a standard and a strategy for going forward. So that was the kind of evolution of us having our own containerized platform that we offer as a service internally to our customers.
Rachel: Which you naturally did on Kubernetes. So tell me about, okay, this is the next step of the story. Tell me about your journey with vanilla Kubernetes.
Johan: So I suppose some of the business units, when they were on their own, made some strategic choices. Some of them even procured some of their own platforms. And as we were of the opinion it’s better suited to be central, we had some very clever people in our team that were of the opinion that we could do this better ourselves. It was the early days of containerization or Kubernetes for that matter.
We were embracing this open source world, but with a team that’s very small, although very skilled, I think the ambition was misaligned. We embarked on this journey of building our own, but soon business wasn’t seeing the value. It was unfortunately one of those where we had to learn the hard lessons and ask, is this really our game? What is our job really? Enabling business? Or was it really to be these platform engineers that contribute to the Kubernetes community at large scale? It wasn’t us. The value for business wasn’t there. So we had to kind of reset and change direction over time because ultimately we weren’t providing value to business and business was suffering and going somewhere else for their services.
Rachel: I’m curious, could you share with the group, how long did you go on this journey for before you decided to change directions?
Johan: Hahaha. It was, in my world, probably too long, but it was a short-lived, two-year stint.
Rachel: Okay, okay, two years. That’s not so bad.
Johan: I think in the third year, we already started making the changes. I think the difference was that business was adopting, we were making all these promises, and people were getting into it with the belief that this is the new way of doing things, which unfortunately just wasn’t working in business’s favor. It wasn’t stable enough and we were focusing on the wrong aspects of the platform. So obviously, being federated, customers were on this journey with us, but we had to kind of make a change and relook at what we really wanted to do, and then change direction to kind of be that platform service of choice for business instead of them going somewhere else.
Rachel: And I loved your comment around just making sure that your platform was actually providing value to all your businesses and to all your internal customers, which is definitely a story I have heard from VMware often. James Watters is one of the first people I met doing this job and he was kind of right in the heart of his “value line” storytelling: are you having your teams focus on differentiated work? And it sounds like the story is completely aligned with that. So I guess that takes you then to moving into the VMware platform. Do you want to talk about where you are now?
Johan: Yeah. And I think it was a natural progression. We had been a long-term VMware customer, in the form of the traditional vSphere or virtualization platforms. At that point in time, when we had to take stock and figure out where to next, VMware already had platforms. And for us, as a customer, we work with partnerships. So going out to yet another provider didn’t make sense, and the offering that was on the table for the Tanzu stack at that point in time fitted our needs, considering that we weren’t the most advanced use case of technology stacks. Foundationally it was rock solid, and again we have a partner that can kind of guide us through that process from start to finish, seeing that we run the Kubernetes stack on top of VMware anyway. It felt natural from a progression perspective. And once we looked and prioritized what’s really important, it was a natural fit for us to kind of move into the Tanzu stack at that point in time.
Rachel: And so you’re on Tanzu Application Platform, right?
Johan: Well, that’s the next platform that came along.
Rachel: That’s the next. Okay, so you went straight to Kubernetes Grid.
Johan: We started off with Tanzu Kubernetes Grid. Yeah, 100%. So we replaced our own self-built Kubernetes platform with Tanzu Kubernetes Grid, which creates the foundation for us for running containerized workloads in a self-managed way. And we can offer this on a per-cluster or per-namespace basis. So we have two different models. And the one that you referred to now, the Tanzu Application Platform, or TAP for short, was the evolution for us, because now we’re taking the next step: how do we provide better value to our developer teams, how do we remove the friction and ease the adoption of containerization, while at the same time keeping things secure and simple. So I suppose that’s the next part that we’re working on right now and productionizing within the group.
Rachel: Now let’s talk just a teeny bit more about that because I think one of the things that’s interesting is TKG is built into like vSphere now. And so that’s kind of one of the core, lower-level parts of what is happening in the VMware stack, but you’re moving up into the Tanzu Application Platform for the developer experience. Can you talk about some of like the rough edges or things that are getting easier for you in that shift from TKG to TAP?
Johan: Yes, 100%. And just as you mentioned, you’re quite right. The default approach today for TKG is the vSphere integrated way of doing things. But when Discovery started on this journey with TKG two and a half years ago or so, we chose a model which is referred to as the TKGm model, which is the multi-cloud model. So it’s slightly less dependent on or integrated into the vSphere stack, but it allows us some flexibility in our deployment models on-prem and in the cloud where it makes sense.
We also, as part of that stack, embraced the NSX Advanced Load Balancer, or the previous name was Avi, because we used that for layer 7 ingress controllers. So some of those decisions forced us in the direction where we had to do the multi-cloud model versus the integrated model at that point in time. But be that as it may, the multi-cloud model is the container orchestration platform for us. And TAP natively ties into these capabilities. TAP, as an application platform, leverages the integration that we have between the default TKG as well as the vSphere stack, and just makes the adoption so much simpler because it hides all the complexity. It’s a lot to manage from a platform team perspective, but from a developer perspective, it’s hugely simplified because they can focus on the real business logic and they don’t have to learn and figure out what Kubernetes needs and what’s missing from a Kubernetes perspective when they deploy their applications.
Rachel: How many clouds are you running in?
Johan: We’re in four clouds today. That’s excluding the on-prem, which we refer to as a cloud as well, but we run in four different hyperscalers purely because of locality for some of our applications and workloads. We run in 17 different countries globally. And some of those countries have restrictions around what cloud providers can be used, where data can be stored and so forth.
So based on those restrictions, we have to operate in various clouds. And so we’re trying to use TKG and, more recently, TAP to try and abstract that complexity as well. The developer doesn’t have to choose different deployment models for the various clouds. TAP can make those decisions for them. They just tell it where it needs to go. So again, we do multi-cloud, and TAP is well positioned to abstract that complexity from a multi-cloud perspective.
Rachel: So 14 different CTO organizations running across four clouds in 17 countries. So that is a pretty good scale. How big is your platform team supporting all of that?
Johan: We have six people today…
Rachel: That’s a pretty good scale ratio.
Johan: Well, I think there’s one caveat to that. I think the team members are at various stages of the journey. Some of them are grad students, so they’re new into this space, where others are more experienced. But I think as we adopt these tooling and technologies, the intent is to grow the team, because of what’s happening with us as a central platform team.
And what we’re seeing happening within the greater IT department also talks to restructuring and breaking away from the traditional silos. So what we envision over time is some of these silos will be collapsed and the members of those teams would be absorbed into existing teams where it makes sense, to remain relevant. So in a platform team in that case, we could then get more members added to the team and we could then eventually get into this world where we split, I don’t know, platform teams from SREs and so forth. But we’re too early in the game for that. We’re first just trying to do one thing right.
Rachel: Fair enough. But nonetheless, the roadmap may be on its way, but that’s six people right now and they’re doing some pretty heavy lifting across the organization, it seems like, and it seems like your impact is wide. So, very cool. I’d love to hear more about kind of the culture that came, because you started to talk about this in terms of siloed teams or different IT teams that need to start to get integrated into the platform way of doing things. Did you have a hard time getting buy-in originally, as you kind of started this platform mission? And I mean, like you can get buy-in from a lot of different things. But I want to start, let’s talk about getting buy-in from the IT teams in particular.
Johan: So I think it’s not unique to us, but the people and process problem is probably the biggest part of the change. The technologies exist to do these things and to help us support these workloads. But transformational changes, and especially structural changes within traditional teams, are very difficult and we’re still working through that today. I think what we found over time is that, depending on the team that you come from, you’re either naturally more aligned to the new way of work or you’re stuck in your ways. And I think that journey doesn’t happen overnight, and we’re still working on it. Some teams have moved, some teams have embraced cloud services and some teams have embraced cloud automation. Some are just slower in that adoption process. So we by no means have the answer for this. I suppose we’re taking it day by day as we go along. We have a set strategy on what we’re trying to achieve, which is department-wide, which I think works in our favor, because that helps to bring everybody on the journey. In the past, when it was very siloed teams, what I did from a platform team perspective wasn’t necessarily aligned with what the network team or storage team had to do, and therefore we kind of drifted apart. Now that it’s a centralized IT strategy of embracing cloud and infrastructure as code, and the platform team is one of the leads in those initiatives, it’s easier for us to bring people along on the journey.
Rachel: Mm-hmm. So it sounds like you’re bringing people with you. I love it, bringing people along on your journey and helping people understand where everybody can be moving towards together. So then that leaves a few other constituencies in terms of culture and buy-in. So let’s talk about top-down buy-in. How are you getting 14 CTOs to come to your platform? And then let’s also talk about how are you getting all of their developers to come to your platform? So it’s like, what is your grassroots and what is your top-down and how do those kind of work together?
Johan: So I suppose the answer would be it’s work in progress as well. And I think a lot of what we’ve learned is that we had to up our communication level. We really had to showcase to the CTOs what’s the value to them. And what we found works in Discovery is to work with one or two small teams first, make them successful, and use them as success stories within the group. Because with these teams that are more agile, that are well positioned to make some of these changes, once they see the benefits, it’s easier for us to articulate the benefit to Discovery. Very often you listen to a sales pitch or you read the documentation and it all sounds great, but it’s not realistic within our organization, whereas our approach is to make our own teams successful, work with one or two small ones, and use them as the catalyst, or the salespeople or ambassadors for that matter, within the group. Then we get success and traction from that perspective.
So two parts: bring developers on the journey with us, find the people that are willing to work with us and make them successful, and then constantly communicate at the top level the changes, the benefits and where things are going.
I think the risk with 14 different companies doing their own thing at some point in time is that you’re not always aware of something else that might already exist within the organization. So precise communication is probably by far the biggest thing that we had to change in our engagement model with all these stakeholders.
Rachel: Gotcha. So you are a six-person team building a platform and also in charge of internal marketing, it sounds like. How do you approach this communication in terms of like, are there things that you have found that work better in terms of reaching different audiences?
Johan: So I think where we are fortunate, and what’s working in our favor, is the fact that although the platform team is by far leading a lot of these changes, there’s a strategic approach from a central IT perspective to remain relevant and make change. So we have the right change managers in place to help us with effective communication, creating documentation, or communicating at the right level. So we had to get the right people to support us in that. So although the platform team is only six people, they’re the people that technically make it happen. It’s not our skill set to market this effectively and communicate all these changes. So yes, it’s a much bigger team that sits behind us in making this real because…
Rachel: Okay, I was gonna say you all have very full plates.
Johan: We might have the best intentions, but I don’t think we’d be as successful if we didn’t have the change team behind us and the mandate from the central IT perspective, from our own CTO, who communicates and advocates for all these strategies that sit behind us. I think it’s a team effort, collective as an IT department, although we might just be at the forefront of some of these changes.
Rachel: Very cool. And in terms of mediums that you’ve found, is it internal blog posts? Is it demos? How do you try to reach out to different people? Do you see some that are more effective than others?
Johan: So, yeah, I don’t think we have that solved. Internally, what we found so far is that it depends. Especially with the C-level stakeholders, it’s definitely an in-person meeting, taking them through, I suppose, a PowerPoint presentation, a slideshow that gives them the high-level detail, something we can share afterwards. So we have intentional recurring meetings with these people that kind of update them on the journey. With the developers, it’s slightly different. It’s really getting stuck into the code itself and understanding the metrics that make sense. Something that’s new for our IT team as well is something like DORA metrics. It’s something that’s familiar within the developer space and application teams, but traditional IT had to change.
Or rather, we as traditional IT had to change our ways: how we report on things, how we measure and monitor things, and how we motivate for change. Because the traditional siloed approach didn’t really, in our world, rely on all those metrics to sell something. So as part of this journey, we have to up our game to be more closely aligned, speak the same language and just keep the engagement going.
Rachel: Gotcha. So I like that because it’s not just like we have published a blog post and everyone should come do it. It’s like, we are working very hard on achieving internal alignment and making sure we are tracking the things that matter and all kinds of good and important and very challenging work.
Johan: Yeah, I think for us, just to comment on that, a lot for us is around culture. And our culture is an in-person kind of culture. We like to talk to each other, we like to meet each other, and that’s when we do our best work. So even though in this new world of hybrid…
Rachel: So you built an entire platform during the pandemic. Got it. Sounds fun.
Johan: Yeah, it’s a toughie.
Maybe that’s part of the reason why our own platform also wasn’t as successful because it was done during the pandemic in the sense that there was no human interaction. The virtual construct really didn’t do it any favors.
Rachel: Fair enough. All right, so it sounds like you’ve had quite the journey in terms of evolving the platform, evolving the culture. Do you have any parting words of wisdom for any other companies who are embarking on this journey?
Johan: So I think our biggest learning or even if I look at my own learnings, you have to take a stance and be bold and make some changes. For us, acknowledging that what we were trying to do, building our own platform was wrong. Like own it, like be bold, own the mistakes and figure out what’s next. So that was our first thing, but just own it. Like don’t try and hide behind it. Don’t try and delay the change. For us, it was make a change and be open and honest about it. That is our biggest thing. So that was the first part.
The other part of this is, and I mentioned it earlier, people and processes are by far the most difficult thing to change. And in some cases, you have to be patient with these things. There’s more technology out there than we can probably think of to do the jobs that we need to get done. But people and processes are hard. And for us, what works is being open and communicating about these things. Like own it if it doesn’t work; if you don’t know, say, “I don’t know.” For us communication by far is the biggest thing, and just get the buy-in from the top down. For us, having that support takes a lot of weight off our shoulders when executing. Having our CTO backing us for what we’re doing at his level and communicating it at the C level: where are we going, what are we doing?
That speaks volumes, and for us that just takes the pressure off, because he trusts us that what we as a department are setting out to do is achievable. And be patient with it. These things take time. Kubernetes on its own is complex, it’s hard and tough, and it’s underestimated. So stick to basics.
Rachel: Yes, that’s fair. Wonderful. Well, Johan, thank you so much for your time today. I appreciate you telling me your story, at increasingly in-depth levels, of how you have accomplished all of this. It’s been wonderful to hear about your journey and I really appreciate you sharing how you have built your platform.
Johan: You’re welcome. And again, thank you very much for having us. It’s great to be part of it.
Rachel: Wonderful. Thank you so much.