With AI models and technologies rapidly evolving, organizations face challenges such as unpredictable AI behaviors, data security risks, and uncontrolled costs. VMware Tanzu by Broadcom’s Adib Saikali (Distinguished Engineer) and John Dwyer (Director, Product Management) join RedMonk’s Kelly Fitzpatrick for a conversation about how AI middleware aims to address these challenges, streamlining AI application development by offering secure integration, model access controls, and compliance tools. Also included: a demo of Tanzu AI Solutions.
This is a RedMonk video sponsored by Tanzu Division, Broadcom.
Further reading:
- What is AI Middleware, and Why You Need It to Safely Deliver AI Applications: https://blogs.vmware.com/tanzu/what-is-ai-middleware-and-why-you-need-it/
- Not My Father’s Middleware: How To Be Productive With Agentic AI: https://thenewstack.io/not-my-fathers-middleware-how-to-be-productive-with-agentic-ai/
- Enterprise Ready AI Applications with Tanzu AI Solutions: https://www.vmware.com/solutions/app-platform/ai
Transcript
Kelly Fitzpatrick
Hello and welcome. This is Kelly Fitzpatrick with RedMonk here with another What Is/How To video. Today, we’ll be talking about AI middleware and take a look at how to build AI-embedded applications with Tanzu AI Solutions. My guests today are both with VMware Tanzu by Broadcom. Adib and John, can you please introduce yourselves? And Adib, we’ll start with you.
Adib Saikali
Hi, everyone. My name is Adib Saikali. I’m one of the engineering leads working on the Tanzu AI Solutions. I’ve been involved in the app dev platform space since 2014, so almost going on 11 years. Really been an awesome journey.
John Dwyer
And I’m John Dwyer, and I look after the product management of the Tanzu AI Solutions. I’ve been working in the platform space for a number of years, but my background before that was in software development. So this is a great situation for me to be in where I’m finally bringing my platform experience and then my former development experience together to look at how do we deliver applications holistically.
Kelly
Well, thank you both for joining me today. I feel very fortunate to have people who have real expertise in this area. I think one of the reasons we’re talking about this is that these days, AI is all the rage, but many organizations are still at some stage of figuring out their tools and their processes, both for building and delivering AI applications. One concern that I’ve heard is that the space is evolving so quickly that figuring out which tools and processes can actually evolve with the space is something folks are concerned with. This, of course, is where AI middleware comes in. But I’d like to start with the trends that both of you are seeing in the space. You’re both people who work with intelligent applications. So what about AI is on your mind these days?
Adib
Well, for me, it’s really helping developers figure out how to use it in real-world scenarios. That is the number one thing. How can we help people simplify what it would take to build something?
John
And I think from my end, with any new technology that we have, the developers as well as the experts of the industry are continually finding new ways to use it, new solutions that leverage this functionality to deliver. And that’s one of the only constants we’re going to see going forward: these models getting better, and us figuring out better ways from a software standpoint to use them. So how do we keep evolving over time?
Kelly
I love that. And those are both very developer-forward takes on the AI space, I think, which sits very well with RedMonk, of course. So first question, what exactly are intelligent applications? And to your mind, how do they differ from conventional applications?
John
So intelligent applications are just an evolution on the journey of application development. We started with traditional apps, with apps and data. And then intelligent apps are bringing in the large language models to benefit and help evolve the application. We’ve seen over time this evolution where we started from the traditional app, we moved to machine learning, now we have Gen AI, and now on the horizon, it came in pretty hard in 2024, but really, 2025 is going to be the year of agentic apps. And that’s really just evolving how we solve our business problems. And that’s a key piece: we want to enable the developers that are core to solving our business problems to be able to take advantage of this moment in our industry.
Kelly
And one term that you just used was agentic AI. So this is one of the newer things we’ve seen. To your mind, how have these evolutions in AI technology, from Gen AI to what we’re now calling agentic apps, impacted the design and functionality of intelligent apps?
John
Well, one thing is it’s a natural evolution. I just talked about how we’re finding new ways to use it. So we’ve seen this evolution in how do I use a large language model? It started with prompt engineering, and then RAG was all the rage, and then it’s agentic. And what’s interesting is these are just new software patterns, new patterns to leverage a new service API that’s available to me to generate new outcomes. And what has happened is basically we have the same challenges we have with getting any app to production, and we have to worry about that. Where am I going to run it? How am I going to version control it? How am I going to evolve it over time? And those aren’t any different with agentic apps. It’s just that agentic apps are connecting to more and more data points or other agents to build better solutions with the large language models.
Kelly
I talk to a lot of organizations and a lot of developers, and I think some folks are a little intimidated by the prospect of building apps with AI. You make it sound like it’s just any other app, and I think that has to be comforting for some developers out there, that there’s really not that much different that they need to do.
Adib
I think that’s totally the case. There’s a set of concepts you have to know when you build Gen AI apps. To use a database example, most developers are comfortable with concepts like there’s a table, there’s an index on a table, there’s a SQL query. There are equivalent concepts that you’d encounter as a developer, like what’s a chat model, what’s an embedding model, what’s a similarity search with a vector store? These are all foundational concepts, and in my experience hosting full-day workshops with developers, at the beginning of the day, I’ll survey the room and they’ll say, I’ve used ChatGPT, but I don’t really know what’s involved in writing an app. Towards the end of the day, they’re like, Oh, yeah, this was a lot easier than I thought it was going to be and a lot more approachable. I don’t actually have to be an expert in linear algebra and calculus and training models. Really, if I know how to call an API, I know how to use these models. My domain expertise for the company I work for is the secret advantage that I have, the competitive advantage, for bringing AI to my role.
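Adib’s database analogy can be made concrete. At its core, a similarity search over a vector store is just ranking stored embeddings against a query embedding, for example with cosine similarity. Here is a minimal Python sketch; the tiny three-dimensional vectors and document names are invented for illustration, and in a real app an embedding model would produce vectors with hundreds of dimensions:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "vector store": document -> embedding (illustrative values only)
store = {
    "patio furniture": [0.9, 0.1, 0.0],
    "banking products": [0.1, 0.9, 0.2],
    "shipping status":  [0.7, 0.2, 0.6],
}

def similarity_search(query_vec, k=2):
    # Rank stored documents by similarity to the query embedding
    ranked = sorted(store.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(similarity_search([0.8, 0.1, 0.1]))
# → ['patio furniture', 'shipping status']
```

The point of the analogy holds: once you know this is “roughly a ranked query over an index,” the vector-store concept stops being intimidating.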
Kelly
I think that’s a really good point, that a lot of developers have experience using AI tools, ChatGPT being one of them, and the jump from using those tools to building things that actually leverage generative AI is a real jump. Can you share some real-world examples of the types of applications that you’re seeing people build out there in the real world?
John
I think this has been one of the most fascinating things: every time we go into a company and see what they’re building, there’s always some use case that comes to light where it’s like, I can’t believe they’re using that for this. But a key thing every company we talk to is going through is that they’re trying to take advantage of this moment in the industry, while making sure that they’re identifying use cases and getting a solid return on investment to justify the continued investment and evolution of intelligent applications. We recently talked to a large automobile manufacturer, and they gave us an example from their commercial space where they built a sales copilot. We always talk about developer copilots. Imagine having a sales copilot that can help your sales team be better prepared for customer engagements. Their measure, their KPI for success, was around how to shrink the amount of time the sales team has to invest in preparing for a meeting with a customer, and how to empower them with AI and the data they have to right-size the discounting offers that a salesperson can make in a commercial sales engagement.
And it was something that was really actionable and measurable so that they could justify continuing to invest and evolve their applications.
Adib
I think this idea of the copilot that has a human involved, where the human is involved in a very high-touch activity that requires a lot of preparation, is one of those very common things we run into with different customers, because for the customers, it’s a lower-risk activity. If you can make your highest paid person, your private banker that meets with high-net-worth individuals, better prepared for that meeting, maybe you do more business with them than you would otherwise.
Kelly
Well, thank you. I think those are both great examples. I really like the translation of the copilot that most of us are familiar with in the developer landscape to how it could be applied in other areas. I want to talk a little bit about middleware now, since AI middleware is one of the things that we’re hoping to look at today. Middleware is a term that I have heard for quite a while because I’ve been in the industry for a bit. For folks who are not so familiar with middleware, much less AI middleware, can you explain the concept and its role in application development as you see it?
Adib
Yeah. I mean, middleware really emerged as a common term in the industry shortly after the Internet took off and everybody wanted to have a website, an online banking app, and an e-commerce site. And the idea behind middleware was pretty simple. When you build an application as a developer, there’s a whole bunch of stuff that you need to do that’s infrastructure-related, and those things would be best done once inside of this middleware, application-server-type construct. Then you can focus on your business logic and implement the things that are related to your use case, as opposed to, Oh, I need to build a web application, so I need an HTTP session manager, so I’m going to code an HTTP session manager from scratch. I’m old enough to remember life before the Internet. And the classic example people used to give was you’re building a shopping cart. How do I build a shopping cart? You need a session manager. We’re finding that when it comes to building AI-powered applications and experiences, there’s actually a common set of things that everybody needs to do over and over again. And those things are very much amenable to being offered through, let’s call it an AI app server or AI middleware, so that you don’t have to reinvent the wheel for every application that you ship.
Kelly
Thank you for that. That makes a lot of sense. In terms of AI middleware, specifically, what is that bringing to the table in terms of things that developers don’t necessarily have to go have expertise in because the middleware is taking care of that for them?
Adib
I think a really good way to think about it is the decisions you have to make as a user or a builder on top of AI. First, you have to say, Hey, which model am I using? And where am I getting the model from? And we see two approaches from customers. Some customers are like, Look, I’m just going to use a cloud-based model, and I trust the cloud providers. Others are like, No, for any number of reasons, I would like to run models within my air-gapped environment, within my four walls. So they have self-managed models. But no matter what model you pick, there’s a common set of concerns. You usually have more than one application that’s going to leverage that model. And those models, whether on public cloud or self-managed, have limits. There’s only a certain amount of tokens that you have. So you get into very basic things like, how do I fairly share the AI capacity amongst 10 applications, as an example? How do I quota those things? Other concerns are much more organizational in nature. We’re very excited about AI. We’re going to look to our app dev teams to see what AI features they can add.
But then how do they pick what models they’re going to use? And what are the approved models? Within organizations, we find that there’s a small team of, let’s call them AI subject matter experts. They’ve studied the models that are out there. They know the price performance, they know the safety characteristics, and they’re like, Hey, we think that in our enterprise, we should use model X. Other people are like, We should use model Y. At that point, how do you scale that out to your hundreds of developers within the organization? We find that this type of middleware makes it possible to safely allow people to self-serve. You can think about it as: instead of the applications directly speaking to the model, the applications go through the middleware layer to talk to the models. Because you go through the middleware layer, you get a variety of governance controls, you get a variety of observability controls, and there are certain types of business logic or infrastructure-ish/AI-ish patterns that you would have had to implement in the app. Now, you can actually just implement those simply in that AI middleware layer.
That gives you, as a platform engineer, the ability to centrally control the configurations. Most importantly, as a developer, it frees you from two nasty things you would have to deal with otherwise. Number one is it frees you from having to ask permission. If you want to use AI, we frequently hear that people have an AI safety council or an architecture review board, and they need to go in front of that group and say, This is what I’m doing with AI. Here’s how my application works. The idea is to try to make sure that things are being done in a safe way. Well, this gives you the guardrails. You don’t have to go in front of a human to ask for something. If it’s in there, it’s approved, it’s been vetted, and it’s safe to use. The second thing it helps with is that because the AI field is moving so fast, no one is ever sure that they actually solved the problem the best way. Every week, there seems to be a new model out there that’s faster, cheaper, better, or has some new capabilities. There’s a huge desire to make sure that you’re able, as a developer, to make use of the latest developments in AI in your application, regardless of which vendor and which model comes out with them.
And so this ability to go through the middleware layer means that your code is portable. You don’t have to change your code. You don’t need to restart your code. You can do things like A/B testing different models against each other. You can do other things like collecting the data you need to perform model distillation, where a teacher model is used to train a smaller model, which is usually around 80% cheaper to run and operate compared to the teacher model. Those are some of the value props that we see customers really gravitating towards with AI middleware.
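The two middleware behaviors Adib describes, fair sharing via quotas and swapping backend models without touching app code, can be sketched in a few lines of Python. Everything here is illustrative: the class names, the word-count token estimate, and the quota policy are all invented stand-ins for what a real AI middleware would enforce centrally:

```python
import time

class ModelQuota:
    # Hypothetical per-app token budget over a rolling one-minute window.
    def __init__(self, tokens_per_minute):
        self.tokens_per_minute = tokens_per_minute
        self.window_start = time.monotonic()
        self.used = 0

    def allow(self, tokens):
        now = time.monotonic()
        if now - self.window_start >= 60:
            self.window_start, self.used = now, 0  # new window
        if self.used + tokens > self.tokens_per_minute:
            return False
        self.used += tokens
        return True

class AIMiddleware:
    # Apps call the middleware by plan name; the platform team can remap
    # a plan to a different backend model with no app code changes.
    def __init__(self):
        self.routes = {}   # plan name -> backend model id
        self.quotas = {}   # app name  -> ModelQuota

    def register(self, plan, backend, app, tokens_per_minute):
        self.routes[plan] = backend
        self.quotas[app] = ModelQuota(tokens_per_minute)

    def chat(self, app, plan, prompt):
        est_tokens = len(prompt.split())  # crude estimate for the sketch
        if not self.quotas[app].allow(est_tokens):
            return "429: quota exceeded"
        backend = self.routes[plan]
        # A real middleware would forward to the backend here; we just echo.
        return f"[{backend}] response to: {prompt}"

mw = AIMiddleware()
mw.register(plan="openai-chat", backend="gpt-4o", app="sales-copilot",
            tokens_per_minute=5)
print(mw.chat("sales-copilot", "openai-chat", "summarize this account"))
# Swapping the backend is a config change, not a code change:
mw.routes["openai-chat"] = "gemma-2"
print(mw.chat("sales-copilot", "openai-chat", "hi"))
```

The app only ever knows the plan name, which is what makes A/B testing and model swaps transparent to application code.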
Kelly
I really like that you brought in the platform team and that whole platform engineering concept, because I feel like there’s a lot of similarities here between, say, an app platform and developer portals, something that is designed to let developers self-serve, but then gives people who are on the ops or security side, or in this case the AI experts on the corporate side, the ability to say, Here are the things you should be using. For a developer, that means you don’t have to think about it. If it’s there and available to you, it’s something that you can use.
Adib
Absolutely. I think this has a really critical relevance for achieving ROI on AI. I was in a customer meeting with a large bank last week, and one of the things that one of their very senior executives said is that they went to the board and promised that they were going to generate large amounts of new revenue through the power of AI. Let’s think about it. How are you going to do that as the bank? Well, you’re going to have to either go in and modify the applications that your bankers use, so that when they’re meeting with a customer, they can offer them more banking services that are right for them and win more business. Or maybe you make changes to your public-facing, customer-facing apps that the customers are using, and you add more AI there, so maybe you sell more banking products. In either case, those applications already exist. Because those applications already exist, the question is, how do you enhance those existing applications? And that means your existing development teams, the ones with the domain knowledge and the skill set on what the applications currently do. And so the fastest way to get to your ROI as the bank is to empower all of the teams that are there, with the existing skills that they have, to go in and start enhancing those applications with AI capabilities.
That’s how you get to a realization of ROI on AI investment. The anti-pattern is to say, Oh, we have a team of specialists that we hired. They’re our top-notch AI people, and they’re going to implement the thing that’s going to make the bank successful. Sadly, those very capable people don’t understand the business domain and don’t have that on the ground view of where AI can actually help.
Kelly
What I’m hearing is you just described a challenge that you see organizations face when embedding AI into new and existing applications, and this is the way of addressing that. Are there any other challenges that you’re seeing organizations face?
Adib
I think a huge challenge is that they don’t know what is needed to build AI applications. There’s a lot of copying and pasting, or following the crowd, or trying out things that other people have tried out. That can be a bit of reinventing the wheel multiple times. People are trying to figure out the pattern for how to solve these problems, and then they’re trying to figure out how to scale it. And it’s the how-to-scale-it part where we think there are enough patterns now that those patterns are things you can purchase with an AI middleware, as an example, so you can focus on scaling out AI within your organization.
John
The other thing that we’re seeing a lot of is just, how do I get started? And I need GPUs. No, you don’t. Or I can’t use public cloud because we have data sovereignty rules and we can’t use public cloud. And with our AI middleware solution, we can just spin up models based on CPUs. They’re going to be slow. But guess what? That costs you nothing. Use the hardware you have, use the capacity you have wherever it is, and prove you have a use case. Start developing now and improve the use cases there and then make the investment at the right time. Don’t pre-invest before you know that you have something that’s going to add value to your business and your existing applications.
Kelly
All right. So one question I have is about agentic AI, which we kind of touched on a little bit, but can you explain what it means and why this shift to agentic AI is significant for developers?
John
Well, first of all, when we see the news that’s everywhere, you can go on CNN or MSNBC, everyone’s talking about agentic AI. I was recently on LinkedIn and saw a head of product at a major tech company saying they surveyed a thousand developers and 99% are building agents. And I think first we need to level set on what an agent is, because it’s really just a software pattern. And that’s why it’s so important to developers. When we talk about an agentic application, there’s really this broad spectrum of what it actually means. When they surveyed those thousand developers, most of them were probably talking about the blue side, which is an outcome, a UX pattern. It’s the simplest, easiest thing to implement, and so everyone’s getting started there. But when we go to the other side of the spectrum, on the green side, this is really an architecture or software pattern. People have figured out better ways to bring data and information to the large language models. And this term agentic really comes from data science land, where it means the model having agency to decide what happens next. We are empowering the models with tools so that they can define their control loop and iterate on it.
And I love the Hugging Face definition, and so I put a link here. But they actually have this star rating system about how agentic it is. And it’s like in the most basic form, the agent can decide an if statement in the logic to decide what to do next. But in its more complicated form, it’s actually generating loops and iterating on them and can do recursion.
Kelly
Once again, you have made something that I think can seem very complicated to developers seem not so complicated. I love the rating system that Hugging Face has. But to reiterate, if you are a developer, what technical skills and tools do you need in order to deliver an agent-driven application?
Adib
Well, I think you need to know how to call a REST API, which I think all developers know how to do.
Kelly
I mean, I should hope so.
Adib
Then the other thing you need is an understanding of the blue side of what you see on the diagram, which is, what is that UX experience that’s agentic? Then you need to also be familiar with the patterns, the software architecture patterns, that you use to implement that, to drive a model. It’s not very difficult. It’s on par, difficulty-wise, with learning how to build a web app or a mobile app. There are certain patterns that you can learn. I’m quite optimistic that a lot of developers are going to be able to pick this up quickly and, with the help of AI middleware solutions, are going to be able to ship a lot of agentic capabilities within existing apps that are out there in the enterprise. That will lead to just better outcomes for all of the people that depend on those apps to get their job done or accomplish whatever tasks they’re trying to accomplish.
Kelly
Can you speak a little bit more about how AI middleware solutions and app platforms that have them support this?
Adib
Yeah. Take an example where you want to build, let’s say, a customer service agent. To motivate that, you might say something like: I ordered some patio furniture from a retailer. The patio furniture arrived, and I’m supposed to get two boxes, box one of two and box two of two. And, oh no, I got two copies of box two of two. That happened to me. So I had to pick up the phone, call the retailer, and get bounced around three times. And each time, I was asked for the order number so they could pull it up. Eventually, it’s like, Okay, we’re going to tell shipping to send you a new order. And then the shipping company calls me to see when they’re going to deliver it. And then they also have to schedule another shipping company to come pick up the old boxes. This type of interaction literally went on over four weeks to get to the point where I had box one of two and box two of two. A much better experience would have been possible for me if I could have just gone to a chatbot after I logged in. Because I bought this online, it knows who I am, because I’ve logged in as me on that system.
I should have just been able to type something like, Hey, there’s a mistake in the delivery. Here’s a description of what happened. If you’re the developer building that software agent, that customer support agent, you need to be able to reach out into the order management system to pull that order. Then you need to try to figure out, Okay, the issue was we sent them the wrong stuff, or the issue was that this was damaged, or the issue was it was just completely the wrong item that was sent to them. A lot of this problem solving to evaluate the customer’s situation is going to require the software agent to pull on data that is only available inside of existing APIs within the enterprise. And so the AI middleware is going to make your job a lot easier here. Because you can imagine that you have your public, customer-facing agent, the customer support agent, which the public can use, I can use. But on the other side, there’s the actual human that you might need to interact with. And that human could be talking to me on the phone, or they could be responding through a chat system with me.
And that screen could show: this was a question from Adib, this was the response from the AI, this was the response from John, who’s the customer support person on the other side. And so there’s a common set of patterns with a lot of annoying infrastructure that is going to look identical from app to app. And those are the things that you want to get out of your AI middleware. In this example, there are literally three key things that you would need. You would need something called memory, which is a way to do self-editing memory on the application context. There’s a famous paper called MemGPT from UC Berkeley that basically explained this pattern. So you can go and implement MemGPT-style memory yourself, or you can just get it out of your AI middleware server. The other thing is, I don’t want to write the code three times to call the order management system, right? Perhaps what you do is you have a Model Context Protocol, or MCP, server. MCP is a protocol that Anthropic released just before US Thanksgiving in 2024, and it’s already supported by us in Spring AI and other things. It makes it possible for you to say, Okay, I’m going to take an existing API and expose it for consumption by AI agents as an MCP server.
So, can my AI middleware make it easy for me to do this? The third thing is, sometimes these AI agents are going to generate some code to solve the problem. If the agent generates code, you need a place to run the code that’s generated; you need a sandbox. So you’ll see things like code execution sandboxes, something you would need in that category of product. There are others, like observability, which you get when you’re able to run all the traffic through the AI middleware: it’s going to tell you how well things are working, how long things are taking. You might want to go back in time and try a new model with old inputs. You can do that because you’ve been able to capture some of it. There are all sorts of these types of things that you want to do in a systematic way, and doing it on a platform which has AI middleware built into it is so much easier than trying to do it yourself. Otherwise, while you’re trying to figure out how to do AI for your business case, you’re also trying to figure out how to build infrastructure for AI. It’s like app infrastructure.
I hope that answers the question.
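The self-editing memory pattern Adib attributes to the MemGPT paper can be sketched simply: the model, via memory tools, appends and rewrites facts in a bounded in-context store, with overflow evicted to an archive it can search later. The class, method names, and facts below are illustrative inventions, not the paper’s or any middleware’s actual API:

```python
# MemGPT-style self-editing memory sketch: the model edits its own
# context via memory tools instead of the app stuffing everything
# into one ever-growing prompt.
class CoreMemory:
    def __init__(self, limit=3):
        self.limit = limit   # max facts kept in the prompt context
        self.facts = []      # in-context memory
        self.archive = []    # overflow, searchable on demand

    def append(self, fact):
        self.facts.append(fact)
        if len(self.facts) > self.limit:
            # Evict the oldest fact to archival storage to stay in budget
            self.archive.append(self.facts.pop(0))

    def replace(self, old, new):
        # Let the "model" rewrite a fact it now knows more about
        self.facts = [new if f == old else f for f in self.facts]

    def render_context(self):
        # What actually gets prepended to the model prompt
        return "\n".join(self.facts)

mem = CoreMemory(limit=3)
mem.append("customer: Adib")
mem.append("order A-1001: received two copies of box 2 of 2")
mem.append("resolution: replacement requested")
mem.append("carrier pickup scheduled for old boxes")  # evicts oldest fact
mem.replace("resolution: replacement requested",
            "resolution: replacement shipped")
print(mem.render_context())
print(mem.archive)
# archive now holds: ['customer: Adib']
```

Getting this from the middleware rather than hand-rolling it per app is exactly the “don’t reinvent the wheel” argument from earlier in the conversation.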
Kelly
No, that was a great answer. I think even the use case that you were describing, or the scenario that you were describing was just a really good setup for generating all the things that you would have to do. I very much feel that we perhaps have gotten to the part where we do a demo. So Adib, for folks who are not already familiar with Tanzu, where are we? Who would be seeing this view? What would they be doing?
Adib
Okay, great. So this is the view that the developer sees. They navigate into the applications manager, they click on the Tanzu Gen AI platform, and they can see a variety of models that they’re able to consume. So for example, if they wanted to access OpenAI, they would select that plan and give it a name. Let’s call it Adib, and we’ll hit Create, and that will go off and provision a service that allows you to get an API key to access this. So you can see here now I’ve got the Adib OpenAI chat service. If I click on this guy here, I can create a service key. Let’s call this Key Test. And once that key is created, I’ll be able to view the actual API key, and there’s a URL here to the AI middleware server, which is part of the Tanzu platform. So I can now go and write a Python application in LangChain for Python or the official OpenAI Python client, or I can use Java with Spring AI, and I can say, Hey, please send the request to this URL here that’s highlighted, with this API key, and that will just give me access.
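The developer step Adib just demoed, point an OpenAI-style client at the middleware URL with the middleware-issued key, comes down to constructing an ordinary HTTP request. A minimal stdlib sketch is below; the URL, key, and model name are placeholders standing in for the service-key values shown in the demo, not real endpoints:

```python
import json
import urllib.request

# Placeholder values standing in for the demo's service key and URL.
MIDDLEWARE_URL = "https://genai.example.internal/v1/chat/completions"
API_KEY = "middleware-issued-key"

def build_chat_request(prompt):
    # Same request shape an OpenAI-compatible client would send;
    # only the base URL and key differ when routing via middleware.
    body = json.dumps({
        "model": "gpt-4o",  # the plan/model name exposed by the marketplace
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        MIDDLEWARE_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("Hello")
print(req.full_url)
print(req.get_method())  # data present, so this is a POST
```

In practice you would not build requests by hand; LangChain, the OpenAI client, or Spring AI each accept a base URL and API key and produce equivalent requests, which is why no app code beyond configuration changes when the middleware sits in between.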
So you saw this really nice self-service that I can do as a developer without asking permission, which raises the question of, Okay, there’s a curated marketplace. How did things get in here? And that’s where the platform engineer steps in. The platform engineer would have collaborated with the AI subject matter expert. And through the Tanzu operations manager interface, they’re able to click on the Gen AI on Tanzu platform settings. So you can see here there’s a set of integrations. And one of those integrations, I’ve defined it here as GPT-4o. I can tell it the capabilities that this model has. I can tell it the URL to send the request to; in this case, it’s really sending it to OpenAI. I put in the actual API key to access OpenAI. The previous key that you saw was a key generated by our AI middleware server, so that we have complete control over it. We know who the app is, we can attach policy and all sorts of stuff like that. Then as the platform engineer, if you go down here, you can see the plan name is OpenAI Chat, and it says GPT-4o Chat and Vision. If I go back here to the marketplace, that’s OpenAI Chat, and that’s GPT-4o Chat and Vision.
Kelly
So when a new model becomes available, that’s something that is enabled and made available in the operations manager and then can then be consumed by developers.
Adib
Yeah. So let’s just do an example of that. Let’s click here on beta integrations. You can see here we support Azure OpenAI, Google Vertex Gemini models, AWS Bedrock Titan models. And so I could click on one of those. Maybe we click on AWS, and then you can see it’s very similar. I can pick the model name, what capabilities it has, provide my credentials to access that Bedrock model, and put in some limits, like how many requests per minute are allowed, how many tokens per minute, what region it’s in, and again, how I want to expose it to my developers. And once I hit apply changes, that will actually become available to all of the developers. So what’s important to understand is that we’re really all about helping you do whatever you want as a customer. One of the things we also support is VMware Private AI Foundation; that’s fully supported, where we can take those models if you’re hosting them air-gapped within your four walls, and we can make them available. And we also have the ability to run models for you, if you choose to do that, on CPUs or GPUs that are built into the platform itself.
John
Because we have that consistent place that all things are flowing through in the AI middleware, we can instrument everything and make sure that we have really fine-grained telemetry in place. So we’ve implemented this in our AI middleware. And in this example, as a platform team, I can get visibility into the overall usage of the system. I can look at the different models that are available. So in my system, I only have a Gemma 2 and then an Omec 1, and I have different applications. So I’ve got fine-grained control over the different applications.
Kelly
All right. So that was a lot. We’re going to leave some links in the show notes for folks who want to learn more. Adib and John, thank you both for joining me today.
John
Thanks for having us.