In this RedMonk What Is/How To, James Governor and Ram Iyengar, chief evangelist of the Cloud Foundry Foundation, discuss the evolution and significance of buildpacks in modern software development. They cover the historical context, noting Heroku’s introduction of buildpacks as a usability advancement and their transformation into an industry standard. Ram elaborates on the necessity for standardization in enterprise build workflows, particularly given the diverse range of programming languages used by development teams. He explains that buildpacks streamline the creation of immutable artifacts and, more recently, container images, reducing deployment complexity and improving efficiency. Ram provides an overview of the collaborative development of cloud native buildpacks by the Heroku and Cloud Foundry communities, with Paketo buildpacks as the implementation on the Cloud Foundry side. He emphasizes their modularity and the range of platforms that support them, including the Java Spring ecosystem.
The video includes a live demo of using buildpacks to build and run a containerized PHP application, illustrating their practical application and ease of use. The demo underscores buildpacks’ capability to automate the build process.
This was a RedMonk video, sponsored by the Cloud Foundry Foundation.
Transcript
James Governor: Hi, this is James from RedMonk. We’re always interested in anything that makes it easier to package and deploy applications. Pretty clearly, a lot of what developers have to do at any given time feels like yak shaving — makes things difficult. If we go back a little ways in history, the introduction of buildpacks by Heroku back in the day, I think was a big step forward in terms of usability. And what you may not know is that buildpacks have moved on. They’ve evolved, they’ve become an industry standard. And that’s what we’re going to be talking about here today in this What Is/How To video. I’d like to introduce you to Ram Iyengar, the Chief Evangelist of the Cloud Foundry Foundation. And he’s going to be talking a little bit about buildpacks — where we are in buildpacks in 2024. And he’s going to give you a demo showing just how useful they are. So over to you, Ram. Welcome.
Ram Iyengar: Hello, everyone. Thanks for having me, James and the RedMonk team. It’s definitely a pleasure doing some buildpack stuff for everyone. So I just thought I’d show like a few slides explaining some things about the value of the project, the history and stuff before diving into a short demo like you said. Let me know when I can kick things off.
James: Great. Let’s jump straight in.
Ram: So a lot of the value for buildpacks comes from this need for standardization across the enterprise. Now, there’s a lot of work that goes into making the build workflow homogeneous across teams. And with all of this hipster microservices, bring your own language for your services kind of thing that’s going on, there’s a lot of different teams that enjoy working with different languages. I don’t blame them. But it makes it very hard for the operators who sit beneath who are trying to say, hey, standardize this build process. And then for all of the security and compliance stuff that comes later on when you’re trying to deploy and maintain these things, it becomes a real hassle if everybody is just building their own way. Not to mention “it works on my device, it works on our pre-production, our test view is okay, I don’t know what’s wrong with production.” It avoids a lot of these awkward conversations and prevents a lot of these weird meetings just the day before rollouts from happening. And so buildpacks were born, like you said, out of this magical world of Heroku, where you just had a magic spell and stuff would deploy.
So what buildpacks did, what buildpacks always did behind the scenes was to say, here’s the language and family that I’ve detected. Let me check if there are other immutable artifacts that I can cache. If not, let me build the artifact. And then for Heroku, it obviously did the deployment as well. But buildpacks worked up to the point of, here’s a build and here’s an immutable artifact, go to town with it. So that was really the workflow or the contract of the buildpack. And over time, what has happened is we stopped deploying just immutable artifacts, and instead we started deploying containers and container images. And so along the way, in their existence, buildpacks sort of evolved to take on this, let me not just do an immutable artifact. Let me instead do a container image. And that entire process was not without a lot of history. So, like any good technology, buildpacks originated in one world, which is Heroku. It was adopted heavily by the Cloud Foundry community as well. This was known as Pivotal back then, and there were two kind of parallel standards that happened for buildpacks. And so Pivotal had their own way of pursuing buildpacks.
They decided to do containers before Docker was a thing, and they called them different things in their own languages. So it was called a droplet. When you use Cloud Foundry, it’s the equivalent of a Docker container. Heroku continued putting out immutable artifacts and exporting those as the build artifact. And then after Docker won the container wars and Kubernetes won the container orchestration wars, it was time for the two communities to say, hey, let’s not do parallel development. Instead, there’s a lot of great ideas that we both independently developed as independent communities, and let’s bring them together. And that’s how the genesis of cloud native buildpacks as a specification happened.
James: There was some commonality. I think it’s important to say that from a user perspective, and this is an experience I’ve had, where it was kind of interesting. You could indeed, one of the things, looking at Cloud Foundry, you could pretty much just pick up a Heroku buildpack and use it. The commonality was, I think, there, and it’s good to see that there’s now, as you say, they’re coming together further.
Ram: Yeah. So both Heroku and the community within Cloud Foundry, it’s more popularly known as Paketo buildpacks. That’s the implementation that’s part of the Cloud Foundry Foundation. All of them are coming together and saying, here’s a whole bunch of open source production ready buildpacks. Start using them, don’t think about tinkering with them just yet. And once you get a good feel for what it can do, then go ahead, customize it. Do some of the more advanced things. They are also modular, in that you can cobble them together and say, if I want a React based front end and a PHP based back end, I can have a React PHP buildpack based off of the modular cloud native buildpacks from Heroku or Paketo, or you can club them both and they’re very convenient to work with each other and just work modularly like that. Which brings me to a quick tour of what are the different platforms that support cloud native buildpacks at this time? The obvious two I just mentioned, and because of the nature of the communities that are involved, Java Spring is like a first class citizen. And so the whole Spring ecosystem is heavily invested in buildpacks.
And one of the things that jumps out to anyone looking at this is a lot of the folks here are platform projects themselves. So buildpacks are becoming extremely popular for building platforms upon which you can then enable your engineering teams and things like that. Folks that I’ve left out on this slide are Fly.io and Waypoint by HashiCorp, who are all different platform projects by themselves. Very popular, very focused on that developer experience, who are enabling this whole platform experience using cloud native buildpacks for their developers. Other examples are like Bloomberg built their whole AI/ML platform and they gave a demo at the recent KubeCon that happened in Paris, and they showcased buildpacks front and center. There’s a couple of other companies who also built their AI/ML platforms exclusively using cloud native buildpacks. And all of that is just wonderful to see how people are taking buildpacks, building stuff on their own, and then putting out, I remember seeing a demo, I think two KubeCons ago about ING, the big banking and insurance and finance company. They built a Jupyter Notebook buildpack which I didn’t even imagine would be possible.
So their data scientists basically use buildpacks to set up the first steps of their AI/ML pipelines. And I don’t know, for some reason this whole MLOps is so dense with buildpack examples. And it’s just very refreshing and wonderful to see the different ways in which people just took this technology, started adopting it for their very custom workflows.
James: Yeah, but a fair amount of confidence in terms of number of different deployment targets. You’ve got all these cloud platforms there.
Ram: Absolutely. I mean, a workload is a workload at the end of the day. And the reason I really like buildpacks is they’re not vertical specific, they’re not provider specific, they’re none of these things. They’re a very generic tool that will just plug and play into your existing container-based workflow. And you can use so many different tools to do that. One of the tools that I like to demo buildpacks with is known as kpack, which is a Kubernetes-based build service. And kpack is very convenient to do an in-cluster build and it uses buildpacks underneath. I mean, you could guess from the name, but it’s a very convenient way to start consuming buildpacks if you’re on the Kubernetes hype train already.
James: Okay, okay. Well, just before we jump into the demo, quick question. What’s the difference then between a buildpack and a Dockerfile?
Ram: So I think one of the important things to know when you start using buildpacks is buildpacks and Docker come from completely different ways in which to think about creating a container. A Docker based build uses a primitive known as a Dockerfile. And this Dockerfile basically dictates how your container is going to be structured. So all containers have layers, but in the Docker world, your layers are dictated by how your Dockerfile is structured. Now switch over to the buildpacks world. You have what is called a builder, you have what is called a build image, and you have what is called a run image. And your buildpack specification dictates how these three pieces are going to interplay in order to help you create that final image that is used for you to run the application itself. Think of it as there’s stages in a rocket and then you discard each stage as the payload flies further. It’s kind of like that with buildpacks, whereas with Dockerfiles you sort of carry the entire thing. And so your mileage might really vary in terms of how big your payload gets and things like that.
And so what this allows buildpacks to be are smaller images, so they’re more minimal in terms of what is finally run versus what you start with. They are more modular in the sense you can replace these by distinct layers when you’re trying to update the application. And so because of the combination of this layering and a powerful caching mechanism, buildpacks can be updated way faster than doing a Docker based build over and over again. So you have, because of these fundamental differences, you have so many different things that sort of grow out of this. So the buildpacks path of an update is known as a rebase. And you don’t technically rebuild, but you rebase on existing images and replace just the layers that have changed and things like that. Whereas with Docker based build you sort of have to replace the entire image. And then when you have a container registry, the update from the container registry also flows as full builds in the case of Docker, whereas in the case of cloud native buildpacks, you can just have individual layers be replaced on what is running. And so an update cycle might be days in the case of Docker, whereas it might just be minutes or seconds in the case of buildpacks, for something that you can compare apples to apples.
So because of these fundamental differences between what is a buildpack build versus a Dockerfile based build, you have so many different ways in which these differences sort of bubble up and are exhibited.
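As a rough illustration of the rebase idea Ram describes, here is a toy Python model. It is a sketch under stated assumptions, not how any buildpack tooling is implemented: the `Image` class, the `rebase` function, and the layer names are all hypothetical, and an image is reduced to two tuples of layer labels.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Image:
    """Toy model of an app image: run-image layers underneath, app layers on top."""
    run_layers: Tuple[str, ...]   # OS-level layers that come from the run image
    app_layers: Tuple[str, ...]   # layers contributed by buildpacks and the app


def rebase(image: Image, new_run_layers: Tuple[str, ...]) -> Image:
    """Swap only the run-image layers; the app layers are reused untouched."""
    return replace(image, run_layers=new_run_layers)


# Hypothetical layer names, purely for illustration.
old = Image(run_layers=("os-v1",), app_layers=("php-runtime", "app-code"))
patched = rebase(old, ("os-v2",))

print(patched.run_layers)                    # the OS layer was replaced
print(patched.app_layers is old.app_layers)  # the app layers were not rebuilt
```

The point of the sketch is only that the update touches the layers underneath the application and leaves the application layers alone, which is why a rebase can be much cheaper than a full rebuild.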
James: Okay, let’s jump into the demo.
Ram: Will do. So this is just some of the stuff that we spoke about, but let’s go into demo mode. So what I’ll do today is I will pick up a very random codebase, I’ll delete the Dockerfile that’s present in there and run a build. And fingers crossed it’ll actually run. So let’s see. So hopefully this returns something that we can run. I’m going to clone this. So make a directory and then clone what looks like, you know, a reasonably modular PHP, twelve-factor PHP application, hopefully. All right, and this has a Dockerfile and this has the source. We are going to remove the Dockerfile and then run a build. So yeah, this seems like fairly straightforward, just PHP stuff. Let’s take a quick look at what’s there. Let’s get rid of these famous words and put buildpacks instead. Hopefully a lot of Vim fans will love this video. So what I’m going to do now is use Pack. So Pack is the CLI that basically uses buildpacks. So the best way for you to get any sense of what buildpacks are about, I would highly recommend using Pack.
And I’ll show you all of the URLs towards the end. But for now, just remember that Pack is the CLI. I’m going to give the container a name. So in this case that’s RedMonk sample. And then while this next step is not strictly needed, I’m going to ask Pack to make use of a very specific builder. So like I said, a builder is a construct that is used to create the whole image. And it’s the base layer that the entire image will be based on. So if you want a Dockerfile parallel, it’s the equivalent of the FROM statement that you put at the top. Again, while this is not strictly mandatory, I’m just doing it in order to demonstrate that the Pack build process works in a particular way. So I’m going to be using the Ubuntu Jammy based builder. And so let’s kick this off. Now, I’ll go through the different processes in some detail, but essentially what happens is Pack first pulls the image layer for the base layer. A lot of it is cached if you’re using repeated builds and things like that. But the first time you run it, and I wanted this to be like the full process and not, say, caching and everything.
So the first time you run it, it’s obviously going to be a little slow when it’s going to collect all of the layers. And again, this is very comparable to a Docker build step. So every time you do Docker build, it will also do a lot of downloading and extracting of the base layers that it needs. And sometimes it can feel like it’s downloading the whole Internet. So what it first does is assemble all of those base layers, much like a Docker build. And then it goes through five very distinct phases, which are known as the buildpack lifecycle. Now the buildpack lifecycle consists of a detect phase where it basically detects what’s the language and what’s the language family that is being used by the application. And it does that in a few clever ways. So it will look for certain files. For example, in the PHP world it will look for PHP files and some dependency files and things like that. In the Node.js world, it will look for a package.json file. It will look for like go.mod and other things in a Go world. So for every language family it will detect what the language is and it will call buildpacks in order to do the build for that.
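For readers who want to reproduce this step, the following Python sketch assembles a `pack build` invocation of the kind described here. The transcript does not show the exact command, so two things are assumptions: the image name spelling `redmonk-sample`, and the builder `paketobuildpacks/builder-jammy-full`, which is one Ubuntu Jammy based Paketo builder and may not be the one used in the video. The helper `pack_build_command` is a hypothetical name introduced for illustration.

```python
import shlex
from typing import List


def pack_build_command(image_name: str, builder: str, source_dir: str = ".") -> List[str]:
    """Assemble a `pack build` invocation as an argument list."""
    return ["pack", "build", image_name, "--builder", builder, "--path", source_dir]


cmd = pack_build_command(
    image_name="redmonk-sample",                    # assumed spelling of the name used in the demo
    builder="paketobuildpacks/builder-jammy-full",  # assumed builder; the demo only says "Ubuntu Jammy based"
)
print(shlex.join(cmd))

# To actually run the build (requires Docker and the pack CLI to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the example safe to execute on a machine without Docker, while still showing which pieces of the command correspond to the image name and the builder.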
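The file-based detection described above can be sketched in a few lines of Python. This is a simplified illustration, not the logic of any real buildpack: the `MARKERS` table and the `detect` function are hypothetical, and actual buildpacks apply richer rules than a filename match.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Hypothetical marker files per language family; real buildpacks use richer rules.
MARKERS: Dict[str, List[str]] = {
    "php": ["composer.json", "*.php"],
    "nodejs": ["package.json"],
    "go": ["go.mod"],
}


def detect(app_dir: Path) -> List[str]:
    """Return the language families whose marker files appear in app_dir."""
    matches = []
    for language, patterns in MARKERS.items():
        # A family matches if at least one of its patterns finds a file.
        if any(any(app_dir.glob(pattern)) for pattern in patterns):
            matches.append(language)
    return matches


# Try it on a throwaway directory containing a single PHP file.
with tempfile.TemporaryDirectory() as tmp:
    app = Path(tmp)
    (app / "index.php").write_text("<?php echo 'buildpacks'; ?>")
    print(detect(app))
```

Nothing in the application has to declare its language; the presence of the files is enough, which matches the point that no language family is specified anywhere in the demo.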
So in this case, you can see here that it’s picking the PHP buildpacks out of the many buildpacks that it has. So out of the nine buildpacks that it has, it found the need for two that were PHP related. And remember, we’ve not specified anywhere what the language family is. We just got this, there’s no metafiles that are exchanged and things like that. Now, the actual build happens in the build stage. The restore phase is the second, the third rather, where if you have some existing layers, so this is your subsequent build and not the first one. It’s going to say restore those layers and get those layers and make them available for this particular build. And then this whole build sort of took around 8 seconds or something, and then it will create the build environment, etcetera. And what’s interesting is it’ll add software bill of materials for every build stage automatically. And so this opens up a whole can of worms, rather a totally different topic in terms of software supply chain security and things like that, which hopefully we’ll get to discuss in another video in future. But I think it’s important to know that all of that stuff is built into the whole build process.
And it’ll assign a start command and it’ll assign the default workspace. And so all of this typically comes from a Dockerfile, whereas in the case of buildpacks it’s just automatically available. So that’s sort of the big part I wanted to demo in terms of the build, but you’ll only believe me that it’s built if it runs. So quickly, to run this container, I’m going to use a very specific port. And now it’s running somewhere. And so let me go back to my browser and if I do localhost:80. There you go. So that’s our little HTML from what we saw. But this sort of demonstrates that with some very random codebase, you don’t need Dockerfiles, there’s layers, much like any container strategy that you need. And you know, it works. So it’s a very useful tool and if I wanted to do, let’s say, a Node.js app, which unfortunately we can’t because we want to constrain this to a small demo, it follows the exact same process. Perl and Python and what have you, all of it goes through the exact same process.
You get a container image with a lot of stuff that’s automatically built in and based on what a developer would expect for their build and test workflow and what they would expect for deployment on production as well.
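The run step can be sketched the same way. The transcript only shows the browser pointed at localhost:80, so the container port `8080` and the image name `redmonk-sample` below are assumptions, and `docker_run_command` is a hypothetical helper introduced for illustration.

```python
import shlex
from typing import List


def docker_run_command(image_name: str, host_port: int, container_port: int) -> List[str]:
    """Assemble a `docker run` invocation that publishes one container port on the host."""
    return [
        "docker", "run", "--rm",
        "--publish", f"{host_port}:{container_port}",
        image_name,
    ]


run_cmd = docker_run_command(
    image_name="redmonk-sample",  # assumed image name from the build step
    host_port=80,                 # matches the localhost:80 seen in the demo
    container_port=8080,          # assumption: the port the app listens on inside the container
)
print(shlex.join(run_cmd))
```

Because the image produced by the build already carries its start command, the run step needs nothing beyond the port mapping and the image name.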
James: Okay, great. Well, I think it’s important to keep the What Is/How To short and sweet. So we’ve had some explanation about buildpacks, where we are today. We’ve seen a demo, and what they actually look like in practice. So I’d just like to thank you very much, Ram. And that’s another What Is/How To from RedMonk. Thanks a lot.