I recently talked with Reductive Labs’ Luke Kanies and Google’s Nigel Kersten on the topic of Puppet. First, we go through a quick overview of what Puppet does – establishing the desired configuration of machines by modeling services and then enforcing that model.
As I note in introducing Nigel, while Puppet is well known for managing servers, I haven’t heard much about it being used to manage desktops, which makes his experience that much more interesting. On this note, later in the conversation, Luke lays out the many different scenarios that Puppet is used in: servers, desktops, and newer situations like virtualized installs, even on Amazon EC2.
Using Puppet at Google
Nigel has been using Puppet to manage “many, many thousands” of Mac desktops used at Google by developers and others. He tells us how he got involved with Puppet during WWDC last year and quickly applied it to managing Google’s Mac desktops.
How Puppet Works
I then ask Luke and Nigel to tell us how people usually get started with Puppet. Both recommend starting with a very small service to get started quickly, for example, managing sudo or SSH. As Luke explains, sudo is the command that allows users to execute other commands with administrative privileges, and managing it means ensuring that sudo itself is permissioned and configured correctly for use. Nigel says that, indeed, this is exactly the service they started with.
We then dip into the details of Puppet by talking about the modeling language that it uses. While Puppet is written in Ruby, the modeling language isn’t, being more like “the pseudo-code you write down when you’re planning what a program should look like,” as Nigel says. On the topic of the modeling language, Nigel comments on new users’ common reaction to the language, namely looking for something more script-ish. The point of the language is to simply model resources rather than describing in detail how to go about configuring those resources. As such, you give up some control over how configuration desires are fulfilled – focusing on the what and ignoring the how, as Luke says.
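To make the "what, not how" idea concrete, here is a minimal Python sketch of the sudo example: the caller states the permission bits a file should have, and a small function checks the current state and corrects it only when it has drifted. This is not Puppet's language or implementation; the function name `enforce_mode` and the choice to handle only permission bits (not owner or group) are assumptions made for illustration.

```python
import os
import stat


def enforce_mode(path, desired_mode):
    """Bring a file's permission bits to desired_mode.

    Returns True if a change was made, False if the file already matched.
    Hypothetical helper for illustration; it is not part of Puppet.
    """
    # Read the current permission bits (the file type bits are masked off).
    current_mode = stat.S_IMODE(os.stat(path).st_mode)
    if current_mode == desired_mode:
        # Already in the desired state, so do nothing.
        return False
    # State has drifted, so correct it.
    os.chmod(path, desired_mode)
    return True
```

Calling `enforce_mode` twice in a row with the same arguments changes the file at most once, which is the property that makes it safe to re-run on a schedule.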
Deploying Puppet and Ongoing Use
At this point, I get curious about how Puppet itself is configured and deployed. Each machine to be managed needs the Puppet agent installed that works with the main Puppet server. Nigel also tells us how his team tracked down the unmanaged desktops in Google.
Here, we get into the ongoing use of Puppet once the initial setup is done. Luke talks about his idea that admins and operations people would benefit from thinking more like developers – using the term “infrastructure developer.” For example, Nigel talks about using a version control system to keep track of the configuration models used by Puppet and Luke talks about work that he and Andrew Shafer (at Reductive Labs as well) are doing around bringing unit testing to operations.
Expanding Puppet’s Use in Google
Finally, while wrapping up, Nigel tells us that his group has convinced the people using CFEngine to manage Linux workstations to start switching over to Puppet. More than just cause for Luke to do a little dance, this is interesting because, as Nigel says, it’s encouraging the Mac and Linux operations groups to collaborate more which, one would hope, would increase their overall effectiveness both in human terms (reducing repetitive work across the two groups) and work-product quality (making sure both actually have the exact same effect when desired across both platforms).
More on Puppet
Also, if you’re interested in more on Puppet:
- Listen in on a discussion of how Shopzilla uses Puppet to manage its servers and data-centers.
- Check out these two videos with Luke from last year: on Puppet itself and then on bootstrapping an open source company.
(MC = Michael Coté, LK = Luke Kanies, NK = Nigel Kersten)
MC: Well, hello everyone. It’s a special edition of Redmonk Radio and, uh, today we’ve got two special guests with us. Would you like to introduce yourself first guest number one on the phone?
LK: This is Luke Kanies. I’m the author of Puppet and, uh…yeah, I’m here in Portland, Oregon doing a training class on Puppet right now, actually.
MC: Well, that sounds exciting, and special guest number two?
NK: Hi, uh, my name is Nigel Kersten and I’m the technical lead for Macintosh Operations at Google, so I manage the whole corporate side…
NK: …of the whole production side, the Gmail and search engine
MC: Right. So, so you’re managing the desktops that actual Google-ers or whatnot are using.
NK: Yes, we, we have kind of, uh, I guess…we…because we are such a large organization, things are split up quite nicely where we’re not responsible for sort of frontline desktop support more for sort of building the infrastructure making sure that the Macs work happily with everything else. So we’re part of the general, unique (?) organization, but I run the sort of (?) within that produce just the Macintosh units.
MC: And, and that’s that’s actually one of the, you know obviously, what I wanted to talk with you guys about today, to have both you and Luke on here to, uh, talk about, you know, how Puppet is being used at Google to manage the desktops. So, so along those lines do you want to briefly explain, Luke, what Puppet is and what it does?
LK: Puppet is a tool you can use to essentially describe how you want all your computers to look in, um, not too much detail but sufficient detail and then it will take responsibility for making sure that they actually look like that. So, it’s the tool that, in most cases you can, you can basically use it to get all the work done on all your machines and you should no longer have to login to do much of anything. Um, and it provides centralized, you can get centralized login, and, um, yeah, that’s basically what it does.
MC: So, so do you think it’s sort of fair to say that you, you, I mean, in very broad terms you sort of create a model of the ideal sort of set-up or configuration that you could have either for the machine or something else and then, uh, sort of the, the you know, part of, the first part of Puppet is creating that model, or being able to create that model, and then the second part is being able to enforce that model, if you will, or make sure that model is being used on the machines or set-ups.
LK: Yeah, it’s a little more granular than that…
LK: …because usually what you do is you create a model for every, every, service class you care about, so this is what my Apache servers look like, and this is what my database servers look like, and this is what machines in the data center look like, and this is what machines that this dude’s built look like and then any given machine is going to be an intersection of all of those configurations and its configuration is the result of combining all those and figuring out where the conflicts are and resolving those conflicts.
MC: Right, so, so one, one of the interesting things, Nigel, you know is that a lot of the, the talk I’ve had or discussion whatever a lot of the ways I’ve encountered Puppet have been sort of more server configuration if you will, you know something that maybe is not a desktop sort of thing like you guys are using so, how did you come across to think to use it to manage desktops rather than servers?
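As a rough illustration of what Luke describes here, the following Python sketch combines several per-class models into one configuration for a machine and reports the settings on which the classes disagree. How Puppet actually resolves conflicts is not covered in the conversation, so this sketch only detects them; the function name, the class names, and the settings in the usage note are all hypothetical.

```python
def combine_classes(classes):
    """Merge per-class settings into one configuration for a machine.

    classes maps a class name to a dict of settings. Returns a tuple of
    (merged settings, conflicts), where conflicts maps each disputed key
    to the list of (class name, value) pairs that disagree. The first
    value seen is kept in the merged result; resolving the conflict is
    left to the caller.
    """
    first_seen = {}  # key -> (class name, value) that set it first
    conflicts = {}
    for class_name, settings in classes.items():
        for key, value in settings.items():
            if key in first_seen and first_seen[key][1] != value:
                # Another class already set this key to a different value.
                conflicts.setdefault(key, [first_seen[key]]).append((class_name, value))
            else:
                first_seen.setdefault(key, (class_name, value))
    merged = {key: value for key, (_, value) in first_seen.items()}
    return merged, conflicts
```

For example, combining a hypothetical `apache` class that sets `ntp_server` to one host with a `datacenter` class that sets it to another yields a merged configuration plus a single conflict entry for `ntp_server`.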
NK: So, um, when I arrived at Google, Google brought me over from Australia to the US offices middle of last year and it was just before that whole World Wide Developer Conference that we’re about to have again next week and I was kind of looking for a solution to how to manage everything at Google. The Macs here have very much grown organically. There’s, there’s very much a philosophy here to try to give the engineers whatever tools they want to feel as if they’re as productive as possible.
MC: Right, because they’ll probably get whatever tool they want even if you don’t allow them to have it, right?
NK: Yeah, and…
MC: I mean that’s what I remember from being an engineer.
NK: Yeah, and there’s very much a general philosophy here that it’s not worth spending time trying to block whatever engineers want to do because really your engineering time is the most valuable resource that you have in the company and the corporate infrastructure should be about enabling them instead of telling them what they can’t do. So we have an awful lot of Mac laptops here because, honestly, if you’re working in a Linux environment and you want a laptop, Linux on a laptop just isn’t quite there as far as a whole lot of ease of use with wifi…
MC: Right, right.
NK: …and things like that. So, we just had an awful lot of Mac laptops here. I think we’re sitting, I think we’re well over 90% of our Macs are actually laptops instead of desktops.
MC: Oh, really?
NK: So, yeah. When I arrived, I was kind of looking for a way to try and manage it and I came from a university environment where you had much more sort of iron-clad control over the lab machines and you would, we used to use a tool called Radmind that is very much like a triple (?) system where I just say, I want every file on the operating system to look like this and it would reapply those changes then everything is changed. Here at Google, we really couldn’t get away with such a tool because we’re trying to support a really wide range of uses from sort of marketing and (?) sales people for whom their machine is an appliance to kernel hackers and people writing drivers and people who expect to be able to (?) at a really low level. So, I was looking for a tool that sort of covered that whole range of allowing us to just specify a minimal state for the machines and make sure that state was ensured as far as enforcing a security policy.
MC: Right, right. So you could sort of specify instead of specifying what the whole machine looks like more just a subset of it. Like, I’m only interested in this part being maintained and people can go off and do whatever crazy stuff they want and it won’t necessarily affect that.
NK: Exactly, and I think that’s one of the really strong points about Puppet in this sort of systems configuration management. It makes it really easy to get started. You can just go, “OK, what’s one thing that I want to manage on all my machines…
LK: Right, Right.
NK: …and you come up with a model for that and you know you’re not breaking everything else on the machine. The barrier to entry is actually really low because of that.
MC: Right, that makes sense. So, so how many of these desktops are you guys currently managing?
NK: So, we have, um, I can’t actually get too specific…
MC: Sure, sure.
NK: …but we have many, many thousands of Mac desktops and laptops. So, what actually happened was, I turned up to WWDC, and Jeff McCune, who used to work at the Ohio State, who did a lot of the early back-work with Puppet and he’s a good friend of mine who had sort of been in my ear for a long time going, “You should try Puppet. You should try Puppet. It’s amazing.” And I turned up with my new workmates. I’d only been at Google a week or two and we went to Jeff’s talk on Puppet and that really impressed everyone and actually by the end of the talk we were sort of VPN’ed into service back at Google. We had clients applying a sort of minimal configuration set and within a week or two we had Puppet doing real work on hundreds and hundreds of machines in the pilot.
NK: So it was really fast to get started.
MC: And, and, and, you know, maybe both of you can sort of contribute, you know, answer this if you want to. I mean from taking that experience like you, last year, were saying you were at WWDC. What sort of like experience of getting started with Puppet? Like, what, what’s kind of the first step you go through and then what’s the next step and so forth and so on?
LK: I was, um, I was working with something small but important and in those environments small and important almost always means it’s pretty simple to manage, but it’s also the thing that if it is broken then you can’t recover your computer very easily and if it is working then you can fix almost anything else if you have to. Um, the next (?) would be SSH and you can probably expect to be managing Sudo in, you know, minutes and (?) on one machine and then getting to the point where you have it on a number of machines is probably minutes per machine once you have a base idea of how it works.
MC: And so, what do you mean example, I mean, what do you mean, exactly, of managing Sudo or managing SSH?
LK: to the thing that matters most was to, um, the correct configuration would have to be in place, but most important that’s not usually what breaks, what usually breaks is that the mode of both the Sudo binary and the Sudo config file has to be exact and if they aren’t exact then Sudo will just tell you, “I’m sorry I refuse to work because somebody may have made some change.” (?) you may have to wait a few minutes, but pretty soon they’ll be able to get back in and get (?) again. Over the course of my career that’s been one of like the, “Oh, no that’s broken.” You can’t fix anything…
LK: …You have to reboot the computer and run it on some sort of weird mode or you have the “break glass in case of emergency” root password somewhere that someone has to use to get on to the docs and, um, managing the computer is managing those files and the modes to get on those files.
MC: Right. And, and, so in the, um, in the kind of modeling that we were talking about earlier, you would sort of specify, you know that idea of like the Sudo-service or whatever you might like to call it. Like, here’s this ideal state that I want it to exist in. And then I would assume, and you can correct me if I’m wrong, that you would sort of apply that state to a given set of machines or, you know, OS-es running somewhere if they’re virtualized or physical.
LK: Exactly. And virtualization is another place where Puppet kind of shines because everyone has these great tools to manage how many virtual machines you have and whether they’re running and what they’re doing except that none of them really manage inside the virtual machine. Anyone can say, “Start a 100 of these!” and then they’re like, “What’s wrong?” And you say “I, I don’t know.”
LK: So, you have these great tools to manage whether you have machines running. But, once they’re running you’re kind of stuck until you’re back in the battle days of SSH and a (?) So, Puppet, you know, is great for what Nigel is doing and people use it for servers all the time but it’s also, uh, great for virtualization because it has this, uh, this ability to get inside the virtual machine unlike most of these virtualization managers.
MC: Right. And, and so when, when you were applying this to managing the desktops in Google, Nigel, how did you come up with the state that you wanted? Was that something that you guys had already figured out and you just had to map it into Puppet? Or, or how did that process work out?
NK: Um, so we actually started with exactly the example that Luke’s talked about with managing sudo and by the end of the talk we had (?). What is the vocab to be? It’s like a five-line Puppet syntax saying, “I wanna manage the permissions, owner, and group on the sudo file.” It became really apparent even if you treat Puppet at the simplest level of managing certain attributes of files, that’s an incredibly powerful thing. That by no means is all that Puppet can do. But simply managing the permissions, the owner, and group and then at the next stage the content of certain files is incredibly powerful and there are a lot of tools that do that, but Puppet has a really clear and really nice syntax of doing this and anyone who’s had the most, sort of, basic and introductory experience of scripting can pick up the Puppet syntax. It’s incredibly simple.
MC: And the syntax, or scripting, is sort of a subset of Ruby if I recall, right?
LK: It’s more…It’s a custom language that I wrote, (?), what some of them call internal DSLs, which are syntaxes that look like a special custom language that are actually just pure Ruby. Um, Puppet actually has its own custom language. I’ve written my own parser and my own grammar, um, lexer and abstract syntax tree and that kind of fun stuff that no one knows anything about.
LK: And no one would want to.
LK: And, and so it bears some resemblance to Ruby, it’s kind of funny it’s kind of this weird mash-up of Ruby, um, the (?) configuration file format and, you know, maybe some of my own twisted (?) in there for fun, um, but it’s not pure Ruby, it’s written in Ruby, the whole thing is, but, um, the language itself definitely is not Ruby.
MC: Right, right.
NK: I find that the language looks quite like pseudo-code in a lot of ways. It’s very abstract and because it’s so simple it almost looks like it’s the sort of pseudo-code that you write down when you’re planning what a program should look like.
MC: Ah, right, right. I mean that always seemed like one of the great promises of a domain-specific language or a DSL you can be more natural speech with it rather than friendly to the compiler or the computer, which is always an interesting aspirational thing for any DSL to see if it actually works out. It seems like having the narrowest scope possible is what enables that to actually happen rather than getting distracted.
LK: Yeah, and multiple times people have talked about replacing Puppet’s language with a full Ruby internal DSL narration and all these things. And there’s always somebody in the community who says, “I don’t want that, I want my sys admin to be totally limited. They can only do a few things.” Um, but also there’s this big benefit of having to think about things in a Puppet way. Puppet is all about resources. Puppet is not a scripting language, it’s not even to manage files. It’s not even to say, “This file should contain this extra string or whatever.” And, um, if you follow how Puppet works, it works very well, and if you want to use all these crazy other language features, you’re probably not going to get the most out of Puppet because if you do that you’re basically trying to use Puppet like it’s Perl. And if you’re doing that then just use Perl.
NK: And it’s a very common path you seem to see people go through when they enter the Puppet community. I know I went through this where you sort of go through the language and the syntax guides and you’re like, “Is that all? Like, is that all the language can do?” And then you’ll see people post to the mailing list going, “You know….I really wish I could do blah, blah, blah.” Which the language doesn’t seem to support and you’ll see the old-timers respond and go, “Look, you’re actually thinking about this the wrong way. You need to abstract what you’re trying to do in a more… you need to keep the resources that you’re trying to manage and define them in a more abstract manner.” And that’s the point at which Puppet gets much more powerful.
MC: Right. So, so I mean what would you say moved you from the new user to the old timer mindset? Was there some chain of events or specific thing that happened?
NK: Um, so, I spent an awful lot of time working on just managing parts of files and making sure that a certain line exists. I was doing…You have these two really base building blocks in Puppet, which basically lets you shell out to a command and file. And you can come up with really horrible, convoluted Puppet syntax that like, you know, “If this (?) holds true, then add this line to this file. If it doesn’t hold true, then do this.” But, you’re really just creating a lot more work for yourself and what you want to do is define a resource in the Puppet syntax or in native Ruby. And once you’ve done that, you end up with a much simpler Puppet syntax to manage the resource that you’re actually concerned with.
MC: Right, right, it sounds like a very declarative mindset of things where instead of, uh, specifying how you want a desired state to come about, you just declare what you want that desired state to be and let Puppet manage that for… Let the actual execution, you know, manage that for you.
LK: And that’s kind of Puppet’s key cue-in is that because so many people are managing similar or the same sort of (?) or in (?) different environments, being able to focus on the what and kind of ignore the how is very valuable because they don’t have to worry about whether they’re using RPMs or whether they’re using (?) or crazy things like that. In fact Puppet has tools you can use to query the current system. I’m slowly, but blissfully, forgetting how to use a lot of these tools because Puppet can install these packages or create these users or start these services faster than I can remember or extract how to use the local tools to use them to do so and then run that command like that. It’s not quite at the point where it’s always faster to use Puppet to do interactive work, but it’ll probably do that pretty soon.
NK: Yeah, definitely. I find myself creating small little snippets of Puppet syntax and running them in an ad-hoc manner to do things that I would have previously written much more complex command lines to do. Because once you’ve got that work and you’ve extracted things away, it becomes much simpler to do it in Puppet.
MC: Right, right, you can actually start using software as sort of more of a minion or henchman instead of a tool that you have to handle directly yourself. Which always seemed like… that seems nice.
MC: So one of the other things that you were mentioning that I was curious about is, um, so essentially you’re declaring the ideal state or the model. You’re describing all these things, so how do you set things up…how does Puppet push that out and enforce that? I mean, does it do it over SSH or is there an agent on the machine or how does that work out?
LK: Well, um, Puppet doesn’t push, it pulls, so you’ve got your server sitting there cleaning its nails waiting to do something. And every half an hour by default, you can tune this, but by default every half an hour every client wakes up and does “hey, what’s my configuration?” pulls that configuration down and applies it. Um, I’ve used every half an hour, but in truth you don’t want every machine on the hour, every half an hour, to wake up and do this. You want to kind of…
MC: Right, you want to stagger it out, right?
LK: …Exactly. So what Puppet does is that basically it assumes that you’re starting your machines at, um, that when you set the Puppet servers, it will essentially at random based on when you start the computer up and things like that. And, so, um, it gets most of its low-down thinking of that whenever the servers are started it’ll start waking up every half-hour from that. But there’s also modalities in Puppet to enforce an additional stagger to say make sure our machines don’t hit at once.
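One simple way to picture the staggering Luke mentions is to give each client a fixed offset within the half-hour interval. The Python sketch below derives that offset from a hash of the hostname, which is a technique swapped in here for illustration and not necessarily how Puppet does it (Luke describes the spread as coming from when each machine happens to start). The function name `splay_offset` and the 1800-second default are assumptions.

```python
import hashlib


def splay_offset(hostname, interval_seconds=1800):
    """Return a stable per-host delay, in seconds, within the check-in interval.

    Hashing the hostname spreads clients across the interval without any
    coordination, and the same host always gets the same offset.
    Hypothetical helper for illustration; it is not part of Puppet.
    """
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % interval_seconds
```

A client would wait `splay_offset(its_hostname)` seconds once, then check in every `interval_seconds` after that, so a fleet started at the same moment still spreads its requests across the interval.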
MC: So, so, how did you manage, Nigel, getting the client on the thousands of desktops? What was that like?
NK: Um, so, we had this rather large slate of unmanaged machines, so all we really knew about them was that they had a certain administrator and user password. So, essentially what we did is we diverted all the syslogs from all our DHCP servers, scanned them for Macintosh specific like hardware, Mac prefixes…
NK: ….Found those machines. SSH-ed into them and installed Puppet.
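The discovery trick Nigel describes can be sketched in a few lines of Python: scan log lines for hardware (MAC) addresses and keep the ones whose vendor prefix belongs to Apple. The log line format, the function name, and the two prefixes listed are assumptions for illustration; a real scan would take its prefix list from the IEEE OUI registry and would match whatever format the DHCP server actually logs.

```python
import re

# Illustrative vendor prefixes only (assumed for this sketch); a real list
# would be built from the IEEE OUI registry.
APPLE_OUI_PREFIXES = {"00:17:f2", "00:1b:63"}

# Six colon-separated pairs of hex digits, e.g. 00:17:f2:aa:bb:cc.
MAC_PATTERN = re.compile(r"\b([0-9a-f]{2}(?::[0-9a-f]{2}){5})\b", re.IGNORECASE)


def find_mac_clients(log_lines, prefixes=APPLE_OUI_PREFIXES):
    """Return the sorted, de-duplicated hardware addresses with a known prefix."""
    found = set()
    for line in log_lines:
        for address in MAC_PATTERN.findall(line):
            address = address.lower()
            # The first three octets ("xx:xx:xx") identify the vendor.
            if address[:8] in prefixes:
                found.add(address)
    return sorted(found)
```

The resulting address list would then be mapped back to IP addresses from the same log lines, which is the step that makes the SSH-and-install pass possible.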
MC: Oh, right, right. That’s kind of clever.
NK: It was kind of clever. I was kind of impressed. One of the guys on my team came up with that. And it was really quite amazing because as I was watching the Puppet server and seeing clients sort of get created and start to check in, it was almost sort of this viral thing…
NK: …But as users started connecting their machines to the network, suddenly Puppet starts appearing on it and managing a very minimal configuration. But once it’s on there, the possibilities are endless. You no longer need to SSH into the machine and do anything. It’s just, you change the Puppet server and the clients change appropriately.
MC: Right, right, right. You can go address your coliseum of Puppets and have them go out and do your bidding, as it were.
MC: So that was actually the last thing that I was interested in. Once you have it set up, like what, I mean, both on the sort of configuring the ideal model and everything and then also just having it, uh, take place, I mean what’s been the experience of updating it and just keeping things on-going and updated. Like, when there’s some new item that you find you have to manage on the Macs.
NK: Um, so, we’ve actually been, one of the relatively recent additions to Puppet has been the idea of environments where a client machine can say, “This is my desired environment.” And the server, you apply different configurations based upon the environment the client tells you it wants to be in. So we use that with, um, we use that as sort of the standard, unstable testing stable release (?).
MC: Uh huh.
NK: So those of us who sort of deal with Puppet every day live in the unstable branch and regularly break each other’s machines and yell at each other over IRC.
NK: And once that’s working happily, we push it out to the testing branch which tends to be our more interested field tech and front-line support people. Which also gives them the advantage of knowing what (?) are coming to the end users who they’ll have to support. And once we go through that process and everyone on the testing group is happy with what Puppet is doing, then we push to our (?) branch. And hopefully our clients don’t really notice that anything has happened. One thing that is kind of the missing piece in what I’ve just described is version control and really core to the concept of how you manage Puppet is that everything’s in version control. So what we actually do is we check everything into our version control system and our various Puppet servers around the world pull from version control at regular intervals.
MC: Oh, so…
NK: So you can have peer review and all that sort of…
MC: You use Version control as sort of a repository for configuration that Puppet pulls from.
LK: And, really, like I’ve been pushing the whole developer thing to think more like sys admins for a long time and then this is definitely part of it. Most of sys admins that started have eventually adopted version control, but Puppet really, you know, it’s just not that. You can use Puppet as operations control but you will die. You will go crazy. You got all this code, and if you’re trying to manage this code…developers learned a long time ago, well, most developers learned a long time ago, you’ve got a bunch of code, you need the version to deal with that code. You need to be able to take advantage of the ability to easily roll back, do queries, do logs, um, you know svn blame is nice. Um, and so….
LK: So starting to use the tools that developers are used to using and I’ve really perfected how they’re used in solving system problems. It’s just clearly a win. I’m pushing towards also making it so you can write unit tests for your Puppet code. Um, that’s something that I plan to have available by the end of the year. And that’ll be one step closer to having the same rigor that developers have provided in our systems stuff. Um, my partner, Andrew Shafer, just came up with the term “infrastructure developer” because that’s kind of what I’m trying to turn people into. You know, everyone walks into Puppet as a sys admin and kind of (?)…I’m not really doing what I thought with systems stuff, which is kind of cool.
MC: Yeah, yeah, that is interesting. That actually, um, that’s a wider trend that I see a lot of people talking about it trying to, uh, get development mentality and operations mentality from both sides of the fence to be closer with each other because that seems…
NK: Definitely, yeah.
MC: …The IT that we have nowadays.
NK: One of the things that I think goes along with unit tests and having version control, and something we’re looking forward to, is continuous builds, where we have dedicated clients continually applying the Puppet configuration. So that when you check a change into Puppet, you can go, “This should only affect Linux laptops or this should only affect Macintosh desktops with these kinds of users who are using them.” And once you’ve tagged your submission to the version control system, the continuous builds run on all your clients and the Puppet configuration doesn’t actually get checked into the real servers until the continuous build machines all say, “Well, yes, this is what’s actually happening.” Or you might push out a change that you think only affects your Linux workstations and then my continuous build Mac machine goes, “Hey! Look, my state that I meant to be in has actually changed” that (?) pushed to the real servers. So that’s something that we’re aiming for.
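The gate Nigel is describing can be sketched as a comparison: for each kind of dedicated build client, compare the state it ends up in before and after a change, and flag any platform that changed even though the change was not tagged for it. The Python below is only an illustration of that check under assumed inputs; the function name, the platform labels, and the idea of representing a client's state as a dict are all hypothetical.

```python
def unexpected_changes(before, after, intended_platforms):
    """Return platforms whose state changed without being tagged for the change.

    before and after map a platform label to the state observed on that
    platform's build client. intended_platforms lists the platforms the
    change was tagged as affecting. Hypothetical helper for illustration.
    """
    all_platforms = before.keys() | after.keys()
    # A platform counts as changed if its state differs, appears, or disappears.
    changed = {p for p in all_platforms if before.get(p) != after.get(p)}
    return sorted(changed - set(intended_platforms))
```

An empty result means the change stayed within its declared scope and could be promoted; a non-empty one is the "Hey! Look" case from the Mac build machine.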
MC: Right, right, that sounds interesting. Well, did you guys have anything else that you wanted to mention or talk about before we wrap up?
NK: Um, I guess the only other thing was that although I’ve really been talking about Macs here, one of the nice things about what we’ve done with Puppet is that we’ve managed to convince all the people who were previously using CFEngine to manage their Linux workstations to switch over to Puppet. So, going forward all our corporate Linux platforms are going to be managed by Puppet, not just the Macintoshes.
MC: Oh, great.
LK: I’m doing a little dance over here. I’m very excited to hear that.
[MC and NK laugh]
NK: It is very exciting, and it’s actually been really good here because it’s encouraged a lot more collaboration between the Linux and Macintosh teams because it does become kind of simple as, I guess, another example of the resource would be a directory service node so that we can just sit here and go, “OK we want all our clients to be talking to the specific LDAP server.” And the Puppet syntax looks very simple but behind the scenes what you have is a much more complicated type that goes, “OK, this is a Macintosh. Let’s set it up using open-directory, directory service nodes, make sure this node is connected and clients can talk to it.” And the Linux boxes set up the appropriate configuration.
MC: Right, well, that does sound interesting. Yeah. Well, uh, thanks to the both of you for taking the time to talk with us today. I appreciate it.
NK: Thank you.
LK: Thank you.
Disclaimer: Reductive Labs is a client and sponsored this podcast.