On package management: Negating the downsides of bundling

Package management is the seemingly simple task of administering software installation, upgrade, and removal. Like many things, it’s only simple when you squint your eyes from half a mile away — as you get closer to the problem, it grows increasingly complex. This is part of the reason why, almost every week, we hear about a new implementation of package management, from Linux distributions to iOS apps to, more recently, JavaScript (e.g. npm, bower, jam).

Perhaps the biggest failing of package management is the refusal to learn from history. Every new package manager behaves as if the problem had never existed before, and then runs into (and only occasionally surmounts) the same issues that were solved years, or even decades, ago. It’s NIH syndrome at its finest, but of course this is simply a microcosm of the tech industry as a whole.

One of the overarching issues under discussion and experimentation in the package-management world is whether to bundle software dependencies or use global, system-level instances. Think Linux versus iOS: Linux distros install a single copy of libraries such as zlib or openssl, whereas iOS and OS X apps tend to bundle the same functionality into every installed package that uses it.

Therefore, on Linux, shared libraries are king: a single copy of the shared library libz.so.X.Y.Z is used by every binary on the system that calls out to zlib’s compression functions. This provides some distinct benefits over bundling (e.g. here, here, here, and here), such as increased security, no added maintenance cost for carrying changes in a forked, bundled copy, and decreased size both in memory and on disk (read the previous links for details).

Global copies also have significant problems for the vendor as well as the end user in terms of robustness and reliability, primarily due to API or ABI changes to the library itself. For the vendor, bundling increases predictability — you know very accurately what the relevant environment looks like on every single system, and this decreases your support burden. In addition, when a global library changes in an incompatible way, it breaks everything on the user’s system that uses it, all at once. This can be somewhat alleviated by keeping a copy of the old library around, but you can run into extremely interesting issues when you have a binary using two versions of the same library at once (via various dependencies). That’s exactly why OS X uses bundling.
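To make the "two versions of the same library at once" problem concrete, here is a minimal sketch that walks a dependency graph and reports any library a binary would end up loading in more than one version. The graph format, the function name, and the package names and versions in the usage note are all hypothetical, made up for illustration; no real package manager is being described.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# A node is a (name, version) pair; the graph maps each node to its direct dependencies.
Node = Tuple[str, str]


def conflicting_libraries(root: Node, deps: Dict[Node, List[Node]]) -> Dict[str, Set[str]]:
    """Return every library reachable from `root` in more than one version."""
    versions: Dict[str, Set[str]] = defaultdict(set)
    seen: Set[Node] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited via another dependency path
        seen.add(node)
        name, version = node
        versions[name].add(version)
        stack.extend(deps.get(node, []))
    # Keep only the names that were reached in two or more versions.
    return {name: found for name, found in versions.items() if len(found) > 1}
```

For example, if a hypothetical app depends on libpng 1.5 and libfoo 2.0, and those in turn depend on zlib 1.2.7 and zlib 1.2.3 respectively, the function flags zlib with both versions. That is the situation you create when you keep an old global copy around next to the new one.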

So what could we do to get the best of both worlds? The ideal scenario for distributors, app developers, and end users would be some kind of blend of the bundling and global approaches that keeps as many of the above benefits and avoids as many of the downsides as possible, for the best possible experience all around. My proposal is to provide bundled applications (not statically linked, but rather a directory containing the app and its dependencies), coupled with two things: (1) use of a linker and loader that prefer the bundled copy while falling back to the system copy, and (2) a package manager with a deep understanding of the bundle and what it contains.
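The lookup order in point (1) can be sketched in a few lines. This is only an illustration of "prefer the bundled copy, fall back to the system copy," not how any real loader is implemented; the function name and the directory layout are assumptions of mine.

```python
from pathlib import Path
from typing import Iterable, Optional


def resolve_library(name: str, bundle_dir: Path, system_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the bundled copy of `name` if it exists, else the first system copy found.

    Hypothetical helper: the bundle directory is always searched first,
    then the system directories in the order given.
    """
    for directory in (bundle_dir, *system_dirs):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None  # not bundled and not installed system-wide
```

The point of the ordering is that the vendor's copy wins by default, so the app behaves the way the vendor tested it, while a library the vendor chose not to bundle (or one the package manager has removed from the bundle) still resolves against the system copy.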

This enables, by default, all the benefits of bundling. At the same time, it doesn’t block the most important advantages of global copies — it allows for the package manager to learn when bundled copies are vulnerable to security holes, it even enables them to be upgraded to compatible versions (or incompatible versions if the app is open source), and it further allows the libraries to be rebuilt by the package manager to “vanilla” versions that are guaranteed to be untouched by the vendor. While this approach doesn’t provide the guaranteed predictability that vendors desire, it does place the priorities of the end user first — a usable, more secure system trumps all.
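Here is a rough sketch of what "a deep understanding of the bundle" might look like from the package manager's side: a manifest of the libraries each bundle carries, a check against known-vulnerable versions, and a search for a compatible replacement. Everything here is hypothetical, including the `Bundle` type, the advisory format, and the version numbers in the usage note. In particular, treating "same major version" as "compatible" is a simplifying assumption of mine, not something real libraries guarantee.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Version = Tuple[int, int, int]  # (major, minor, patch)


@dataclass
class Bundle:
    app: str
    libraries: Dict[str, Version]  # library name -> bundled version


def vulnerable_libraries(bundle: Bundle, advisories: Dict[str, Set[Version]]) -> List[str]:
    """List bundled libraries whose exact version appears in an advisory."""
    return sorted(
        name
        for name, version in bundle.libraries.items()
        if version in advisories.get(name, set())
    )


def compatible_upgrade(current: Version, available: List[Version], bad: Set[Version]) -> Optional[Version]:
    """Pick the newest available version that is newer than `current`, not in `bad`,
    and shares its major number (ASSUMPTION: same major means compatible)."""
    candidates = [
        v for v in available
        if v[0] == current[0] and v > current and v not in bad
    ]
    return max(candidates, default=None)
```

With a made-up bundle carrying zlib (1, 2, 5) and an equally made-up advisory against that version, `vulnerable_libraries` flags zlib, and `compatible_upgrade` would choose (1, 2, 7) over (2, 0, 0) if both were available. An incompatible jump like the latter is the case that needs the app's source.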

Disclosure: Apple is not a client.

6 comments

Richard Nicholson says:

November 12, 2012 at 5:37 pm

Sounds like you are trying to re-invent OSGi? The linker is the powerful OSGi requirements / capabilities metadata – and the resolver behaviours. The new R5 resolver specification supports multiple namespaces enabling support for complex environmental and application layer dependencies. The loader? Well, the OSGi runtime. Also, it is no longer limited to Java! Paremus are wrapping traditional applications like MongoDB and installing them dynamically and managing them as OSGi artifacts – but running on the OS instead of in an OSGi framework!

dberkholz says:

November 12, 2012 at 7:39 pm

Not trying to be hostile to OSGi, but if it solved the problem easily and beyond Java, wouldn’t everyone be using it?

On package management: Negating the downsides of bundling « Striving for greatness says:

November 13, 2012 at 9:32 am

[…] and package-management frameworks to integrate well and deal with issues like security? I muse upon this over at my RedMonk blog. […]

Donnie Berkholz's Story of Data
