RedMonk

On package management: Negating the downsides of bundling

Package management is the seemingly simple task of administering software installation, upgrade, and removal. Like many things, it’s only simple when you squint from half a mile away; as you get closer to the problem, it grows increasingly complex. This is part of the reason why, almost every week, we hear about a new implementation of package management, from Linux distributions to iOS apps to, more recently, JavaScript (e.g. npm, bower, jam).

Perhaps the largest failing of package management is the refusal to learn anything from history: every new package manager behaves as if the problem had never existed before, and it runs into (and only occasionally surmounts) the same issues that were solved years or even decades ago. It’s NIH syndrome at its finest, but of course this is simply a microcosm of the tech industry as a whole.

One of the overarching issues under discussion and experimentation in the package-management world is whether to bundle software dependencies or use global, system-level instances. Think Linux versus iOS: Linux distros install a single copy of libraries such as zlib or openssl, whereas iOS or OS X apps tend to bundle the same functionality into every installed package that uses it.

Therefore, on Linux, shared libraries are king: a single copy of the shared library libz.so.X.Y.Z is used by every binary on the system that calls zlib’s compression functions. This provides some distinct benefits over bundling (e.g. here, here, here, and here), such as increased security, no added maintenance cost for carrying changes in a forked, bundled copy, and decreased size both in memory and on disk (read the previous links for details).

Global copies also have significant problems for the vendor as well as the end user in terms of robustness and reliability, primarily due to API or ABI changes to the library itself. For the vendor, bundling increases predictability — you know very accurately what the relevant environment looks like on every single system, and this decreases your support burden. In addition, when a global library changes in an incompatible way, it breaks everything on the user’s system that uses it, all at once. This can be somewhat alleviated by keeping a copy of the old library around, but you can run into extremely interesting issues when you have a binary using two versions of the same library at once (via various dependencies). That’s exactly why OS X uses bundling.

So what could we do to get the best of both worlds? The ideal scenario for distributors, app developers, and end users would be some blend of the bundling and global approaches that keeps as many of the above benefits and avoids as many of the downsides as possible, for the best possible experience all around. My proposal is to provide bundled applications (not statically linked, but rather a directory containing the app and its dependencies), coupled with two things: (1) use of a linker and loader that prefer the bundled copy while falling back to the system copy, and (2) a package manager with a deep understanding of the bundle and what it contains.
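To make point (1) concrete, here is a minimal sketch in Python of the lookup order such a loader might follow: check the application's bundle directory first, then fall back to the system directories. The function name `resolve_library` and the directory layout are assumptions made for illustration. A real loader does this in native code; on Linux, a run path containing `$ORIGIN` gives a broadly similar "look next to the app first" behavior.

```python
from pathlib import Path
from typing import Iterable, Optional


def resolve_library(
    name: str, bundle_dir: Path, system_dirs: Iterable[Path]
) -> Optional[Path]:
    """Return the path a bundle-first loader would pick for `name`.

    The bundled copy wins when it exists; otherwise the first system
    directory that contains the library is used. Returns None when the
    library cannot be found anywhere.
    """
    # Search order is the whole policy: bundle first, system second.
    for directory in [bundle_dir, *system_dirs]:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

With a hypothetical app at `/opt/myapp`, calling `resolve_library("libz.so.1", Path("/opt/myapp/lib"), [Path("/usr/lib")])` would return the copy under `/opt/myapp/lib` if the vendor shipped one, and the system copy otherwise. Deleting a bundled library is then enough to switch that app over to the global copy.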

This enables, by default, all the benefits of bundling. At the same time, it doesn’t block the most important advantages of global copies — it allows for the package manager to learn when bundled copies are vulnerable to security holes, it even enables them to be upgraded to compatible versions (or incompatible versions if the app is open source), and it further allows the libraries to be rebuilt by the package manager to “vanilla” versions that are guaranteed to be untouched by the vendor. While this approach doesn’t provide the guaranteed predictability that vendors desire, it does place the priorities of the end user first — a usable, more secure system trumps all.
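As a rough illustration of what "a deep understanding of the bundle" could mean in practice, the sketch below audits a bundle's manifest against a list of security advisories and reports which bundled libraries should be replaced. Everything here is hypothetical: the manifest format, the `BundledLib`, `Finding`, and `audit_bundle` names, and the simplifying assumption that a fix within the same major version is a compatible upgrade.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]  # (major, minor, patch)


@dataclass(frozen=True)
class BundledLib:
    name: str
    version: Version


@dataclass(frozen=True)
class Finding:
    lib: BundledLib
    fixed_in: Version
    compatible: bool  # assumption: same major version means a drop-in upgrade


def audit_bundle(
    manifest: List[BundledLib], advisories: Dict[str, Version]
) -> List[Finding]:
    """List bundled libraries older than the first fixed version.

    `advisories` maps a library name to the earliest version that
    contains the security fix.
    """
    findings = []
    for lib in manifest:
        fixed_in = advisories.get(lib.name)
        # Tuples compare element by element, so (1, 2, 3) < (1, 2, 5).
        if fixed_in is not None and lib.version < fixed_in:
            compatible = lib.version[0] == fixed_in[0]
            findings.append(Finding(lib, fixed_in, compatible))
    return findings
```

A package manager could upgrade the bundled copy in place for each finding marked `compatible`, and for the rest either rebuild the app (if it is open source) or warn the user. The version numbers fed to it would come from the bundle's metadata and an advisory feed, neither of which is specified here.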

Disclosure: Apple is not a client.

by-sa

Categories: apps, open-source, operating-systems, packaging.

  • Richard Nicholson

    Sounds like you are trying to re-invent OSGi? The linker is the powerful OSGi requirements / capabilities metadata, plus the resolver behaviours. The new R5 resolver specification supports multiple namespaces, enabling support for complex environmental and application layer dependencies. The loader? Well, the OSGi runtime. Also, OSGi is no longer limited to Java! Paremus are wrapping traditional applications like MongoDB, installing them dynamically, and managing them as OSGi artifacts, but running on the OS instead of in an OSGi framework!

  • dberkholz

    Not trying to be hostile to OSGi, but if it solved the problem easily and beyond Java, wouldn’t everyone be using it?

  • http://ringo.de-smet.name/ Ringo De Smet

    Richard, I am very interested to learn about OSGi outside of the Java world. Do you have pointers to public articles of such usage?
