Collaborative Systems Management

When something goes wrong in your system, one of your best hopes is that someone else has already encountered this problem and documented how to fix it. This works well with open source projects because mailing lists, blog posts, and FAQs are all on the web and have high traffic.

The practice of going to Google and doing those searches doesn’t have a formal name that I know of. For the purpose of discussion, I’ll use the phrase “collaborative systems management,” meaning diagnosing and troubleshooting by pulling documented problem/fix write-ups from any available source. The flip side of consuming that data is publishing it, which is currently done primarily on mailing lists and other group forums.

Collaborative Systems Management as a Feature

Systems management tools haven’t “featurized” collaborative systems management as much as you’d expect (or, maybe, “as little as you’d expect” if you’re one of the cynics out there ;>). But, recently, there’s been an explosion of systems management applications that are either baking collaborative systems management into future releases or providing the feature right now.

CITTIO WatchTower

James and I got a briefing from CITTIO last week on their WatchTower product. WatchTower is a “standard” systems management application, but it has some interesting collaborative features, namely integration with O’Reilly Safari. When looking at an event or problem, WatchTower pulls related information in from Safari’s online books, providing admins advice on how to fix the system. That is, part of WatchTower is a systems management mashup with Safari. They claimed they were the only ones using Safari like this.

Splunk Base

Splunk provides another approach to collaborative systems management: Splunk Base, a wiki to collect and tag events, logs, and other systems administration data. For example, there’s a page on events you can track in WebLogic logs.

I haven’t given Splunk Base a thorough look over — it is, after all, Sunday afternoon — nor been briefed on it. But, Dana Gardner has a good discussion of it in his most recent podcast (which is what kicked off this post).


Other people I’ve spoken with are baking collaborative systems management into upcoming releases. Usually, it involves a community web site, users contributing data to that site, and the application harvesting that data to add contextual information from the public sites.

Formatting and Sharing

The things that spring to mind about the above are:

  • Could we come up with a microformat for each piece of systems management data? Anyone could publish the data in whatever visual form they wanted, but add in the formatting needed for 3rd parties to easily harvest the data instead of screen-scraping it. That low-barrier-to-entry approach to publishing would mean more collaborative systems management data out there for applications to harvest, hopefully translating into faster fixes for problems. For the microformat gang, it’d mean a wider audience and an interesting foot in the door of enterprise software. Foots in doors are always nice ;>
  • Will providers of collaborative systems management sites allow competitors to harvest the data? Could we establish a site not owned by one vendor, but fed and used by all of them? That is, could we hand over collaborative systems management to the community (whichever one that would be), and commoditize it?
  • Or, if each vendor ends up with their own site, could we get some clever person to layer an aggregate site on top of all the different sites and provide a normalized view?
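
To make the microformat and aggregation ideas a bit more concrete, here is a rough Python sketch of what harvesting could look like. Everything specific in it is my own invention for illustration: the class names (sysmgmt-fix, product, symptom, resolution), the function names, and the sample page. No such microformat exists that I know of, and the parser assumes simple, well-formed markup with no void tags like br inside an entry.

```python
from html.parser import HTMLParser

# Hypothetical class names: invented for this sketch, not an existing microformat.
ROOT_CLASS = "sysmgmt-fix"
FIELD_CLASSES = {"product", "symptom", "resolution"}


class FixHarvester(HTMLParser):
    """Collects one dict per element marked with ROOT_CLASS."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None  # record being filled, or None outside an entry
        self._field = None    # field whose text is currently being read
        self._depth = 0       # nesting depth inside the current entry

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._current is None:
            if ROOT_CLASS in classes:
                self._current = {}
                self._depth = 1
            return
        self._depth += 1
        for name in classes:
            if name in FIELD_CLASSES:
                self._field = name
                self._current[name] = ""

    def handle_data(self, data):
        if self._current is not None and self._field is not None:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if self._current is None:
            return
        self._field = None
        self._depth -= 1
        if self._depth == 0:  # the entry's own closing tag
            self.records.append({k: v.strip() for k, v in self._current.items()})
            self._current = None


def harvest(html):
    """Return the problem/fix records found in one page."""
    parser = FixHarvester()
    parser.feed(html)
    parser.close()
    return parser.records


def aggregate(pages):
    """Merge records from several sites into one list, tagging each with its source."""
    merged = []
    for source, html in pages.items():
        for record in harvest(html):
            merged.append({"source": source, **record})
    return merged


if __name__ == "__main__":
    # Made-up sample page; the surrounding blog markup is ignored by the harvester.
    page = """
    <h1>Some admin's blog post</h1>
    <div class="sysmgmt-fix">
      <span class="product">ExampleApp</span>
      <p class="symptom">Fails to start</p>
      <p class="resolution">Restart the service</p>
    </div>
    """
    print(aggregate({"example-blog": page}))
```

The point of the sketch is that the publisher keeps full control over how the page looks, while a harvester only needs the agreed-upon class names. The aggregate function is the “normalized view” in miniature: every site’s records end up in the same shape, with the source attached.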

Does it Matter?

Ultimately, as the disclaimer below hints, the proof will be in the pudding. All of this sounds cool, but will it really make the jobs of sysadmins easier, or will it just be a flashy, cool-sounding feature that helps sign deals?

In the development world, the ability to search Google for common and uncommon errors in open source libraries is part of what makes using those libraries valuable. If programmers couldn’t look up how to resolve Obscure Tomcat Error #54 in a matter of minutes, they would be much more reluctant to use Tomcat.

My gut feel is that the same effect would happen in the systems management world: it’s just a matter of getting users to contribute back all the info they come across instead of trapping the information behind their firewalls. Most of the “requirements” for this approach are cultural, namely, getting people to contribute.

More than likely, detractors of collaborative systems management will point to the need for people to contribute as a weakness. They’ll also say that if anyone can contribute, you’re going to get bad data, which is going to end up hurting you. Those arguments, of course, sound like the familiar OSS freak-outs I’ve been hearing forever and still, unfortunately, hear all the time.

Disclaimer: BEA is a client. Also, we haven’t talked with any customers or users of CITTIO, Splunk, or the “others.” Thus, all of the info above is based on what vendors have told us and hasn’t been “verified” as being real or otherwise truly as cool as it sounds ;>

Categories: Community, Ideas, Systems Management.
