Recently, I sat down with IBM’s Phil Fritz to talk about Tivoli Live – he’s the product manager for the service. We start out with a discussion of what Tivoli Live is, the types of things it monitors, services it offers (like reporting), how the SaaS angle works, the types of users they’re seeing so far, and what we can expect in the Tivoli Live roadmap. After we lay that groundwork, we get into a brief but thorough demo of Tivoli Live, from logging in, to looking at the monitored infrastructure, to browsing the available reports.
And, if you’re interested in more, check out my initial write-up of Tivoli Live from late November.
Interview
Transcript
Michael Coté: Well, hello everybody! Here we are in the Austin Tivoli offices to talk about one of the new releases that’s come out pretty much towards the end of 2009 and that’s the Tivoli Live offering that IBM has come out with. To go over that, we have a guest with us. Would you like to introduce yourself?
Phil Fritz: Sure, my name is Phil Fritz. I’m a product manager in Tivoli and one of my responsibilities is the new IBM Tivoli Live Monitoring services that we just announced.
So what we’ve done essentially with Tivoli Live Monitoring is that we took our portfolio for resource and application monitoring, using the same software that we sell on premise and that customers have been using in the enterprise, and essentially partnered with our colleagues in GTS to make this software available over the web using a monthly subscription model.
So really the differences are more in terms of the business model, the subscription, and the customer experience in using it as opposed to new technology. So we are really working off of our existing software offering. So we haven’t dumbed it down or tweaked it in any way shape or form from a functionality perspective.
Instead, what we are doing is hosting the management servers and the infrastructure, allowing customers to sign up for a certain monthly fee, and they can download the monitoring agents and instrument them to talk back to our hosted servers.
Michael Coté: It’s kind of like air diagrams: it sounds like essentially you’re hosting the – I always call it the “brains,” but – sort of the central part of the monitoring platform somewhere in the cloud or on your servers and essentially someone might have stuff behind the firewall they want to monitor and so there are agents that they download or some little piece of software depending on if it’s agent or agentless.
Phil Fritz: Right.
Michael Coté: But there is a piece of software that they download behind their firewall that basically talks to the “brains” in the cloud if you will.
Phil Fritz: So there are three essential services we offer as part of Tivoli Live Monitoring services, so you described two of them very well. There’s what we call our “touchless” or “agentless” service, which is our technology for downloading a data collector that can reside in your environment, but we don’t have to touch or reside on every single server. We collect data, really availability information more than anything else, things that are available natively from the operating system or a network device you are trying to monitor.
So from a customer perspective, you only really need to download one thing and you have pretty good visibility – is it up, is it down, is it back in service.
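To make the agentless idea concrete, here is a minimal Python sketch of a single collector checking several targets for basic up/down availability. It is only an illustration of the pattern Phil describes, not how the Tivoli data collector works: the function names `is_up` and `poll` are hypothetical, and a plain TCP connection attempt stands in for whatever native checks the real service uses.

```python
import socket


def is_up(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the target as down.
        return False


def poll(targets, timeout=2.0):
    """Check every (host, port) target from this one collector.

    Nothing is installed on the targets themselves; the collector only
    uses what each target already exposes on the network.
    """
    return {f"{host}:{port}": is_up(host, port, timeout) for host, port in targets}
```

A caller would pass a list such as `[("10.0.0.5", 22), ("10.0.0.6", 443)]` (made-up addresses) and get back a dictionary of up/down flags, which is roughly the level of detail an availability-only check can offer.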
The second service we offer is our Distributed Monitoring Services, and those are using the agent-based technology. The difference there is that we do have to deploy those agents or data collectors on each device you want to monitor. The benefit is you get a much richer set of data, much deeper information than you would while using the agentless service, and you also get the ability to automate.
So you can actually take actions on the computer. So if it is something like deleting temp files to make more capacity available on that machine or actually recycling that machine to – use that time auto condition if it’s not working in –
Michael Coté: I always call it the “problem fix.”
Phil Fritz: – exactly, reboot, and restore. So anyway that’s the tradeoff, is that you do have to deploy them locally but you get much more powerful automation functionality out of that.
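The temp-file example Phil gives can be sketched as a threshold check followed by a corrective action running locally on the monitored machine. This is a hedged illustration in Python, not Tivoli agent code: the names `disk_used_fraction`, `delete_old_files`, and `check_and_fix`, the 90% threshold, and the seven-day age limit are all assumptions made up for the example.

```python
import os
import shutil
import time


def disk_used_fraction(path):
    """Fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def delete_old_files(directory, max_age_seconds, now=None):
    """Delete regular files in `directory` older than max_age_seconds.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    with os.scandir(directory) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


def check_and_fix(directory, threshold=0.90, max_age_seconds=7 * 24 * 3600,
                  used_fraction=None):
    """Run the corrective action only when disk usage crosses the threshold.

    `used_fraction` can be supplied directly (for example in a test);
    otherwise it is measured from the filesystem.
    """
    used = disk_used_fraction(directory) if used_fraction is None else used_fraction
    if used < threshold:
        return []
    return delete_old_files(directory, max_age_seconds)
```

The point of the sketch is the tradeoff described above: the action can only run because there is code sitting on the machine itself, which an agentless collector does not have.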
There is a third service we call Performance Services, which is a tool that does historical reporting and analysis to help you with capacity planning.
Michael Coté: For people who are more familiar with Tivoli, how does GTS fit into this?
Phil Fritz: Well, GTS fits in to this in lots of interesting ways. So for those that may not know, Global Technology Services is what we refer to when we say GTS, and they offer a variety of capabilities in the service management space. So they do consulting for you, so if you want to start your service management practice these are experts that can come in and help you identify which processes to automate and, obviously, we’d love it if they use Tivoli to automate a lot of those processes.
But they also provide managed services, so that if you wanted IBM to take care of the infrastructure management of your IT we’ve got managed services to do that, to manage your servers, your storage, a lot of different pieces of your infrastructure.
Michael Coté: What exactly are the different things that the data collectors interact with in the different sources – what are the things that get monitored?
Phil Fritz: We have extensive monitoring for operating systems at the operating system level. That’s traditionally where a lot of our customers start: they’ve got Windows, a variety of flavors of Windows, a variety of flavors of Linux, a variety of flavors of AIX, including things like HP-UX and other Unix types. So that’s kind of the basic starting point as well as some simple network devices.
Michael Coté: Great.
Phil Fritz: — be able to ping things, capture SNMP traffic, things like that, so the basic up and down. We also have monitoring available for a lot of Microsoft applications as well. So things like SQL Server, Microsoft Exchange, BizTalk, what we see in mid-range business is just a variety of applications in that area and also some middleware. So web servers, JBoss things like that, thing of that nature, WebSphere are obviously, the major players there.
Then finally we also have some packaged applications like SAP. So if you are running some mid-ranged servers with SAP those are things that we’ve got available.
Michael Coté: Right. Tivoli is usually thought of as a big enterprise thing that you’ve got teams of people running it and taking care of it, and yet you guys are targeting this at the mid-market?
Phil Fritz: Right.
Michael Coté: What are other aspects that you’ve changed around a little bit as far as whether it’s sort of the pricing that you have or just various other aspects that fit for the mid-market?
Phil Fritz: So first of all, the pricing is the biggest part of it; the model as well as the price point. So as I mentioned, we offer a fixed configuration, and part of the service delivery is structured in a way to make the price points and the model, the monthly model, work – that’s the first major element of what we’ve done.
The other thing that we’ve done is we’ve put a lot of investment around self-install scripts for the agents themselves. So when the customer goes to the portal and downloads the agents there is actually a script that helps them deploy that agent without them really having to know a lot of the details, and it configures that agent to talk to their private instance of the management server.
So we are really reducing the amount of expertise required to get the system up and running, and on top of that there are also a lot of self-enablement capabilities, a lot of knowledge base that GTS has developed over time to say, here are the best practices for deploying these pieces, and we’re making those all available. As well as all the other regular ITM documentation, all of the regular IBM support documentation, all of the tips and tricks for using ITM that we give to our regular enterprise customers are also available to customers of the services.
The optimization we’ve done for this environment is really that each instance can support up to about 500 monitored resources, and that is, for now, the target we’ve got in terms of the conditions of our service.
And that seems to be a good number in terms of the number of managed things that we are really targeting in this space.
Michael Coté: So you are saying that GTS kind of contacted Tivoli as wanting to have this form factor, if you will, the sort of packaged way, this way of selling to the mid-market and I’m curious then since it seems like it was driven by demand that was sort of existing – like how have you seen people using Tivoli Live, what are the different types of organizations that have been using it?
Phil Fritz: Well, we get a pretty wide mix of customers across a variety of industry sectors. I would say some of the early customers we are talking to on this are in the retail, telco, and media spaces. There is also a lot of government interest, especially outside the United States, for the service.
Typically the situations are again sometimes geographically dispersed organizations that may have lots of branch offices or retail offices or manufacturing facilities, government agencies that have responsibility for a certain region or maybe across a country.
Sometimes it’s large enterprises, but at a departmental level: they may not want an enterprise-wide deployment, but they will have a particular area or two where the service would be an alternative to deploying this on their own and having the staff. Then in those places, from our GTS partners as well as our Tivoli business partners, there are lots of opportunities to layer on value-add services on top.
So, for example, a lot of Tivoli business partners are very excited about the service, because it allows them to extend the value proposition that they have today, like best practices for deploying agents or configuring, or reports that they want to add, or workspaces that they want to tailor, all the way up to even custom agent development. All those kinds of activities are all valid, all the stuff they are doing for our software.
Michael Coté: Right, so you can use their agent-builder stuff too.
Phil Fritz: All that stuff is plugged in, and then from a GTS perspective there is a set of managed services they can layer on top of this as well. So for example for those customers that want off shift management, so I will use Tivoli Live and I’ll watch the alerts from 9 to 5 but I want somebody from 6 to 8 to look at that for me, those are all [there], so there are new, interesting service models on top of the base services we are going with at the time.
Michael Coté: In the classic IBM model, as you kind of alluded to, it does give a lot of room for partners and other people to layer things on top of it. So IBM doesn’t have to be the only concerned party involved in that service contract – this service that you are providing.
Phil Fritz: Right.
Michael Coté: So just to differentiate it, you guys, Tivoli also have an offering that you can actually run in the cloud all on your own if you want to, on EC2. And can you explain the differences between those two, just so that people don’t get confused?
Phil Fritz: Sure, Amazon has been an important partner to IBM for some time now and we’ve made a variety of our software available through Amazon.
So what we’ve also done is we’ve taken a subset of what you see in Tivoli Live, which is our base operating system monitoring capability – just base ITM, the agentless as well as agent-based technologies for Linux and Windows – and made them available at an hourly rate from Amazon.
So the difference between the two is really if you want to use the Tivoli Monitoring capabilities on an hourly basis on Amazon, maybe you have some workload you are running on Amazon and you want to provide some similar monitoring to that; we’ve made that available. We find a lot of customers say, for example, “Hey! This is a great way of getting instantaneous Tivoli Live,” and came back [with a] credit card out and that can start capturing monitoring information of my EC2 resources.
But if you want applications, databases, and things of that nature for now that’s the differentiator, Tivoli Live offers more capabilities from a monitoring perspective as well as an IBM team behind it, so it’s an IBM datacenter, and there is security you know other concerns, not to say Amazon is not secure or anything like that, but different ways of consuming it.
So we are looking at Software as a Service, Cloud, and Amazon and other varieties or appliances as well. Look, there are a lot of ways to consume software now and we want to make our software available on as many of those delivery vehicles as possible, both to satisfy demand as well as to provide choices. And look, this is a fast-moving market, there are lots of things coming up, and we want to make sure we are on top of them. We want to make sure we’ve got our experience; we’re touching all these different models so that we can better deliver capabilities to our customers.
Michael Coté: What do you guys see happening in the next twelve months or so? What’s kind of the future of the Tivoli Live stuff, what are you looking at?
Phil Fritz: Well, it’s not going to be just for monitoring, so there’s a pretty broad set of service management portfolio we’d like to roll out on this. So we haven’t closed any plans yet, but you know absolutely we are looking at other areas of our portfolio like our service desk, for example, adding additional monitoring pieces, a little bit more robustness on some networking pieces as well, asset management, and management things like that. It’s going to be – we’re watching the space very closely, we’re responding to customer demand in this area, and so stay tuned; I think around Pulse time we may be hearing a little bit more.
Michael Coté: So sometime near the end of February?
Phil Fritz: End of February, that’s right at our annual conference, known as Pulse.
Michael Coté: Given all of this that we’ve talked about, what’s sort of like the process for finding out more on getting it installed or getting set up with it and just getting Tivoli Live essentially?
Phil Fritz: Well, ultimately it boils down to – we’ve got our [info] available on the web, but contacting your local IBM rep for information. But we’ll be making, obviously, web-based resources available, either on how to order it or, eventually, in the future, for ordering it directly from the web.
Michael Coté: Right, and like you were saying is it month to month terms or are there different contract terms that you sign up for?
Phil Fritz: Well, the billing and the rates are expressed in monthly terms and we’ve got a minimum of 90 days to sign up, and typically what we find is customers are more comfortable with yearly contracts of one or two years. And here’s the thing, and I know you’ve talked about this before in your podcast, the whole CapEx/OpEx discussion, which is true: what we found is that what our customers prefer is a well-known, repeatable OpEx number.
Michael Coté: “Repeatable Ex.”
Phil Fritz: Right, so the budgeting can happen in a more controlled fashion. So that’s kind of the model we’ve seen: we’ve expressed our terms in monthly rates, but one-, two-, or two-to-three-year contract terms are typically where our customers like to transact, so that way they have their repeatable OpEx outlays as it were.
Michael Coté: Well, great well, I appreciate you taking all this time to go over Tivoli Live with us. It’s always – having worked on SaaS stuff in this space before – I always have a soft spot for someone who is trying it out so it was fun talking about it.
Phil Fritz: Well, it was fun to talk about it, so thanks very much for the opportunity.
Demo
Transcript
Michael Coté: Well, now we are going to check out a demo of Tivoli Live which should be pretty fun.
Phil Fritz: What you are seeing here is the actual Tivoli Live Monitoring Service where a customer that has registered can get started. So from left to right what you’ll see here is a link to the actual Tivoli Monitoring Console, a link to the historical reporting service that we were talking about, a notification system to be able to receive email alerts based off of the alerts we’ve defined in the Tivoli software, as well as a facility for downloading agents. So where typically a customer would first start is, they would go to the screen and download the install packages.
So for example, let’s click on Linux here and what you’ll see here is a variety of catalog elements to download, those Linux monitoring agents, with the self-install code we were talking about; so again, a pretty easy start to the process. Once those agents have been downloaded, they are configured to talk to the console. It’s a fairly simple procedure to log in; as you can see I am just going over the web. As part of the on-boarding process, the customer will get a set of user IDs and passwords and then they can go ahead and modify and change those, so that their administrators have access to this.
Michael Coté: So they started out by going to just a URL that you guys provide, essentially login in their web browser over the Internet.
Phil Fritz: Exactly! They are just using a URL over the web and logging in and voila! What you are getting is a web browser rendition of the Tivoli Enterprise Console. For those folks that are familiar with it, the Tivoli Enterprise Console is really divided into three main areas: you have your tree view of your resources, and we are at the top of the tree view; you see here the event console, all of the events that we have going on; and, if you are a particular IT operator, your acknowledged events, events you’ve picked up that you are going to manage.
Of course, red is always a warning statement for us; it says something is wrong, and what you see here are all of the critical events that have come up over the last few hours, and these links actually, once you click on them, will drill down into the actual resource, the actual server, the actual condition that is causing the event.
Michael Coté: You can see that tree expanded there and everything, which sort of exposes the path down to the problem.
Phil Fritz: Exactly! It allows you to drill down precisely to the root of the problem. What you’ll see is obviously the initial value that triggered the condition, and what it is currently. That’s important to know because sometimes the condition happens and goes away; when you go check it out, oh, it’s not there anymore. While this is blank, what we typically have here is the knowledge base for the particular user. So this is where the person that defined the monitoring policy can say, hey, this error happens when we see this application behave this way or we see this operating environment behave this way, so that the non-expert, the non-subject matter expert that does the monitoring, can actually understand the event.
Michael Coté: And then you can use it for some sort of datacenter folklore I guess that these things cause this problem and this is actually okay or it’s not okay, or whatever the custom information associated with some of them is.
Phil Fritz: That’s right. It’s kind of the electronic sticky pad. Then you can take an action, right. And here are a variety of pre-configured take-action commands like killing a process or breaking a message, and these can be executed manually or you can tell the system to automatically take these corrective actions as well.
Michael Coté: Alright. So I guess the console must sort of batch up a job that gets in behind the firewall to the data collector or action executor or whatnot to do it for you.
Phil Fritz: You’ve got it; you must have done this before.
Again, we can expand these views, and here, for example, we are using the agent-based information to look at a Linux operating system environment. There is a variety of reports and what we call workspaces; each of these windows is a separate workspace. There is a variety of tools that are available to modify these views; we’ve got pie charts and graphs and speedometers and whatever visual representation you’d like to represent the information.
These are all, again, lots of data we capture around the CPU information, disk usage, network, process, system information, even users to a certain level, depending on the facilities that the operating system makes available for monitoring. So again, as we talked about, JMX and SNMP data is all captured here.
Michael Coté: You do have a separate reporting module, but when you are in the console, how much historical information do you get access to, when you are looking at these charts and things like that?
Phil Fritz: Well, by default, typically, there is a polling interval of about five minutes and there’s typically about a day’s worth of information on the console itself. There is a pruning policy we do put in place because of the space limitations, but if you want to have that data historically for other purposes, we can make that available to you offsite, but yeah, typically, what you want to be able to capture in the console is sort of the immediate problems and threats, and then there’s data that’s kept for longer periods of time. Again, it all depends on what you are collecting, how often you are collecting it, so there’s not a real quick answer but yeah –
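The defaults Phil mentions make for simple arithmetic: one sample every five minutes over one day is 24 × 60 / 5 = 288 samples per metric. Below is a small Python sketch of that calculation and of a time-based pruning rule. It assumes exactly five minutes and exactly one day (Phil says "about" for both), and the names `samples_retained` and `prune` are hypothetical, not part of any Tivoli interface.

```python
from datetime import datetime, timedelta

# Approximate defaults from the interview, treated here as exact values.
POLL_INTERVAL = timedelta(minutes=5)
RETENTION = timedelta(days=1)


def samples_retained(poll_interval=POLL_INTERVAL, retention=RETENTION):
    """Number of polling intervals that fit inside the retention window."""
    return retention // poll_interval


def prune(samples, now, retention=RETENTION):
    """Keep only (timestamp, value) samples no older than the retention window."""
    cutoff = now - retention
    return [(ts, value) for ts, value in samples if ts >= cutoff]
```

With different settings the numbers change in the obvious way: a one-minute interval with the same one-day window would keep 1,440 samples per metric, which is why the answer "depends on what you are collecting and how often."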
Michael Coté: And to the point of what we were talking about earlier, I mean there’s plenty of data warehousing and other stuff; because this is just normal Tivoli software, it’ll hook into those other systems as well, if that’s something that you need.
Phil Fritz: Absolutely! We also have facilities to be able to extract data for running other types of historical reports if you have your own local tool you want to use.
Disclosure: IBM is a client, and sponsored the above videos. Also, over the years working with him, I’ve gotten to be friends with Phil.