RedMonk

Nobody gives you free money – Back of the Envelope #004

The Hole, front

While I was in Houston for two weeks, Ed (@egoodwintx) and I met up in person, right downtown. With a bit of urban sound in the background, we discuss the recent highlights of technology earnings calls (things are looking up – we highlight Intel and Apple), the state of the IPO market for tech startups, and the deal with Facebook’s going public without going public.

In addition to clicking play above, you can download the episode directly or subscribe to the Back of the Envelope podcast feed (in iTunes or wherever) to have this episode automatically downloaded for your listening pleasure.

…also, we talk about the business of Houston. Since we were in person, I don’t have extensive show notes, but here’s our lovely disclaimer:

Since Ed works in a highly regulated job as a portfolio manager, his lawyers require this exciting disclaimer, which you’ll get to hear my friend Charles Lowell read at the beginning of the episode:

This podcast is for entertainment purposes only. The content and opinions expressed in this podcast are merely the opinions and observations of Mr. Goodwin and Mr. Cote. Michael Cote is a technology analyst who may have conflicts of interest concerning the companies mentioned. Ed Goodwin is an investment adviser to various funds that may have a financial interest in any companies mentioned. This podcast should not be construed as investment advice of any kind. Both Mr. Goodwin and Mr. Cote may be buying or selling any of the securities mentioned at any time; either for themselves or on behalf of clients of theirs. The content herein is intended solely for entertainment purposes only. This podcast is not a solicitation of business; all inquiries will be ignored.

Seriously, don’t rely on this podcast for investment advice. Ever.

Now sit back and enjoy the show.

Disclosure: see the list of RedMonk clients for clients mentioned.

Categories: Back of the Envelope.

RedMonk Links – New Feed

I put together a quick feed that wraps up all of our links feeds into one, RedMonk Links. If you’re like me, you like subscribing to the bookmarks of various people and organizations – you know, stuff people put in Pinboard.in and Delicious. For example, the GigaOm Pro folks “curate” some good ones daily.

Each of us RedMonks – James, Stephen, Tom, and I – bookmark stuff all the time. It’s mostly technology related, but sometimes it’s baseball, recipes, and other personal stuff. Anyhow, if links are your thing (and you’re sick of waiting for us to post round-ups of them), subscribe to the RedMonk Links feed – once Yahoo! Pipes gets ’em, you’ll see ’em!

(I didn’t splice in our own content, there’s plenty of feeds for all that running around.)

Categories: Links, RedMonk.

Links for April 21st through April 25th

First visit to Barton Springs this season

Summer is virtually here in Austin. It’s time to head down to Barton Springs at least every weekend, if not more! We went there during this past Easter weekend, and it was wonderful.

Disclosure: see the RedMonk client list for clients mentioned.

Categories: Links.

3 Tips on Scaling Agile Development

The Cone of Silence

I’m often asked for tips on “scaling up Agile.” This usually means, “we got Agile working at our team level, but as we’ve tried to expand it to the rest of our GIANT COMPANY, we’ve encountered all sorts of problems. People don’t see the light! How can we make them BELIEVERS?!?!”…or something a little more level-headed than that.

At some point, adopting Agile development becomes an organizational change management exercise that has little to do with Agile itself. Last week at the IBM Impact Unconference, I was part of the opening “unpanel” where someone had asked for help in adopting Agile in this context…or, at least, that’s the question I chose to answer. Here’s what I said:

  1. You need an executive sponsor, get one – doing Agile at the team level is fine, and you can take a bottom-up approach. You might even be able to expand out to QA teams (who are traditionally separate from development teams) and others, but to really spread to the rest of the organization, a high-level executive must bless and then force the effort on the organization. Here, the hierarchy of the organization works in your favor: if the big boss says to do it, people tend to do it. Executive sponsorship always gives you permission to screw up as you learn. It reduces your personal risk (and that of your team). And, let’s face it, the first few iterations of an Agile team are rough: so much of Agile, initially, is about exposing the weaknesses and poor development practices organizations have (mainly: promising too much in each release, and not fully working as a team around one product) – so there’s going to be some failure at first. The executive is your blame-game safety net here: a good Agile executive knows that the first few cycles of doing Agile are all about making people realize they’ve been doing it wrong by exposing the poor results of their practices…and then learning from that, not punishing them.
  2. Learn how to report to Management, and do it – as “lower-level” technical people, you probably couldn’t give a rat’s ass about all that “reporting” management does up the chain. The fact of the matter is, all those damned spreadsheets and project charts are a huge part of their job. Management has to somehow measure and report on the progress beneath them so that they and their management can make decisions about what to keep doing, what to stop doing, new things to try, and plans to start making for good or bad futures. “Reports” are practically the only input management has to their job. In your organization, there’s probably a set of metrics (or KPIs) that your boss, your boss’s boss, etc. are looking for. These metrics are used to gauge “how things are going,” but are also used by even higher management to judge how that boss is doing (and, thus, you). As someone trying to push Agile, you should learn what these metrics are and learn how to wangle all the project management data you have into them. Treat that reporting as a “story” or part of your iteration – track how much time you spend on it and the benefit it gives you, make it part of your iteration retrospectives, and so on. To get all meta: reporting on the project is a key requirement of the project. (Thus, if it’s taking too long, you can track that, report to management, and ask them how they’d like to weigh that time against other features.) Recently, the topic of “technical debt” has been an excellent new metric to start tracking. Also, here, Israel Gat‘s The Concise Executive Guide to Agile provides loads of good input on “reporting.”
  3. Isolate the resistors, and hope they drop off – no matter how well things are going, there will be people who simply don’t want to go along with The New Thing. You’ve tried to reason with them and get them to see how great Agile is, but they still just don’t want to do it. When I was part of the Agile-at-Scale efforts at BMC (under Israel Gat), I was shocked that the most resistant people were fellow developers. They were so used to the way things had been and the fiefdoms they’d carved out that the fluid nature of Scrum caused them all sorts of problems. In general, you can get a sense of whether someone is a “team player” or resistant. While the Machiavellian side of you would like to see these people fired, that’s often not as easy as you’d think in large organizations. In these cases, you should isolate these people (and their work) from the rest of the project. Essentially, you want to make sure The Stagnant Resistors don’t infect your timeline, code quality, and morale. As I said at the Impact Unconference, it’s like what you do with a weird growth you find on a dog: tie a silk string around it, cutting off the circulation, and hope it just falls off eventually.

It’s getting a bit long in the tooth now, but my “smells of Agile” paper from several years back can be helpful for gauging how Agile your organization is: a sniff test, if you will.

Disclosure: IBM, who put on the Impact Unconference mentioned above, is a client.

Categories: Agile.

Data center jobs – getting slightly better

The job situation for IT staff seems to be getting better, if only by a hair. That’s the takeaway from one of the recent SearchDataCenter.com Advisory Board questions I answered, along with others on the board. Most of the folks, including myself, also gave advice on “standing out,” or differentiating yourself as we’d say in marketing, like this advice from Robert Crawford:

I advise college students to avoid graduating as a commodity Java programmer. Offshore companies have plenty of those. Instead, develop a specialty, such as database administration, networking or security. If you don’t mind hanging around with old people, you might develop some mainframe skills, which are rapidly becoming rare as the baby boomers retire.

Part of my answer goes over carving out a “fief” for yourself in IT:

A new fiefdom has been created with virtualization. It exists right alongside database, security, server, application, Windows, Linux and Unix. If you can wangle yourself in there, you have a lot of empty space on the organization chart to start winning points and advancing. There’s all the usual ways of advancing of course, but I’m also seeing a new path emerge — managing a company’s [Software as a Service] (SaaS) offerings. For example, some companies provide SaaS versions of their software, which entails a lot of operations assistance to deliver and maintain. Being part of your organization’s mobile offerings is, more or less, another face of that. IT, as always, is ever-changing, and if you get typecast into one silo, once that silo is “optimized,” you’re in trouble. Always seek to learn new things that your organization can use to make new money, or at least be an active part of saving money, not a consequence of cutting costs.

There’s lots of good input on the current job market for IT folks (in the US, at least) and advice on getting and securing IT jobs in the piece – check it out!

Also, there are two other Advisory Board pieces out you might like: Asset management tools in the modern data center and Making time for IT planning.

Categories: Systems Management.

Links for April 16th through April 20th

Disclosure: see the RedMonk client list for clients mentioned.

Categories: Links.

For IT, proper energy management has more benefits than just saving power

In the third installment of my Spiceworks Data Commentary series, I add some color and advice to Spiceworks’ recent post on how much energy the Spiceworks community is estimated to have saved by using an Intel Power Management plugin.

They put together a nice infographic going over the savings:

In my piece, I gather up several anecdotes of hunting down energy waste in data centers and then give tips on implementing a power saving program in IT, including an eye towards some side-benefits. Check it out, and tell me how energy waste hunting has been going in your shop.

Save $200/year while sleeping

I also spoke with Ernest Mueller (he of dev/ops in action fame) about power management. He told me a nice anecdote about one co-worker tweaking their desktop’s sleep mode to really hunt down waste:

Since this is Green Week, actually I was just reading an internal newsletter article from someone who disabled NightWatchman but took it on himself to use a Kill-A-Watt meter to measure power usage and experiment with setting various sleep modes himself, and estimates a $200/year power cost savings using his settings.

Which reminded me of this GreenMonk interview Tom did around 1E’s NightWatchman:

Disclosure: Spiceworks is a client. Sentilla, who’s quoted in the Spiceworks piece, is a client as well.

Categories: Systems Management.

Search as middle-ware at att.com, with Shantanu Deo – make all #017

As I’ve mentioned before, I’ve been seeing search-as-middleware cropping up recently. The idea is that search technologies like Solr make for good middleware layers, especially when you have a large pool of unstructured data. Shantanu Deo at AT&T joins me for this episode of make all to discuss one such implementation for att.com’s backend. It’s an interesting discussion of a new way to use search and, also, gives a preview of his upcoming talk at the Lucene Revolution conference next month.
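To make the idea concrete, here’s a minimal sketch of what a search-as-middleware layer can look like: the application asks a thin service for catalog content, and the service translates that request into a Solr query URL. The core name, field names, and facets below are hypothetical – they’re for illustration, not taken from the att.com implementation.

```python
from urllib.parse import urlencode

# Hypothetical Solr core; a real deployment would point at its own cluster.
SOLR_BASE = "http://localhost:8983/solr/catalog/select"

def catalog_query(text, max_price=None, facets=("manufacturer", "has_camera")):
    """Translate an application-level request into a Solr select URL."""
    params = [
        ("q", text or "*:*"),   # free-text query, or match-all
        ("wt", "json"),         # ask Solr for a JSON response
        ("facet", "true"),
    ]
    for field in facets:
        params.append(("facet.field", field))
    if max_price is not None:
        # a price slider in the UI becomes a filter query on the index
        params.append(("fq", f"price:[0 TO {max_price}]"))
    return SOLR_BASE + "?" + urlencode(params)

print(catalog_query("smartphone", max_price=200))
```

The point of the pattern is that the middleware owns this translation – the application never speaks Solr directly.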

Download the episode directly right here, subscribe to the feed in iTunes or other podcatcher to have episodes downloaded automatically, or just click play below to listen to it right here:

Show Notes

  • What does the AT&T CMS do?
  • The evolution of the CMS into more of a portal/application platform
  • Where does this show up? AT&T home page – tiles served up there, promotional content.
  • How did you draw up what you needed for search? We go over the requirements.
  • How does Solr fit in with everything? Sounds like a service…or is it part of the application stack?
  • What kind of hardware stack does it run on? A lot less than you’d think.
  • How does the indexing/crawling work? Searching – using faceted searches.
  • What are the facets y’all are using?
  • The other use: global search. Customizing the CMS to the user, e.g., only showing you services and products available in your area, like Uverse or a MicroCell.
  • Can we do personalized content as a search problem? Assigning URLs to different user groups… using search as a filter then to select only the relevant content for a user. Search as Middleware.
  • Search as a back-end for API-driven, composite applications.
  • Lucene Revolution Talk May 25th to 26th – http://lucenerevolution.org/

Transcript

Michael Coté: Hello everybody! It’s another edition of ‘make all’, the podcast about fun and interesting things with those damn computers. As always, this is your host Michael Coté available at peopleoverprocess.com. And for this episode, we are sort of getting back into the area of search if you will, and all of the fun stuff that’s underneath that. To talk about that, we have someone who is going to be at the upcoming Lucene Revolution Conference giving a talk about how he and his organization have been using Solr. Do you want to introduce yourself real quick?

Shantanu Deo: Hi! My name is Shantanu Deo. I work for AT&T and I am the manager for their Content Management System Group. I have been here for a little over two years in various capacities. Earlier I used to be at release management team on that side. Also I did some work with search and before that I was at Amazon, for the better part of years or thereabouts. And before that I have been at various companies, mostly in the electronic manufacturing industry on the software side of that.

Michael Coté: Well, what is — I mean since you mentioned it. What is electronic manufacturing? Is that literally manufacturing various gadgets and gizmos and things like that?

Shantanu Deo: Yeah, basically, I worked for Panasonic and couple of other Japanese companies which make the machines that assemble the PCBs. So it’s like automated pick and place and all the automation aspects associated with —

Michael Coté: You worked on the robots?

Shantanu Deo: Yeah, exactly, exactly.

Michael Coté: Well I think this is the first person we have talked to who was a robot creator, so that’s exciting, or part of the process at least.

I think to start with – I think everyone listening to this knows what a CMS system is, right? It stands for content management system obviously, and a lot of how it springs up nowadays is managing the ton of content that’s involved in public websites, and for internal websites as well. But could you give us an overview of in the case of the work — the CMS stuff that you do at AT&T, like what is the CMS that you are speaking of?

Shantanu Deo: Yeah, sure. So in a large organization, there is always a need to change content on a daily, hourly, whatever, much more frequently basis. So what happens is you obviously have some sort of application behind it that’s running your website but typically that involves coding or whatever your technology choice is like JSP or ASP or whatever development. But just the pure content of it can be abstracted out and managed separately, and that’s essentially what the content management systems typically do, although nowadays they are getting into this whole application space as well.

In fact, we are in the middle of transitioning from an older system to a newer system, and so when we did our due diligence of looking around the market space, it seems like a lot of the content management systems are gravitating towards that whole space, where they occupy more central — almost within the app layer type of space. Before, at least, the ones that I have been familiar with were more like passive ones where you generate the content and it’s all done and your application can refer to it.

Michael Coté: Yeah, I have noticed recently that — as you were saying, you were starting to say before I interrupted you there. It does seem like — sort of there is classic CMS which is all about – it’s all about pushing the content out there and may be even hosting the content but also about that work flow of authorization that you are talking about.

Shantanu Deo: Oh, absolutely, yeah.

Michael Coté: And then what I have noticed a lot recently is there is almost this kind of convergence of what we used to call portals and to some extent sort of app servers and then CMS systems, they seem to be kind of trying to plug into each other to somewhat become the same thing, like having mini applications or widgets or portlets, if you want to use that old (Voice Overlap).

Shantanu Deo: Yeah, yeah, so you have like these marketing campaigns or managing all of that versus integrating with the catalog behind the scenes to make sure that — like to provide more of a WYSIWYG kind of interface to the content authors. So like you can see, make the change and see it live almost or in context editing if you will. So that you see all the other stuff that you are not actually handling but what your users will see. So all of that seems to be where the industry seems to be heading.

Michael Coté: Yeah, you know, it seems like if anything, Wikipedia has probably trained people to expect that like — why can’t I just click on this page to edit it? Like it shouldn’t be so onerous to like edit content on a webpage.

Shantanu Deo: Exactly, yeah.

Michael Coté: And you know granted in, a lot of what makes anything enterprise, are those workflows of approval and stuff that you don’t just have —

Shantanu Deo: Oh, yeah.

Michael Coté: People really doing things –

Shantanu Deo: Correct, definitely.

Michael Coté: But still that’s one of the things that I’ve enjoyed seeing in the CMS systems of late is really — most all enterprise systems including CMS have kind of nailed the enterprise stuff like being compliant and being performant and having workflows and everything and if they’ve kind of solved those problems they are really like that they’re refocusing on, I don’t know, usability. Like just —

Shantanu Deo: Yeah.

Michael Coté: — making it easier for users to use essentially.

Shantanu Deo: Yeah, absolutely. I mean, when we had the demos, I remember, all the people who’ve been touching the existing CMS are pretty much blown away with some of the vendors when they showed the UI. So yeah, lot of people have put in a lot of work towards making that aspect of the system very user-friendly.

Michael Coté: And so speaking of user friendliness to try to make a masterful segue here, one of the things that is really vital to all UIs nowadays or pretty much any application is Search and you know, the issue with Search, I mean we were just talking about a convergence of three different platforms plus the applications, I mean, it’s like all, all large complex software, eventually your software stack becomes everything that you have. And like, so you know, I am curious first to hear, like, when you were thinking about how you were going to apply Search to all of this… Well, first off, if people wanted to see the CMS in action like, where on the Web do they go to see it; where does it surface if you will to work with?

Shantanu Deo: Well, actually you know, if you go to the att.com homepage, let’s say for example, not to be too obvious, but you know that’s where I worked, so you see a lot of the tiles, essentially you know, promotional content show up. That whole page is effectively made up of like different pieces of content that somehow comes together you know during application rendering time.

So the pieces might be actually developed specifically to target different regions or different user types and depending upon whether you’re authenticated or non-authenticated or depending upon, some other characteristics of the user; that’s what you would end up seeing.

Michael Coté: Yeah, like I am always getting asked if I want to go to paperless billing. Maybe there are all sorts of – I don’t know if that’s part of the CMS system but it is.

Shantanu Deo: Yeah.

Michael Coté: I think a lot of listeners probably have an iPhone or have U-verse or somehow have come across AT&T before. So they have these consumer services. So yeah, so I mean given that the breadth that you’re going over, when you guys were thinking about doing Search, right, — what were the criteria that you drew up? Like what did you want to accomplish with Search?

Shantanu Deo: So I mean, I had couple of different touch points for Search, not specifically to CMS. Although there is a Search aspect to the CMS that we are currently implementing and I can you know go over that briefly later, but when you asked me the question specifically, the two touch points I mentioned earlier were actually – the first one I had was basically when I was in my earlier position also at AT&T, where I just happened to be part of the process where we were integrating, I think it was some other vendor who would supply us with some targeted content based on the user, user history, browsing history and what have you —

Michael Coté: Oh, all right.

Shantanu Deo: — that’s when I came across the catalog feed that we were providing and I just kind of thought of putting together like kind of an under the table project if you will to expose that catalog in a different sort of way than we were currently doing on our side. So what I thought was how best to present that with some more user-friendly UI aspects that we were not presently at that time using.

I came across jQuery and stuff and then I realized that this behind this needed to be some sort of an engine that’ll service all of that. At that time I had some experience with another Search vendor, we were working with it as part of my daily duties. But somehow it seemed like it had a lot of need for lot of maintenance or attention, if you will.

Michael Coté: Care and feeding, I think they call it.

Shantanu Deo: Yeah, yeah. So I was kind of looking for alternatives which would not need that much attention, and also be easy to set up and get going. That’s when I came across the Apache Solr, and I thought of giving it a try and it worked just beautifully. So we basically had the Catalog Search application up and going, just kind of me and a colleague of mine by name of Rama, so we just kind of put it together; and I must give credit to my manager at that point who allowed me the space to provide that and then various people in the organization kind of recognized the value of that and decided to support that.

So that came to fruition and actually went live. And it’s been since incorporated into the main ATT Search component. So when you see or if you go to att.com and search some aspects of that like the sliders and those are all the faceted search artifacts that you see and the results you see from that.

Michael Coté: So it sounds like you guys, if I can pull apart some of the stuff you said, there’s at least one way that you’re using this Search and it’s just traditional search like you go to a website –

Shantanu Deo: Yeah, yeah.

Michael Coté: And then you’re also mentioning the Catalog Search –

Shantanu Deo: The Catalog Search was just basically fronting, ingesting the AT&T Catalog and –

Michael Coté: And then that would be all, is that catalog just all the stuff AT&T has to sell or is it just a content?

Shantanu Deo: Yeah.

Michael Coté: Okay.

Shantanu Deo: Yeah, that’s exactly that. And then fronting that with some other more UI aspects to make it pretty.

Michael Coté: So since you had mentioned previously that you have been using a sort of a search substrate, if you will, that needed a lot of care and feeding and how would you rate the care and feeding that Solr needs, not necessarily versus that but just in general like what’s the day-to-day worrying and fussing with it?

Shantanu Deo: Actually there’s none. So as far as I know in fact, I think this has been up over a year, or more and I’ve not heard of anything go wrong with it at all. So in fact the other day – it was funny – there was some other related issue and people had just forgotten that it existed. So that was kind of a very good endorsement of Solr in my mind because it was so quiet that it just worked basically, so, yeah.

Michael Coté: Would you say that it’s sort of — so the other thing I am curious to hear about because there’s a lot of people who use Solr and have exactly this problem that you have, right? So I wonder if you could tell us like how, it sounds like from the fact that people forgot that it existed which in this light is very positive that —

Shantanu Deo: Exactly, yeah.

Michael Coté: — it’s sort of set up as a service if you will, like instead of being embedded in an application. So I am curious how —

Shantanu Deo: I think that’s how we use it but I think it can probably be used in other incarnations which I may not be aware of. But that’s how we use it like, we have a separate set of boxes that just run that in a dedicated instance, basically if you will.

Michael Coté: Do you have an impressive cluster running it, or what kind of system did you have to build for everything?

Shantanu Deo: Actually you would be surprised, I mean of course, we have fairly decent crossbar for the applications and all that but actually I was surprised as well that we don’t need that many instances of Solr behind the scenes to support all of that traffic. It seems between the caching layer and the apps we have very few instances of Solr supporting all of that search traffic. So I think we have in the range of boxes that you can count on your hand I think to support a much larger application cluster.

Michael Coté: Yeah! I mean it sounds like to support like search over global AT&T stuff. So that’s like, that’s impressive. I guess it’s impressive that the servers that you have are not impressive. You don’t have like some mega cluster running everything for you.

Shantanu Deo: No, we definitely don’t need that, yeah.

Michael Coté: So how did you — I am kind of purposely putting this in a naive way, but if you have search you’re basically crawling things and updating what’s in the index if you will and then the other side of search is an actual user comes and wants to search something, and so I wonder if you can walk us through those two stages like what’s kind of like the crawling or the indexing or the getting content in there that you have?

Shantanu Deo: Yeah, so we have like I think a periodic feed that comes from the catalog as and when that gets updated, and we just do kind of incremental indexing of that content, and obviously when we are talking about the Catalog Search aspects, we didn’t even talk about the other instance where we use Solr differently. But in this case, whenever catalog updates take place, we have that content indexed and we have certain faceted searches for one thing that we provide.

So based on how the schema is set up, once that’s been indexed, basically Solr just switches over to the new index, and we have I think a web service layer fronting that, and whenever the user searches, there is an Ajax request that goes to the web service, and behind the scenes the web service makes calls to Solr, and pretties up the information that it returns and the user sees the results of the Ajax response.
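As a sketch of that last step (an assumed shape, not AT&T’s actual code): the web-service layer takes Solr’s raw JSON – where facet counts come back as a flat value/count list – and pretties it up into something the Ajax front end can render directly. The field names here are illustrative.

```python
def prettify(solr_response):
    """Trim Solr's raw JSON response down to what the Ajax caller needs."""
    docs = solr_response["response"]["docs"]
    raw = solr_response.get("facet_counts", {}).get("facet_fields", {})
    # Solr returns facet counts as a flat [value, count, value, count, ...]
    # list per field; fold each into a {value: count} dict.
    facets = {
        field: dict(zip(values[::2], values[1::2]))
        for field, values in raw.items()
    }
    return {
        "results": [{"name": d["name"], "price": d["price"]} for d in docs],
        "facets": facets,
    }

# A hand-made response in Solr's JSON shape, for illustration:
sample = {
    "response": {"docs": [{"name": "Phone A", "price": 199, "sku": "a1"}]},
    "facet_counts": {"facet_fields": {"manufacturer": ["Acme", 3, "Bolt", 1]}},
}
print(prettify(sample))
```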

Michael Coté: When you are building out the searching aspects, I mean was it kind of just the traditional like we are building out the search interface, and we need the UI people to do a UI design, and I mean was it just like the normal stuff you would go through for a public website, nothing out of the ordinary or anything?

Shantanu Deo: Yeah, like I mentioned, that particular project was more like an under the table kind of launch where we just kind of decided to see what we could get out of that like we have sliders for the various parameters like price, so just users needed to search for phones up to a certain price point or up to a certain rate or all these different characteristics of phones, manufacturer or whether or not it had cameras or whether or not it had a keyboard, all those aspects of the phone were exposed as facets of the catalog.

Michael Coté: Yeah, it sounds like it was kind of a funner problem domain than other people might have because who doesn’t like phones. So as a programmer, you probably want to optimize all the ways you can search for stuff, whereas I don’t know if you are doing search over a furniture warehouse company or something it may not be quite as exciting to talk about handles and hinges and wood —

Shantanu Deo: Yeah, right, yeah I mean we had a blast. It wasn’t a very big problem that just a couple of guys couldn’t tackle. So it was pretty easy, but yeah, in Solr I guess to its credit makes it very easy to work with as well.

Michael Coté: So another thing I am trying to think through the searches I have done like on the site, but I am kind of coming up blank. But one thing I am curious about is are there ways that you guys involve an individual customer in the search results or ways that you’re thinking about doing that, because it does seem like — I guess the phrase people would use is personalizing the search.

That’s one of the interesting things that Google and other people do – it doesn’t always work, but they try to pull upon the pile of data that Google has about you and all your relationships and kind of hone your searching, if you will.

Shantanu Deo: Right. So I think since I have moved from that theme, I think people have taken that a little step further. I think they have provided additional capabilities, like you are predicting, what is it called, I forget —

Michael Coté: Like as you type it’s kind of searching?

Shantanu Deo: Yeah, exactly, yeah, that one. So that functionality has been added and a couple of other things they are doing. I don’t think it’s currently doing personalized search, but they do other things like substituting, like if you have searched for this term, then you probably meant these other terms.

Michael Coté: Oh, right, right. Yeah. Well, I mean, there is also the question of like what would you really personalize? The only wacky cooked up example I can think is, if you knew someone regularly called from out of the country, you might want to make sure they have a phone that is international enabled, if you will, or something like that.

Shantanu Deo: Actually, yeah, I mean, you are not too far off base basically, we are definitely targeting some of that with this new CMS system that we are working with to personalize some of that. And actually speaking of personalization, this comes to a nice segue to the other use of Solr that I was talking about.

Michael Coté: Oh good, I was going to ask, yeah.

Shantanu Deo: Yeah, presenting at the conference was actually for global search, and this is where we actually have very deep personalization. So that’s an interesting use of a search, if you will, whether or not it happens to be Solr, it’s kind of tangential.

But the problem we were presented with was that AT&T has so many different user types. We have on the order of several tens of customer types or different characteristics that you would want to personalize a user’s experience on. So that was a project – it’s still ongoing actually, but it will be live shortly – where the business wanted to basically only show a certain aspect of the site, or in this case the browsing options available, the navigation, so to speak. So they wanted to personalize on that.

Michael Coté: So it’s sort of removing things from the site depending on who is searching around?

Shantanu Deo: Yeah, not just removing, but also like showing specific different URLs for that.

Michael Coté: Oh, right, right, right. Yeah. I mean, again, to make a cooked up example or maybe a better one, but like I am a U-verse subscriber, so it would probably be kind of silly to show me subscribing to lesser Internet options.

Shantanu Deo: Yeah, exactly, yeah. Right. So in that case the problem was, let’s say you had 60, 70 different user groups; a user could be in any of those groups at the same time. So you could be a U-verse subscriber, but you could be in a particular region where some other service was not available, that sort of thing. So you ended up with a situation where you couldn’t know beforehand which set of categories the user would fall into.

Michael Coté: I like the sound of that, because I used to be one of those guys who would — like I also have an AT&T MicroCell and everything, and I would go to that little app where you put in your zip code all the time, and try to figure out if I could get in. Without what you are talking about, you go to the site and you get taunted by things you can’t have, which is terrible.

Shantanu Deo: Yeah, exactly. So yeah, in that case then you end up with like a situation where if you are trying to solve this problem in code, you would have to code for all these different various combinations of user groups and then figure out what to show.

So it becomes a management nightmare, not to speak of the complex coding that you have to do. I mean, not just in terms of coding, but imagine if you want to test for all of these things, it becomes more of a headache. So that’s when we kind of thought of modeling this differently, approaching it differently, and looked at it as, can we do it as a search problem?

And that’s where my experience with Solr in the project we talked about earlier came in handy, and we realized when we were putting our heads together that hey, we could model this like a search problem, if we could classify each URL as belonging to a certain set of user groups.

So that’s when we kind of modeled all of the URLs, even though they were hierarchical in some sense, like some URLs are in the top navigation and some of them are related to a given top navigation header at a secondary level, and even have another level where you would have like for each secondary you could have a bunch of other tertiary URLs or what have you associated with that. So there was just a little association, hierarchical association as well.

Michael Coté: So correct me if I am wrong, but it sounds like the way you kind of solved the problem is — so you kind of inserted search as almost a filter, like search for everything —

Shantanu Deo: Exactly, yeah. It’s exactly that. It’s like a filter query really. So you search the whole thing but you filter on certain aspects of the user’s attributes that you know beforehand. So then you end up with — once you have like — because Solr in this case was — we decided to go with it because we already had it working, so it was a very easy transition to make or addition to make.

So we flattened out the whole structure a bit and, using some encoding appropriate for pushing it into Solr as one flat file, we then attached user groups indicating whether you want to show a particular URL or not. We categorized — we added those attributes to each URL — and then searched for all the URLs, but filtered based on whether a user is in a group, or whether you wanted to hide particular URLs if that user was in that group.
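A toy sketch of that modeling, not AT&T's actual schema — the document fields (`url`, `parent`, `allowed_groups`) and the group names are invented for illustration:

```python
# Toy model of the flattened navigation index described above.
# Each document is one URL plus hypothetical attributes: its place in
# the hierarchy and the user groups allowed to see it.
nav_docs = [
    {"url": "/shop/u-verse", "parent": None,
     "allowed_groups": ["uverse_eligible", "existing_customer"]},
    {"url": "/shop/u-verse/upgrade", "parent": "/shop/u-verse",
     "allowed_groups": ["uverse_subscriber"]},
    {"url": "/shop/dsl", "parent": None,
     "allowed_groups": ["dsl_eligible"]},
]

def visible_urls(docs, user_groups):
    """Return the docs whose allowed_groups intersect the user's groups --
    the same semantics a Solr filter query would apply server-side."""
    groups = set(user_groups)
    return [d for d in docs if groups & set(d["allowed_groups"])]
```

The payoff described in the conversation follows from this shape: when marketing invents a new category of users, you re-tag and reindex the documents rather than changing any code.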

Michael Coté: That’s a good example of this. I mean, I always struggle to phrase this, but kind of it’s like using search as middleware, if you will, and it seems like — it’s almost like if you have — everyone is, rightly so, into using sort of APIs for public web services and things, and having RESTful APIs and lowercase web services as I would say.

And it seems like — I have been hearing a lot of stories over the past year or so that one of the effective ways to do the kind of backend for APIs, if you will, is, like you have been describing, to use search for it. And maybe you expose it purely as a search, where a program is the one doing the searching instead of a person. But it’s kind of an interesting, somewhat new way to think about how you go through all that filtering of displaying what someone has permission to see or should see.

Shantanu Deo: Yeah, yeah, exactly, that’s exactly what it ends up being. And so far we have had reasonable success with that, especially I think if you look at it from a maintenance perspective. On an ongoing basis, whenever marketing decides to come up with another category of users, it’s very easy to just add that new user group, reindex, and search, and you don’t need to touch any code at all. Whereas if you had gone about it in the traditional way, you would have to recode, retest, and go through a lot of expensive testing iterations. So this is avoided by using search instead.

Michael Coté: Because really, and I am always bad with terminology, but really what you do, you are really updating the corpus or whatever of stuff. You are updating all the stuff that’s being searched over, not really updating the application. So since you are not touching the code, you don’t really need to change it.

Shantanu Deo: Not touching the code, exactly. So then the upgrades are almost instantaneous.

Michael Coté: So what does – to kind of borrow a term – what does the database look like? I mean, what is that corpus or body of text, like how are you managing that?

Shantanu Deo: Well, it’s basically a flat file listing where all the URLs are just listed one after the other, and the encoding scheme captures the structure within that, the structure that the user actually sees. But as far as Solr is concerned, or search is concerned, it doesn’t care.

Michael Coté: Oh, that makes it super easy then, because you are just dealing with files.

Shantanu Deo: Yeah, one file actually with everything in it. The encoding is then interpreted by — so Solr acts as the filter for making sure that you see only the URLs that you are meant to see and that the business wants you to see. And then the encoding kicks in, like a small thin layer in between, after Solr comes with results, that then builds the hierarchy based on that encoding scheme and you get to see the proper hierarchical structure.
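That thin re-assembly layer could be as small as this sketch, assuming a path-style encoding — invented here, not AT&T's actual scheme — that captures the hierarchy in each flat record:

```python
def build_hierarchy(flat_results):
    """Rebuild a nav tree from flat, Solr-style results whose (hypothetical)
    'path' field encodes the hierarchy, e.g. 'shop/u-verse/upgrade'."""
    tree = {}
    for doc in flat_results:
        node = tree
        # Walk/create one level of the tree per path segment.
        for part in doc["path"].split("/"):
            node = node.setdefault(part, {})
    return tree
```

For example, feeding it records with paths `shop/u-verse` and `shop/u-verse/upgrade` yields the nested structure `{"shop": {"u-verse": {"upgrade": {}}}}`, which the front end can then render as top-level, secondary, and tertiary navigation.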

Michael Coté: Right, right. No, that’s really, as I said, a really interesting use of search there. Like instead of having to update a database or do all this other stuff, you write your system such that Solr, or the search middleware or whatever it may be – Solr in this case – is the thing that does all of that filtering and stuff for you. I mean, I guess at some point in the application there are sort of queries being constructed that are sent to Solr. It’s not like it has magic voodoo that figures out who the user is.

Shantanu Deo: No, it’s not, of course not. I mean, there is some logic that takes place, and that has to take place, that’s the minimal to figure out who you are and all that stuff.

But then it’s just a simple query that — yeah, it’s just a query of, hey, I am a user who belongs to user groups A, B, and C, what do I see? And that’s all pretty simple.
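In Solr terms, that “I belong to groups A, B, and C, what do I see?” question is just a filter query. A hedged sketch of building the request URL — the field name `group`, the core name, and the host are all assumptions, not details from the conversation:

```python
from urllib.parse import urlencode

def build_solr_query(user_groups, core_url="http://localhost:8983/solr/nav"):
    """Build a (hypothetical) Solr select URL that matches every
    navigation document tagged with any of the user's groups."""
    # fq restricts results without affecting relevance scoring.
    fq = "group:(%s)" % " OR ".join(user_groups)
    params = urlencode({"q": "*:*", "fq": fq, "wt": "json", "rows": 100})
    return "%s/select?%s" % (core_url, params)
```

The resulting URL carries `fq=group:(A OR B OR C)`, so Solr returns only documents tagged with at least one of those groups — the application never has to enumerate combinations of groups in code.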

Michael Coté: Yeah, yeah. So I mean, you have a whole presentation at Lucene Revolution about this. I mean, are there some points of the presentation that we haven’t really hit on? We kind of went over a technical overview of what you guys are doing, but I mean, are there any sort of like lessons learned, or any sort of tips or advice that you are going to go over?

Shantanu Deo: Well, I haven’t actually started working, to be honest, on the presentation.

Michael Coté: Well, I will give you a secret and everyone who is listening, I have a presentation that’s due today and I am just now finishing it up. So the ideas have to gestate in your head and you have got to get them perfect and then it’s just a matter of wiring it up.

Shantanu Deo: Yeah. I think — well, one of the things that came up is, in any organization you have a lot of testing infrastructure before stuff hits production. So you have various mappings to consider. The URL that you see in production is not necessarily the URL you would see in your FST or testing environment, versus a dev environment, versus a QA or a staging environment, what have you; different orgs have different naming conventions.

So we had all of these, and to make matters a little bit more interesting, we had other groups within the company also using the fundamental data that we were also using, and then they had their own environments to work with.

So I think one of the struggles we had in trying to encapsulate all of this into one file was that we missed out on these other environments. So what do we do with that?

I mean, there are ways to get around it — this has nothing to do with search, but it’s just a practical reality that we encounter at least. There is no easy way to get all of that other than coming up with some, not exactly fully automated, but some way of replacing your current domain with whatever your test domain has to be, in a consistent way, but doing it for only a subset of URLs, the ones that you are not hosting.
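That domain-swap across environments might look something like this minimal sketch — the domain names, environment map, and the rule for which URLs to leave alone are all invented assumptions, since the conversation doesn't spell them out:

```python
from urllib.parse import urlparse, urlunparse

# Hypothetical environment map and "hosted elsewhere" list.
ENV_DOMAINS = {"prod": "www.example.com", "test": "test.example.com"}
HOSTED_ELSEWHERE = {"cdn.example.com"}  # other groups' URLs; don't rewrite

def rewrite_for_env(url, env):
    """Swap the production host for the target environment's host,
    consistently, but only for URLs this team actually hosts."""
    parts = urlparse(url)
    if parts.netloc == ENV_DOMAINS["prod"] and parts.netloc not in HOSTED_ELSEWHERE:
        parts = parts._replace(netloc=ENV_DOMAINS[env])
    return urlunparse(parts)
```

Keeping the mapping in one table like this is what makes the rewrite "consistent": every environment's flat file is derived from the same production listing rather than maintained by hand.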

Michael Coté: Yeah, yeah. I mean, I think what you are getting at is, in a large company or organization, there is sort of ownership of data and kind of control of access to that data. And then also, like you said, like the production URLs are different than the test URLs and everything. And so there is a fair amount of time you need to spend to make sure you have proper access to the data, and that as you move through the phases of production, the data is actually accurate. You don’t have like the hot live data all the time to play with.

Shantanu Deo: Yeah, exactly, yeah.

Michael Coté: I always find that when you get in these types of situations that’s where it’s easy for complexity to sneak in. So you sort of — you spend a lot of time keeping it simple. To use an old Mark Pilgrim quote, like a lot of effort went into making this effortless.

Shantanu Deo: Yeah. Very good! That’s a nice quote. But yeah, that’s exactly true, like we thought that we have solved the main problem, but that’s just half the battle, like you have to make sure that things actually work in all the other —

Michael Coté: Yeah. And then like you funnily said at the beginning, like at some point if you are successful at simplifying some process, people forget it’s there and they rediscover it, like that’s always a nice sign of success.

Well, great! I think in the time we had we actually got a good overview there. And it was — like I was saying towards the two-thirds in there, like I have been interested in the sort of search as middleware stuff examples that I have come across, and Solr is definitely like the technology I come across most often that fits into there.

So I appreciate you spending all this time to give us that overview of the talk you will be doing.

Shantanu Deo: Yeah. My pleasure!

Michael Coté: And yeah, it’s the Lucene Revolution Conference and it’s in San Francisco on May 25 and 26 if I recall. And obviously you will be there, unless you get one of your robot friends to come and give the presentation for you. All right!

Shantanu Deo: Yeah. I look forward to it.

Michael Coté: Yeah, definitely. Well, thanks again, and thanks to everyone for listening. And we will talk to you next time.

Disclosure: Lucid Imagination sponsored this episode and is a client.

Categories: Enterprise Software, make all, Open Source, Programming.


Sorting out "cloud security"

Found this in the Palazzo elevators

Security is sort of a nonstarter. You can build a cloud to be as secure as you want, doesn’t matter what techniques you use. I just pretty much ignore that. –Randy Bias

Recently, I’ve been trying to hone in on what people mean when they talk about “cloud security.” More specifically, why the lack of cloud security is causing problems with (mostly public) cloud adoption. As with most questions about security, answers are hard to find. They usually revolve around the dreaded, “well, it all depends”…which just begs the question of what those “depends” are.

I don’t have the answers, and despite having helped start one of the early online banking startups, I’m not a security expert. Nonetheless, here are common sentiments and topics I’ve come up against, along with some more nuanced discussion, many of which overlap if you pay close enough attention:

“I don’t know where my data is”

This has more to do with what jurisdiction your data is in (see the next point) than with not knowing where it is. By “definition,” if you can use your data, you know where it is: otherwise you wouldn’t be able to retrieve it. The cloud provider knows precisely where your data is at any given time – and that’s the real problem: a government might use that knowledge to take and then expose your data.

There may be regulation and governance that requires you to account for the physical location of your data at any given time (see below). Of course, this all gets sticky (or maybe not – I don’t know the law when it comes to this stuff) with CDNs and other edge caches – and there’s always those damned mobile employees with their smart phones, laptops, and unencrypted tape back-ups left on the back-seat.

Fear of Subpoena

“But there’s another concern,” parried Apik. “Isn’t it true that, as a result of the USA PATRIOT Act, the Canadian government instructed departments not to use computers operating within US borders, because it had concerns about the confidentiality and privacy of Canadian data stored on those computers?” –From Security and Resilience in Governmental Clouds

If you’re in the EU (or any non-US state), you fear the PATRIOT Act. If you’re a Yankee, you fear the EU and its privacy and data laws. In both cases, you fear your data being stored on disk in some jurisdiction that will subpoena your data and then make it part of the public record (or otherwise “violate” the secrecy of your data). It’s not that some hacker will illegally acquire your data, it’s that a government will do it in a highly legal way, causing all sorts of problems.

One person reminded me of the story of a man who’d murdered his wife and otherwise looked innocent (along the lines of India’s “The Google Murder,” if not that story). With two subpoenas – one to get his list of IP addresses through his provider, and another to Google to find all searches that IP address had done – the government tracked down his extensive research into how to cover up a killing. You can imagine similar cases with corporate crime, esp. if hosted email, document storage, and other collaborative applications and data are involved.

The question here becomes what governments you trust and how much your business depends on not complying with judge’s requests, and the law, really. If you do a lot of shredding in your business, then you should be afraid – you need to make sure your “cloud data” gets shredded before it gets exposed.

I’m not condoning breaking the law, but clearly skating along the edges of illegal, unethical, and “pushing the boundaries” happens as part of many companies’ daily business. In the tech world, for example, if you don’t have the US Justice Department continually knocking on your door to see if you’re being too monopolistic (that is, grabbing too much of the market) you’re probably not pursuing as much money as you could be.

Whether or not you’re breaking the law can be largely irrelevant; what you really fear – the real risk – is exposing secrets that your partners and customers would find repulsive. Think Wikileaks for corporations: how would the market react if it really knew what you thought about customers, investors, and all those other “idiots” who won’t just shut up and pay you already?

Pricing is a good example. Good profit margins typically require information asymmetry when it comes to pricing: the most profitable buyers are the ones who don’t know what they should be paying and, thus, pay too much. So if some subpoena exposed pricing info (more than likely because of a completely unrelated matter), that asymmetry would be nicely screwed. There’s nothing specifically “cloud” about this example or most others: it just illustrates what companies (should) worry about.

If your data gets subpoenaed, it’s out in public. There’s no chance to spend weeks hunting down where those email and document archives are and then conveniently not find them after your data center gets flooded or whatever. Catching Bradley Manning is useless once all those cables are leaked (at least with respect to the past).

Your application is not secure

[T]here are a lot of people who start and go, oh, I am going to make a website, it’s going to swap MySQL up there. I am going to build all this junk, and then at the end, right before I launch, I am going to make it secure.

But the problem with that is, by that time you are screwed, because to do it secure, means that it actually really reflects a lot of your technology choices.

David Barrett, Expensify.com

You’ve been working on an app behind the firewall, and haven’t had to actually make the application secure enough to run on the public Internet. Once you move it to the cloud, there’s little in the way of a firewall to protect you. You have to start thinking about security and writing your application appropriately.

I suspect this is one of the real problems with “cloud security”: calling in the technical debt you’ve accumulated under the security column over the years. Folks like CohesiveFT would suggest that they have solutions for this – hopefully they and others do.

Regulations and Compliance

For whatever reason, the compliance regime you need to follow does not allow for cloud architectures and topologies. It could be something as simple as “you must own the hardware.” PCI is an oft-cited one, and my conversation with Expensify’s David Barrett provides one account of how common cloud offerings didn’t seem to cut it. It’s too long to quote in full here, but it boils down to:

  1. Requiring two people with separate keys to do major tasks – “Split Knowledge, Dual Control” key requirements that ruin auto-booting of servers
  2. Networked transactions – “if you are moving like a $10,000 expense report, and your server crashes halfway through, like you want to know, like did that money move or not?”
  3. The need for redundant datacenters for your (not a typo) redundant datacenters – “you have to have at least three real-time synchronized data centers, to do real financial activity, in a way that’s actually sort of reliable and secure”
  4. A whole bunch of other considerations

Most of these, you’ll notice, are not really “security” related, but these items come up in “cloud security” conversations because of their relation to PCI.

Larry Carvalho points out another example:

Regulations force telecom providers in several countries to keep all their data on call records in the home country. If a public cloud provider cannot provide that guarantee, it is one thing that will immediately disqualify a public cloud as a solution.

The point isn’t that “The Cloud” can’t comply with these regulations, it’s that you have to make sure they do…if you care about them.

Who’s responsible when things “go wrong”?

When errors and problems occur, folks have told me they’re uncertain if the cloud provider is responsible, if they are, if the network connection is at fault, if the user is: who gets pinned with the blame, and, we’d assume, the responsibility to fix the problem?

This is the one that flummoxes me the most because I tend to feel that when things go wrong, everyone is responsible. (That’s not realistic, of course.) Also, this seems like another way of saying that some clouds’ customer service and “enterprise relationship management” is bad.

“Accountability” feels like what people are trying to hunt down and most people are unsure about what various cloud offerings have to offer. IBM’s Steve Mills recently suggested that, as an example of good, enterprise customer service, IBM knew when to genuflect as needed in front of the customer. Others have snarked that part of the implicit role of big IT vendors is to “fall on their sword for you” when things go pear-shaped. All of that revolved around saving someone’s job, not really solving the problem – but often, keeping that pay-check coming every month is the only SLA that matters in The Enterprise.

Here, a large part seems to rest on trust. Do you, the consumer of cloud services, trust that the provider is not only secure but will “do the right thing” when problems occur? Traditional enterprise vendors have had decades to prove (or not!) that trust and have established relationships with customers to create it. New cloud providers often don’t have that luxury.

There is little risk/benefit analysis available

The best way to secure an application is to delete it and its data. Just wipe it off the face of the earth and there’s no way it can be hacked. Of course, that’s not reasonable: security is irrelevant if you have no customers!

At some point, you weigh the risks (including security) of letting the application loose into the wild against the benefits (usually money) it will get you. You do risk/benefit analysis and at some point when the benefits out-weigh the risks enough, you’re “secure enough” to release the application.

Think about e-commerce, online banking, PayPal, online gambling – all of these are (to retro-actively apply the term) cloud-based applications that could go seriously wrong (lose lots of money) if compromised…and yet they can be seriously profitable if they work.

Such analysis might take the form of SWOT or other exercises that go over worst- and best-case scenarios when it comes to security and business benefit. There are even fewer “standardized security processes for the cloud” available. Several enterprise architects trying to sort out cloud security have told me that they can’t find a risk/benefit analysis, well, “template” for the cloud space. (See this EU report on cloud security for more discussion on this topic.)

Cloud security is not much different than plain old security

The end result of all my scouting around for what “cloud security” means leads me to one conclusion: it mostly means the same thing as plain, old security. That is, the same general practices and thinking that lead to secure applications can guide you to get security in the cloud (these are just a few):

  1. Your applications are going to be insecure if you don’t spend a lot of time securing them.
  2. You want your providers to follow secure practices and keep their systems patched and up-to-date.
  3. There’s always going to be some paper-work (some helpful, some not) that you have to comply to.
  4. Much of your security work will be counter-measures and hunting down bad agents. Technology is often used more as a tool than as “the solution.” See these examples from Jeff Jonas for some fascinating Las Vegas “hacks.”
  5. You need to balance all of this effort, and possible disasters that can occur, against the possible up-side of the application you’re sticking on the cloud.

What people fear is The Fear of the New – The Unknown. They think something about cloud computing is wildly different – more than likely, it’s about exposing your technical debt in security, your current weaknesses, not introducing new problems.

Background and more

Categories: Cloud, Enterprise Software, Systems Management.


Eating the full cloud pie – highlights from Randy Bias' guest appearance

There is no half-steppin’ in cloud, guest Randy Bias of Cloudscaling, IT Management and Cloud Podcast #087 – Transcript

View more documents from Michael Coté

Going full-tilt on cloud is a lot different than just installing some cloud products and stacks. That’s the take-away from reviewing a conversation I had recently with Cloudscaling’s Randy Bias; the full transcript is in the above PDF (or go to the original IT Management & Cloud podcast show-notes for the plain-text transcript).

Here are some highlights from that conversation (all from Randy):

On “Enterprise Clouds”

I have had this kind of like rant about the enterprise cloud lately is because, I really figured out lately that this whole approach to building sort of these “enterprise clouds” is fundamentally broken from the ROI point of view.

I mean, you have sort of got this weird disconnect or you have got the larger service providers, I don’t want to name anybody’s name, they are pretty obvious when you go out there and look at them, they have got these big enterprise spaces and they are saying, “hey, enterprises don’t want what Amazon has got, they want something different, they need to support all these legacy applications.”

So they are trying to build these very complex, very expensive clouds that are not going to be anywhere near cost-competitive with Amazon. And at the same time you look at the centralized IT department and they are making a decision. They are like, “well, are we going to outsource all these legacy apps and our jobs go away, or do we just build our own internal private cloud?” Most of them are choosing to go down building that internal private cloud route.

So you have got centralized IT going to the enterprise vendors to build an infrastructure that looks exactly the way these external public enterprise clouds look, same people, same technology, same management processes. And I don’t understand how there’s — I don’t see success in the future for either of those paths, and they are inherently competing with each other as well, and Amazon has kind of run away.

On security

Security is sort of a nonstarter. You can build a cloud to be as secure as you want, doesn’t matter what techniques you use. I just pretty much ignore that.

On what cloud operations looks like

Any kind of infrastructure cloud is basically going to look a lot like Amazon. Your CAPEX costs are going to be reduced by something like 75%. Your operational costs are going to be reduced similarly, at least for the infrastructure side. And you will probably see a change of a factor of 10 or a 100x in the number of infrastructure people you need to run a successful private cloud.

Any kind of IT that provides simply basic services to the business probably shouldn’t be run by the internal IT department. The internal IT department should be focused on those parts of the business that are fundamentally differentiating and that should be what your private cloud is focused on.

On the need to be transformative, not just install things

[T]he thing there is that people are still looking at this as sort of a product problem instead of a transformation problem, and I have literally had senior enterprise people say to me, “wow, we are buying this new automation software, we are going to put it in our data center and we are going to turn our data center into a cloud,” and I just tragically fell off my chair laughing. It was like, there is no software you can buy to automate your data center and turn it into a cloud; if there was, somebody would have been successful with all the attempts that have happened over the past 30 years to automate data centers. I mean, that’s not what’s happening.

…people look at it as sort of being solved by products, and I don’t think it can be solved by products, it has to be solved by a combination of products, architecture, and cultural change.

On standardization, simplifying IT

How has Amazon got 400 engineers and data center techs basically running 80,000 plus physical servers? I mean, it’s because they are doing things very differently [than traditional IT]….

And part of the economies of scale is like very homogenous environments. Like Google is reputed to have five hardware configurations across one to two million servers, whereas in a typical enterprise environment I have seen hundreds of hardware configurations across a much smaller footprint.

If you liked the above, check out the full episode, it’s chock-full of nice cloud commentary.

Categories: Cloud, IT Management Podcast.
