RedMonk

Justin Sheehy on Basho, NoSQL, and Velocity 2011

While at Velocity 2011, I asked Basho’s Justin Sheehy to tell us how things have been going at Basho and what the current state of the NoSQL world is. We also have a good discussion of how developers are finding the “post-relational database world” and how GitHub plays into Basho’s business.

Transcript

As usual with these un-sponsored episodes, I haven’t spent time to clean up the transcript. If you see us saying something crazy, check the original audio first. There are time-codes where there were transcription problems.

Michael Coté: Well, hello everybody! Here we are in Santa Clara at Velocity 2011, and I ran into an old friend of RedMonk and I thought I’d get an update about kind of what’s going on and then what’s going on in the NoSQL area. Why don’t you introduce yourself?

Justin Sheehy: Sure. I am Justin Sheehy. I am the CTO of Basho Technologies. We make Riak, Webmachine, and some other open source software.

Michael Coté: So, I mean, how have things been going for Basho?

Justin Sheehy: Oh, it’s been going fantastic. The past year or so has been really exciting. Things are looking better on the money front and, more importantly, on the developer front.

Michael Coté: A good front.

Justin Sheehy: That’s right. That’s all that matters. So we are looking for — the next few months are really going to be amazing. We’ve got a fantastic team that’s only gotten bigger and better. In the next few months, people even just watching GitHub are going to see things start flying really fast and furious.

Michael Coté: Well, you know what, why don’t we rabbit hole into that first? How are you fitting GitHub into — I mean obviously, the development side, but how does that fit into the business side?

Justin Sheehy: Sure. So a big part of our business is that it’s not just a software business; it’s an open source software business, and today, the easiest and best way to engage with the open source community, no question about it, is GitHub. And to me, that’s much less about the specific technologies involved, Git’s great and all that, but it’s much more about the way people are used to interacting there, and it’s a very contribution- and communication-heavy environment.

Michael Coté: Right.

Justin Sheehy: Since we moved our development to GitHub, the amount of community involvement with the code as opposed to just the ideas and the documentation has really shot up and it’s been great for the product.

Michael Coté: That’s interesting. So there’s sort of trackable more contributions code-wise.

Justin Sheehy: Yes, no question about it. The rate of people actually contributing improvement and –

Michael Coté: And do you pay attention to, like, people who follow your stuff and who fork it and things like that?

Justin Sheehy: We look at it and we track it, because you’d kind of be crazy not to, since the information is there, but I am skeptical that things like number of followers, and maybe even number of forks, mean all that much compared to things like number of pull requests. I mean, that’s heavy engagement.

Michael Coté: Yeah, I guess –

Justin Sheehy: But if someone does the fork and then does a bunch of work in it, I’d like to see that.

Michael Coté: Yeah.

Justin Sheehy: But I also see that there are a lot of projects out there that get forked a lot, and by itself, that doesn’t yet mean anything. It’s been an early indicator, but to me it’s when people start talking back, and that can be in the form of pull requests or lots of other things, that’s really exciting.

Michael Coté: So for people that don’t know, can you explain the portfolio that you guys have of your core products?

Justin Sheehy: Sure. So our core product and the thing we sell is Riak. It’s a distributed database, and the two biggest reasons people go to it are for extremely high availability and for easy scalability, and that easy part is a big deal; it’s really easy to install, really easy to operate, really easy to interact with as a developer. Around that, we’ve built an ecosystem of other open source tools that solve niches we cared about and that we put out there in the open source community. Things like Webmachine, which is a toolkit for building REST-style applications, things like Rebar, which is a build tool, and all those sorts of things, but Riak is the product that the company is built on.

Michael Coté: And being a database to super-generalize it, what kind of data are people storing in it most commonly?

Justin Sheehy: Sure. So from a business point of view, there’s certainly not a... I would love it if there was a vertical to focus on. But it doesn’t work that way, just the same way that it doesn’t work that way for MySQL or Oracle or anything like that. It’s not the same shape of a database as, say, the ones I just named. It’s not a traditional table-based relational database. But we’ve found that the adjustments that people make from the relational way of thinking to the way that they store data in Riak are very small compared to the operational adjustments they would have to make to solve their availability and scalability problems with those kinds of systems.

Michael Coté: Right. And sort of performance benefits. Like, let me reload the question: for someone who is kind of used to SQL or relational stuff or the traditional way of doing databases, can you walk us through the sort of typical, I want to use a charged word like enlightenment, but how do they get to enlightenment, to like, oh, I get it, here is why I should be using this rather than MySQL or –

Justin Sheehy: Sure. So I actually don’t think people shouldn’t be using that other stuff. There are a ton of applications, and I can think of a couple of times really recently that I was saying to someone, I think the right answer to your problem today is MySQL.

Michael Coté: Right.

Justin Sheehy: Those are fantastic technologies.

Michael Coté: Or Oracle Coherence.

Justin Sheehy: Oh, sure! Yeah, Cameron and company build a great product, but there are cases, at least... it has almost nothing to do with SQL or anything like that. In fact, for most people using databases that speak SQL to them, a lot of the time they are not writing SQL. They are going through ORMs or some other document layer, and if they are doing that, the impedance mismatch that they’ve got to the relational database is huge already, and they don’t actually have to change much about the way that they are thinking about their own code. There are object layers and document layers for things like Riak too. So for a lot of the programmers that happen to be using relational databases, most of them aren’t really using them for relations anyway.
(00:05:00)

Michael Coté: Yeah.

Justin Sheehy: So for many of those people that aren’t using it for huge ad-hoc relational queries most of the time, it’s really easy. And then when they do want to do interesting ad-hoc queries, yeah, we have a different programming model, but that part is not hard.

Michael Coté: Right, right. Well that makes sense. So then broadening the topic a little bit like we were actually talking about this one while recording, I kind of forget when NoSQL kind of started, but it did seem to reach like an apex of fury.

Justin Sheehy: Oh, definitely.

Michael Coté: ...like about a year ago or so, and you always know these things have reached some fury nowadays when there’s almost a redefinition of what the word is. Now I remember there was a big discussion of what is the in-home –

Justin Sheehy: Oh, yeah, definitely.

Michael Coté: Remember that? So anyhow, I mean, like; well, first off like how long do you think this, whatever you want to call this space has been, this sort of post-relational database is the only thing sort of –?

Justin Sheehy: Sure! So it started really in 2008 and 2009.

Michael Coté: Yeah.

Justin Sheehy: You had the first NoSQL events, they named themselves that, right then, and it’s been going ever since. But I think while you could use phrases like post-relational or whatever to refer to the artifacts, the databases, I think that the term NoSQL doesn’t make any sense as a technology category, right.

It’s a negation, or you can play games all day trying to redefine what the No is, and maybe it’s “not only,” hey; but even if you do that, it still doesn’t tell you anything about the category, right. It doesn’t tell you anything about what the things are.

Michael Coté: Yeah, yeah.

Justin Sheehy: And so instead of trying to play games with the word to make that okay, I think it’s not a category at all. But what it is, and what has been going on for the past, I guess, three or four years now, is I think of it more as a movement, and by that I mean a series of events in time. And what that movement is about, or against, and what the No is really about, is a monoculture of database architecture, right, in the sense that a few decades ago, Oracle won. Oracle, the early database, was predating MySQL, certainly predating modern PostgreSQL.

Michael Coté: Yeah, yeah.

Justin Sheehy: – and so on, and Microsoft SQL and all these. Oracle defined the database architecture, everybody else followed, everybody did now. And so for the past couple of decades when people were building a new software project, they’d make a whole bunch of interesting choices, right; what languages to write in, what operating systems to use, but they weren’t really making any interesting choice about their database architecture because choosing MySQL or Postgres or Oracle isn’t — that’s a choice on detailed features –

Michael Coté: And like you’re saying, they built up the whole O/R mapping rule to kind of isolate themselves from that unmovable choice.

Justin Sheehy: Right, and so if NoSQL is anything, it’s a movement that’s sort of breaking up that monoculture of database architecture, right. A lot of the products that get put together, and various software components, as part of NoSQL are part of a movement; they’re not part of a useful category, right. Many of them have very little in common except that they’re all sort of an objection to the idea that there’s only one way that’s sane to think about structuring and storing your data.

Michael Coté: Yeah, yeah.

Justin Sheehy: And it’s not that you need to switch from the old one way to a new one way; that’s equally broken. But the idea is that just like you choose operating systems and you choose programming languages and you choose frameworks, you can choose database architectures. And that’s what’s been going on, and I think it’s finally starting to reach a point of sort of general awareness. Even a lot of people that, in a lot of situations quite rightly, still want to pick the same one they picked before are becoming aware that there’s a choice, and I think that’s a big deal.

Michael Coté: Yeah, yeah. So you definitely feel a lot more exploring of other options nowadays.

Justin Sheehy: Right.

Michael Coté: Yeah, yeah. Now I guess that is like, that was the lasting effect like where the NoSQL stuff is a denominate. Now I think you’re right; it is, people are aware of it, as we used to say, it’s kind of on the shortlist.

Justin Sheehy: Yeah.

Michael Coté: Like people are willing to consider it.

Justin Sheehy: Right.

Michael Coté: ...instead of just thinking it’s some wacky experimental thing.

Justin Sheehy: Even if they don’t consider it, they know they could have, and that’s what wasn’t even true before, right. Most developers five years ago weren’t even aware that there was an interesting choice for data storage other than the Oracle-shaped model, whether it was embodied in MySQL or Postgres or Oracle or whatever.

And so now, they know that choice exists, just like say someone that only ever programmed in Java, right, just to pick an example of something outside databases, might choose no, I’m never going to write my programs in C++, and they never choose that and it’s never on their personal shortlist.

Michael Coté: Yeah.

Justin Sheehy: They know that choice exists. And that’s the piece that’s new in databases right now. And that the mainstream software community is starting to become aware that a choice exists, even the people that aren’t really caring about their choice themselves just yet.

Michael Coté: That’s right. Yeah, yeah. That makes sense. Well, great! Well, thanks for the update.

Justin Sheehy: Well, thanks a lot. It was my pleasure.

Disclosure: GitHub is a client.

Categories: Conferences, Open Source, Programming.