Al Harris On Kiro And Spec-Driven Development

In this New Builders conversation, Al Harris, Principal Engineer at AWS, discusses the development of a natural language system aimed at helping developers translate specifications into code and vice versa, particularly as it relates to AWS’s agentic IDE, Kiro. The conversation explores the significance of this translation process, the challenges faced in natural language processing, and the potential future of software development with AI integration.

This is a RedMonk video, sponsored by Amazon.

Transcript

Rachel Stephens (00:04)
Hi, I’m Rachel Stephens and I am a research director with RedMonk, the developer focused analyst firm. Essentially that means that my job is to help do investigation and research into technology tool trends, particularly from the point of view of the developer and practitioner. And one of the trends that has taken the development world by storm in 2025 is agentic IDEs and specifically spec-driven development.

And so there is no one better to talk with me than the lead developer of Kiro. So with me today, I have Al Harris, a principal engineer at AWS. Al, thank you so much for giving me some time to talk through some of these exciting trends.

Al Harris (00:43)
Yeah, thank you. Really excited to be here, Rachel. I can talk about Kiro and specs all day. You know, I think it’ll be hard to keep this short.

Rachel Stephens (00:50)
It’s gonna be great. I’m looking forward to it. So, Kiro came into the world this summer and is going GA at re:Invent. I would just love to hear more about the process of getting this project off the ground, from instantiation to where it is today. Can you run us through what that process was like?

Al Harris (01:11)
Yeah, definitely. At Amazon, we go through the PR/FAQ process: we write kind of the press release before we start writing code in most cases. And so the PR/FAQ for Kiro was written, I think, sometime early last year, early ’24. I first found out about the project when I was actually on paternity leave. I got an email from my VP congratulating me on starting Kiro. You know, I came back from leave and said, “Hey, what is Kiro?” And then we kind of hit the ground running from there.

Rachel Stephens (01:37)
I love this. Congratulations on both of your babies.

Al Harris (01:37)
There was a lot of ambiguity. Exactly. Yeah.

Yeah. They were about the same age. And yeah, there was a lot of ambiguity early on. We knew we wanted to build a system that let developers take spec to code and code to spec, using something along the lines of natural language to specify the system you’re building, and then also be able to extract a module, a component, a service, whatever, back into natural language. And that was sort of the brief at the start. We did not know we wanted to build an IDE. Yeah, go for it.

Rachel Stephens (02:02)
Mm-hmm.

Okay, so at the start, it wasn’t for sure an IDE, it was just a developer tool. Okay.

Al Harris (02:12)
Just a dev tool. Exactly. Yeah.

Yeah. Yeah. The goal was help developers build systems better using natural language. The IDE was the form factor we sort of landed on after going through a couple of prototypes.

Rachel Stephens (02:24)
And how long did it take to get from kind of that initial PR/FAQ to developing code to getting it out the door?

Al Harris (02:31)
We went through a few months of a prototyping phase. We built, I think, three or four different iterations of the system, kind of from the ground up. In fact, I was just going through some of the old docs prepping for this. And it was a little bit funny. You kind of forget about these things a year on, but we had three totally different ideas we were kicking around, looking at market competitors, image competitors, saying, hey, what are people doing that works? What are people doing that we don’t like? But ultimately, what is the tool we want to build?

Rachel Stephens (02:33)
Mm-hmm.

Al Harris (02:55)
The anecdote I kept using with the team, and I think it was actually a poor way to describe it, but at the time it was my best way of explaining what I wanted... Maybe the first thing I should say is that I was sort of an LLM and GenAI hater when I was put on this project. Pretty bearish on the technology. It was like, I’m not seeing the value. I’ve seen some of these kind of early tools, but I’m not realizing any benefit myself.

But after integrating with these things and building sort of the first iteration of what we considered to be a spec, I was like, “Oh, this is rough, but there’s really something kind of cool here.” Like, this is obviously not a shippable product, but when I have my thumb on the scale and I can control how this system is going to work, I can actually achieve something I want to go build. And I had to be very specific, and I had a spec to describe how I wanted to build, for example, a RESTful CDK app, a super constrained problem set, but I was able to move very quickly on that. And that demo gave me a lot of confidence that there’s definitely something we could do here.

I think we prototyped for a few months. We had sort of a rough plan of where we wanted to go by November of last year. November to January, we got, I would say, our first sort of iteration of a semi-complete system put together. And then we had a demo video that went out in January of this year that I think garnered a lot of attention from our own leadership, and people started taking us a little bit more seriously. I mean, we were a small team and we still are a reasonably small team, but for those first six months, we were three to five people just kind of sitting in a closet iterating.

Rachel Stephens (04:22)
Yeah.

I saw the first demo of it in February of this year. I saw one of the early ones. But I also was going back through my notes for this. And one of my initial notes was “spec is holy shit.” I loved the Markdown way of describing your intention. Like, for me, it’s always resonated as a general thought process.

Al Harris (04:33)
Yeah. Hahaha! Yeah. We’ll probably get there a little later in the conversation. But we initially didn’t even have Markdown, like that was not the plan. We had this whole structured notebook. And it was like, we built all this stuff, and then looked at it and said, this doesn’t work for everyone. Like, this is not a universal interface. We need to scrap this and start over. So yeah, definitely getting back to plain text every time.

Rachel Stephens (04:49)
Yes. I want to spend some time going through what those iterations looked like. But before we dive in: you’ve been actively building this for 14, 15 months, it sounds like. Which means that you were already in flight building this tool when the concept of vibe coding came along, like that infamous Andrej Karpathy tweet about what it means to vibe code. That happened after you had started.

Al Harris (05:14)
Yep. Yeah.

Rachel Stephens (05:31)
MCP existed at the end of last year, but it didn’t really take off into the world until early 2025. So that all kind of came into being as you were going. Tell me about the product development process in a world where everything is moving so fast and when you have to adapt to some of these like really significant and seismic changes in the entire landscape.

Al Harris (05:51)
Yeah, I mean, there’s no way to do it other than throwing out your preconceived notions every half year. I mean, products that people were really, really bullish on six months ago are not talked about today. You have to keep up. I think we’re sort of doing our best and hoping to maintain relevancy, but it’s challenging. I think the goal has to be really small teams that can move quickly. There have been some interesting, I think, blog posts coming out from other DEs at Amazon, Joe Mag and Anthony Liguori posting on sort of some of the things they’ve been doing with GenAI, moving really quickly. But ultimately we have a very small team of core people that really understand the system that we built, and they can move quickly. And even if we say, hey, we’re gonna scrap this whole thing that we just spent three weeks on, which was a lifetime at that point. I’ve built a lot of products at Amazon, by the way, and this is by far the fastest-paced thing I’ve ever worked on. It’s absolutely wild.

I’m not sure if it’s been a week or 10 years. Like you mentioned, MCP existed at the end of 2024, but it was something we talked about like, oh, that’s pretty cool, that’s interesting. I think by Feb of this year, every VP I knew was talking about MCP. It’s like, you know, we have CEOs talking about it. It was a very weird shift to have what felt to me like a detail, an implementation detail of the system, suddenly being discussed at a very high level.

But ultimately we didn’t change the process much. We had to keep going back and saying, we are not happy with what we’ve built. We don’t think this is it. The ecosystem has changed and we need to be able to kind of scrap the old way of thinking that we had six weeks ago, three weeks ago, whatever, and start over. And we’ll kind of, I think, talk a little bit about how spec changed, but even just what the product was changed over and over again.

Rachel Stephens (07:09)
Yeah.

Al Harris (07:35)
Initially, for example, vibe coding, our agentic chat experience, was completely separate from the spec experience. These were basically two unrelated features within the product suite. And those have kind of come back together over time. But that was kind of one of the big changes: again, text is sort of this universal entry point. We need to make sure that all parts of these systems understand and can sort of ingress you to the other features. So if I’m doing inline editing,

Rachel Stephens (07:35)
Yeah.

Al Harris (08:03)
it can get me into spec development, or if I’m doing spec dev, it can move me over to writing code in a specific file, things like that. And that’s when things really started to feel better, I would say, which was quite close to launch, when those things became cohesive.

Rachel Stephens (08:11)
Mm-hmm. You mentioned that there were many iterations of what it meant to kind of create a spec. How do you define spec-driven development now? And maybe compare and contrast it to vibe coding, just because I think that’s a more common entry point for people.

Al Harris (08:27)
Yeah. Yeah, I would say personally, the way I keep defining spec-driven dev and even people on the team might disagree with me on this, but it is a structured workflow with a specific set of artifacts that you generate and persist that will give you reliable and reproducible results. Each of those statements is doing quite a bit of heavy lifting. And sort of throughout the process of developing Kiro and shipping Kiro, those things have changed.

But that is, I think, consistent across the ecosystem, looking at what spec means to different companies and different people trying to build. This is not a novel goal, right? A lot of people are trying to do it right now. And I think we’re all trying to figure out what the best way to do it is. But it is that I want a structured workflow. I want you to take me through some semblance of an SDLC. These things exist for reasons, right? We all went to agile training a decade ago because it was the way to do things. And then XP and so on. Now we’re all going to be spec devs.

There always has been sort of this SDLC and you have to adhere to something, I think, to get reproducible results across a large team. One person can do whatever they want and do it pretty well because you can hold that process in your head, but you need to formalize the process and then actually persist and share the artifacts, I think, in order to be successful as a scaling team.

Rachel Stephens (09:42)
Gotcha. One thing I’d love to hear you talk about more. So we’re talking about how Kiro can produce reproducible artifacts and some of the different ways that you can give Kiro direction. You can give it via steering. You can give it hooks. You can give it your actual spec or you can prompt. Can you kind of talk about where somebody should do each? What exists for what?

Al Harris (09:50)
Thanks. That is a good question. And that is a question we’re still working on. You know, we’ve built all these features and we built them because we think they’re useful. Even within the team, I had a 90-minute session with the whole team where we kind of sat down and said, hold on, you know, we need to think about the way we’re doing things. And ultimately, for example, steering was intended to be a way to provide additional context to the agent at different life cycles of sort of your development. You would have sort of an, I forget what we call it, an always-included steering doc for things like: this is how you run tests and builds in my system, here’s how I like to lay out the code, here’s how, you know, I prefer... Yeah, please.

Rachel Stephens (10:46)
Can I tell you what I always use steering for? I always make my agent talk like a pirate to me. This is my favorite one.

Al Harris (10:52)
That’s a good one. We were just talking about how somebody on the team has theirs chat to them like a netrunner. So it says “choom” a lot. Yeah, yeah, there’s a lot of fun ones. These were the best way for us to burn tokens, I think, early on. But even for steering, an idea we were kicking around was, well, we don’t have a good concept today of memory in Kiro. We certainly want it. We know it’s a gap.

Rachel Stephens (11:02)
I’m gonna have to try that one next.

Mm-hmm.

Al Harris (11:18)
But can we have the poor man’s version of memory by having a steering doc that’s manually included that has very extensive docs on, let’s say, how to write integration tests, and then a short steering doc that’s always included that’s just a pointer to that thing, right? If you need to know more about running integration tests, read this file. So even within the team, we’re still trying to define those things. But ultimately, steering exists to help you build the context you want the agent to know about, either based on what you’re doing right now or what you’re doing all the time.
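
The pointer pattern described here can be sketched in a few lines. This is only an illustration under stated assumptions, not how Kiro implements steering: the `SteeringDoc` class, the `"always"` and `"manual"` inclusion labels, the `build_context` function, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SteeringDoc:
    name: str
    body: str
    inclusion: str  # hypothetical labels: "always" or "manual"


def build_context(docs, requested=()):
    """Collect the steering text an agent would see for one request.

    Always-included docs are added every time; manual docs are added
    only when their name is explicitly requested.
    """
    parts = []
    for doc in docs:
        if doc.inclusion == "always" or doc.name in requested:
            parts.append(f"## {doc.name}\n{doc.body}")
    return "\n\n".join(parts)


docs = [
    # Short pointer that is cheap to include on every request.
    SteeringDoc(
        "testing-pointer",
        "To learn about integration tests, read integration-tests.md.",
        "always",
    ),
    # Long reference doc that is only pulled in on demand.
    SteeringDoc("integration-tests", "Long, detailed guide ...", "manual"),
]

default_context = build_context(docs)
full_context = build_context(docs, requested={"integration-tests"})
```

The design choice is that the default context stays small, while the pointer tells the agent where the detailed material lives when it is needed.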

Rachel Stephens (11:26)
Mm-hmm.

Mm-hmm.

Al Harris (11:44)
Personally, when I’m doing development, I have an extension I distribute with the team called Luva, which I think is Portuguese for glove, but it was just a name that, you know, Kiro suggested to me. Basically, it’s the tool that I give our team to help them develop Kiro more efficiently. And that’s something where, you know, code quality sort of doesn’t matter, because I’m just going to ship this to five people to move fast. But in that case, every time I do anything, I do it with a spec. I have, I think, 19 specs in that, even though it’s a fairly small code base. Each of those is a feature I’ve built.

And then every time I run a spec task, we’ll kind of figure out what failed, what didn’t go well, how did we debug it, how did we build it? And I’ll ask the system to go and update its steering files. So over time, it’s gotten really efficient. It understands the build process well. This is of course just a trite example, but ultimately nothing’s changed in the GenAI world. You still have to go sharpen your tools. That is using the steering.

Our intern just shipped an improvement to hooks, so hooks have more support now. I’m hoping that is out by the time this video drops. I’m very optimistic. But so now you can do things like run commands at different life cycles. And that’s useful for things like, internally at Amazon, people use multi-package workspaces quite a bit when they develop, and understanding which parts of your workspace are loaded is really useful. So running like a workspace show shell command is really helpful at the beginning of a coding session. So that’s like an example of a very common ask we’ve gotten internally from Amazon builders.

And then ultimately, spec is gonna be the way that you kind of say, we’re gonna go build a feature, we wanna document it, we wanna go through the process, et cetera.
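
The idea of running a command at a lifecycle point, such as a workspace listing at the start of a session, might look roughly like this. It is a generic sketch, not Kiro’s hook format: `HookRegistry`, the `"session_start"` event name, and the stand-in command are all assumptions made for illustration.

```python
import subprocess
import sys
from collections import defaultdict


class HookRegistry:
    """Maps hypothetical lifecycle event names to commands to run."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def on(self, event, command):
        # command is an argument list, as accepted by subprocess.run
        self._hooks[event].append(command)

    def fire(self, event):
        """Run every command registered for the event and return their output."""
        outputs = []
        for command in self._hooks[event]:
            result = subprocess.run(
                command, capture_output=True, text=True, check=True
            )
            outputs.append(result.stdout.strip())
        return outputs


hooks = HookRegistry()
# Stand-in for a real "show my workspace" command, so the sketch runs anywhere.
hooks.on("session_start", [sys.executable, "-c", "print('packages: api, web')"])

# The captured output could then be handed to the agent as extra context.
session_context = hooks.fire("session_start")
```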

Rachel Stephens (13:15)
So if I were going to shorthand what you just said, so like it would be you’re not having to write so much spec every time. Is that the right way to think of it? Or not quite?

Al Harris (13:19)
I would, I think that’s a good way to use it. I think of them as sort of orthogonal concerns. Spec is what I’m doing and steering is how I’m doing it. Yeah.

Rachel Stephens (13:30)
Okay, gotcha, that makes sense. Help me just place Kiro inside the broader landscape. So we’ve spent a lot of time talking about spec-driven development. A lot of the tools out there have kind of adopted the vibe coding approach to things. Talk to me about how you view the landscape of your tool and all the tools around you.

Al Harris (13:53)
Yeah, I mean, ultimately, the feature I still want to ship is the same feature I started on day one. And that is the ability for devs to go from spec to code or code to spec. My charter is to do that in the most efficient way possible. Shipping an IDE was sort of a necessary evil to do that. But we’re also now exploring how we can decouple sort of that process from the IDE surface. That’s not a short-term plan or even a goal right now, but that is one way we’re thinking about it,

Rachel Stephens (14:01)
Mm-hmm.

Interesting.

Al Harris (14:19)
right? And that’s, I think, natural, given the direction the ecosystem is moving, right? People love Claude Code Web. You know, I can tag Cursor in a GitHub issue, and it just goes and works on it. Like, that’s kind of the dream, but we want to find the right way to build spec in a way that you can iterate on it sort of asynchronously, or with multiple people or whatever. But ultimately, like,

Rachel Stephens (14:19)
Mm-hmm.

Al Harris (14:42)
that only matters if we can do it in a high-quality way. And that means we can continue to have some semblance of the software dev lifecycle that we think matters or that you think matters. TBD, we’re looking at different entry points there, because right now requirements, design, task list is kind of heavyweight for everybody, even though we think they’re high-value artifacts. Ultimately we wanna be best in class in the spec-to-code space.

There are going to be people who are going to blow the pants off us in the IDE space. We just don’t have the org size for it. We don’t have the experience. We want to help customers build code and ship solutions to problems. And I think that’s always been the Amazon and AWS way, right? We’re not necessarily a tooling company. We are a solving-your-problem company.

Rachel Stephens (15:26)
That actually takes me into one of the questions that I think is most intriguing about Kiro, and kind of talking about that Amazon way: this product kind of broke a lot of the norms in terms of products that we see come out of Amazon and AWS. I think for me most notably is that Kiro shipped as its own standalone brand rather than being under either the Amazon or AWS umbrella. Talk to me just about

Al Harris (15:37)
Who?

Mm-hmm.

Rachel Stephens (15:49)
what that looks like from the inside. Why did you make those choices? Maybe all these choices happened way above both of our pay grades, I don’t know, but just what do you see from the inside?

Al Harris (15:58)
Certainly these decisions were made well above my pay grade. I’ve never had a chance to influence the CEO before, so this was an interesting one to be certain. Where Andy Jassy and Matt Garman want to position this is ultimately totally their prerogative and I have no skin in the fight. What I care about is that developers can use Kiro. And that means supporting social sign-on and social sign-on sort of…

Rachel Stephens (16:23)
I love, that’s another good one to highlight.

Al Harris (16:24)
in it. Yes. But that inherently breaks a ton of, like, assumptions that everybody at AWS always has, which is you have an AWS account, you have a credit card on file, you have an address and yada yada yada. That’s so much friction. I don’t know the number of clicks to create an account. It’s a larger number. Our edict was, I think, two clicks to be signed into Kiro, like a download and a start. I’m not sure if we hit that or not. But that was critical to us, right? To make sure that you as a standalone developer can use Kiro and ideally love Kiro. You know, my director went and figured out all the escalations we needed to do social sign-on and do all those things. And so my job is to make sure that when you use it, you’re loving it. But that ultimately, I think, actually led to a lot of these decisions, just because we broke the mold a lot from AWS. In terms of how we operate, it’s not…

Rachel Stephens (17:14)
Mm-hmm.

Al Harris (17:17)
fundamentally different than other teams at AWS. We are, I think, smaller and we’re trying really hard to sort of insulate a little bit more so we can move fast, because we need to move fast in this space, kind of like we said earlier. Let’s move in. Yeah, exactly. Yeah, we can’t spend six months thinking about an idea. It has to be done in two weeks. Yeah.

Rachel Stephens (17:29)
Yeah, space is moving quick. Yep. Yeah. So speaking of all the things we’ve got done, can you give us an update on what’s launching at re:Invent?

Al Harris (17:43)
Yeah, this is.

Rachel Stephens (17:44)
I mean other than the tool at large, you’re going GA, correct?

Al Harris (17:48)
GA I think is mostly, you know, it’s for looks. Not much is functionally changing with the GA.

Rachel Stephens (17:52)
Because you really shipped a polished product upfront. I think that’s another part of what was a mold breaker though, is that it was not something that shipped on kind of a deadline-driven schedule. It was: when we are ready to ship this.

Al Harris (17:56)
That’s… I mean, yeah, I want to say the two to three weeks leading up to the launch, it was a daily review with my senior leadership basically saying, what did we get done yesterday? Do we love this yet? No, let’s go back and try again and see you in 24 hours. Like, we did that kind of daily for quite a while, just trying to kind of get the little things polished up. And I mean, if you’ve used it since then, you know, it’s not a perfect product. There’s a ton of things we’re not happy with and a ton of things we have to do. So that said, I know you had some preview

Rachel Stephens (18:33)
For sure.

Al Harris (18:37)
build that you saw. It launched much better than that. Those last three weeks were critical to getting it polished. But I think coming into re:Invent, we’re launching a bunch of really cool things. You know, my buddy Jason just launched a remote MCP. Peter’s launching multi-workspace support. Everybody on the team is kind of shipping some feature. We’re going to have more efficient summarization coming from Karthik. Raghav, the newest engineer on the team, added checkpointing.

So these are all things that are launching that are not like novel and huge features in and of themselves, but they’re turning it into a complete and cohesive product and one that you can use just sort of all day. Also, we’re shipping a way to share config across users. So this is going to be bundles, some sort of registry for MCP, steering docs. Yeah, we want to be able to share these packages of, you know, how do I integrate with X and Y external company really quickly and easily. I know we’re doing a lot of work in the next few weeks to make enterprise onboarding easier.

So yeah, it’s a lot of little things and then a couple big things coming down the pipe, but ultimately it’s just really trying to get the product to a place that we feel like it’s whole.

Rachel Stephens (19:40)
Now I’m realizing that you probably have a lot of limits on what you can tell me in terms of outlining the roadmap, but can you kind of just paint a picture of how you see the world of spec-driven development advancing in 2026?

Al Harris (19:54)
I think that ultimately we need to solidify the processes that have been built. Right now, spec-driven development, even in Kiro, is still very much vibe coding with a structure. It’s going to be putting in analytic tools, neuro-symbolic analysis. I think there’s going to be a little bit of talk about this at re:Invent, but using some of the scientists at Amazon to build non-LLM and sort of non-textual ways to represent the correctness of the solution that you’re building with spec.

Rachel Stephens (20:24)
What I just heard is making it more deterministic. Is that accurate or no?

Al Harris (20:28)
Yes, that’s a great way to put it. Thank you. You stole the words, not out of my mouth, because I didn’t say it, but no, no, definitely. Yeah, it’s making it more deterministic, making it something that you can trust and making it something that is actually reliable. Because right now for us, reproducibility, reliability, and determinism in the system are based on really begging the model to try hard to do a good job. And that absolutely does not scale. So we’re looking at different techniques to,

Rachel Stephens (20:33)
Sure. I wasn’t lost. I didn’t mean to steal words.

Al Harris (20:55)
for example, measure the ambiguity in the requirements you give the system, to say, as you’re building a design, how do we actually have high confidence that the implemented solution, when you sort of run the spec tasks, actually meets the requirements of the initial requirement list you generated? Those are all, like, non-trivial problems in and of themselves, but those are the areas we’re investigating, making sure that we’re not just saying it’s better, but that it actually does lead to better software. Because ultimately you probably don’t want to care about the code in most cases, but you do care about the fact that whatever the LLM does is actually adhering to the invariants and the properties you provided at the start. That’s super critical to us. And right now we’re begging and borrowing, but, you know, there’s no Al Harris guarantee on that for sure. And that’s what we’re working towards.
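
To make the idea of a deterministic, non-LLM check concrete, here is a deliberately simple sketch: verifying that every requirement in a spec is claimed by at least one task. It is far weaker than the neuro-symbolic analysis mentioned above and is not something the conversation says Kiro does; the `Task` class, the `covers` field, and the requirement IDs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    # IDs of the requirements this task claims to implement (hypothetical field).
    covers: set = field(default_factory=set)


def uncovered_requirements(requirement_ids, tasks):
    """Return the requirement IDs that no task claims to cover, sorted."""
    covered = set()
    for task in tasks:
        covered |= task.covers
    return sorted(set(requirement_ids) - covered)


requirements = ["R1", "R2", "R3"]
tasks = [
    Task("Add login endpoint", {"R1"}),
    Task("Write login tests", {"R1", "R2"}),
]

# The same inputs always give the same answer, with no model in the loop.
missing = uncovered_requirements(requirements, tasks)
```

A check like this only proves that a task claims a requirement, not that the resulting code satisfies it, which is the harder problem described above.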

Rachel Stephens (21:44)
Fair enough. Well, yeah, to me that sounds like there are things that LLMs are amazingly good at and there are things that are inherently challenging with them. So trying to channel the things that they are good at and then help augment everything else, like that makes sense to me.

Al Harris (22:01)
Making sure we’re not using this hammer for every problem, you know, that’s not a nail.

Rachel Stephens (22:05)
Yeah, it’s a good analogy. Well, Al, thank you so much. I know you are busy all the time, so I appreciate you taking any time out of your day to talk with me and chat about all the things about spec-driven development.

Al Harris (22:20)
No problem whatsoever. I’m super excited to be here and thanks again for having me. It’s been fun chatting.

Rachel Stephens (22:25)
It’s been great. Thanks.
