10 Things Developers Want from their Agentic IDEs in 2025

In the afterglow of 2025’s avalanche of AI developer tooling announcements, I’m revisiting the “10 Things Developers Want from AI Code Assistants” series I authored in 2023 and 2024 because THIS YEAR THINGS HAVE REALLY CHANGED. The shift is so significant that I’ve updated the title from “AI Code Assistants” to “Agentic IDEs” to reflect the fundamental evolution in how developers interact with AI-powered development environments.

When I wrote the first installment in 2023, developers were primarily focused on features like autocomplete, test generation, and tab completion. The tools were reactive—they responded to developer prompts and suggestions but remained firmly in a supporting role. The developer was always in the driver’s seat, with AI handling discrete tasks.

2025 has ushered in a paradigm shift. The tools developers are now clamoring for don’t just assist—they act. Agentic IDEs represent a move from passive suggestion to autonomous execution. These aren’t just code assistants anymore; they’re environments where agents can reason about your entire codebase, modify multiple files, run terminal commands, execute tests, and iterate based on feedback—all while you work on something else entirely.

The distinction matters because it changes what developers are asking for. In 2023, developers wanted better autocomplete. In 2024, they wanted multi-file editing. In 2025, they want to delegate entire workflows to agents and have confidence in the results. The conversation has shifted from “help me write this function” to “build this feature while I review another PR.”

The marketplace has exploded with options. Market incumbents Cursor and GitHub Copilot still stand, but 2025 saw significant new entrants and evolutions:

Windsurf (now owned by Cognition, the makers of Devin) positions itself as the “next-generation agentic IDE” with its Cascade agent that indexes your entire codebase. More on this in my “Hot Vibe Code Summer” post.
ByteDance’s Trae, a free VS Code-based IDE, offers unlimited access to Claude and GPT-4o and has attracted developers with its Builder Mode for automated project generation.
AWS introduced Kiro this summer, an agentic IDE that differentiates itself through its dual spec-driven development and vibe coding modes.
IBM unveiled Project Bob at TechXchange 2025, an AI-first IDE that orchestrates multiple LLMs.
Augment Code became the first AI coding assistant to achieve ISO/IEC 42001 certification for AI management systems.
Zencoder released Zenflow, an orchestration layer that coordinates multiple AI agents through structured workflows.
JetBrains has brought agentic capabilities to its IDE family through Junie.
Google closed out the year by launching Antigravity, an agentic development platform that places autonomous agents at the center of the development process. The team that built it was augmented by people who joined through Google’s acquisition of some of the Windsurf assets.

Meanwhile, CLI-based options like Claude Code and OpenAI’s Codex have gained significant traction among terminal-first developers. Cline CLI, Continue.dev, Aider, and OpenDevin provide model-agnostic options for developers seeking privacy and cost control.

Today the agentic IDE market is crowded. This diversity means developers have more choices than ever, but it also means their demands have become more sophisticated.

The List

Here are 10 things developers want from their agentic IDEs in 2025:

1. Background Agents

The promise of “fire and forget” has captured the developer imagination. Developers want to queue up tasks, let agents work in the background or even overnight, and return to review completed pull requests. As Addy Osmani writes, “Imagine coming into work to find overnight AI PRs for all the refactoring tasks you queued up – ready for your review.”

Tools like GitHub Copilot’s coding agent, Cursor’s background agents, and Google Antigravity’s asynchronous task dispatch are responding to this demand. Simon Willison’s description of “embracing the parallel coding agent lifestyle” resonates with developers who want to supervise multiple AI “developers” working simultaneously rather than being tethered to a single synchronous assistant.

2. Persistent Memory

We expect memory to be a top concern into 2026. Memory is absolutely strategic. Developers are frustrated by agents that forget everything between sessions. Many are asking for agents that remember past decisions, recognize patterns from previous work, and maintain awareness of project history.

Claude Code’s memory features allow it to “remember your preferences across sessions, like style guidelines and common commands in your workflow,” while Windsurf’s Cascade includes Memories, a “system for sharing and persisting context across conversations.” Developers want their agentic IDE to become a living system of record that captures not just code but reasoning.

3. Predictable Pricing

The pricing turbulence and rug pulls of 2025 left developers frustrated. Cursor’s shift to usage-based pricing caught users off guard. Claude Code users reported restrictive limits inconsistent with their Max subscriptions. Replit’s effort-based pricing led to cost overruns and developer blowback when Agent 3 spawned subagents for even minor edits. Developers want to see exactly what they’re spending: token usage per prompt, cost per session, and clear limits before they’re hit. Kiro exemplifies emerging expectations by showing teams precisely how much each prompt costs, offering more granular usage data and an “Auto” mode that selects a model for each prompt based on a “combination of cost-effectiveness and Sonnet 4-level quality.” This kind of cost transparency is something developers increasingly demand as baseline functionality for their agentic IDEs.

Thank you! The costs have gone up significantly! Love Agent 3 but it’s getting very expensive for me

— William Geronco (@williamgeronco) September 16, 2025
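To make “token usage per prompt, cost per session, and clear limits before they’re hit” concrete, here is a minimal sketch of a session cost meter. The model names and prices are hypothetical placeholders, not any vendor’s actual rates, and this is not how Kiro or any other tool implements its accounting.

```python
from dataclasses import dataclass, field

# Hypothetical USD prices per million tokens. Real prices differ by vendor and model.
PRICE_PER_MILLION = {
    "small-model": {"input": 0.50, "output": 1.50},
    "large-model": {"input": 3.00, "output": 15.00},
}


def prompt_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one prompt under the hypothetical price table."""
    price = PRICE_PER_MILLION[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


@dataclass
class SessionMeter:
    """Tracks spend for one session against a fixed budget."""

    budget_usd: float
    costs: list[float] = field(default_factory=list)

    def record(self, model: str, input_tokens: int, output_tokens: int) -> float:
        """Record a completed prompt and return what it cost."""
        cost = prompt_cost(model, input_tokens, output_tokens)
        self.costs.append(cost)
        return cost

    @property
    def spent_usd(self) -> float:
        return sum(self.costs)

    def would_exceed(self, model: str, input_tokens: int, max_output_tokens: int) -> bool:
        """Check the worst-case cost of a prompt before sending it."""
        worst_case = prompt_cost(model, input_tokens, max_output_tokens)
        return self.spent_usd + worst_case > self.budget_usd
```

The useful part is `would_exceed`: the limit is checked before the prompt runs, using the maximum output length as a worst case, so the user hears about the limit before hitting it rather than after.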

4. MCP (Model Context Protocol)

The Model Context Protocol, introduced by Anthropic in 2024, became the fastest adopted standard that RedMonk has ever seen. It followed an immediate S-curve of adoption that recalls Docker’s rapid market saturation. By end of year it was donated to the Linux Foundation’s Agentic AI Foundation, and has become the expected standard for connecting agents to tools and data. For obvious reasons, tool vendors love it, and so do developers. Tools platforms became part of agent workflows overnight. With adoption by OpenAI, Google DeepMind, Microsoft, and AWS, today’s developers assume MCP will “just work.”

While the MCP roadmap boasts that this standard is “enabling entirely new categories of AI-powered applications,” currently developers appreciate its usefulness in terms of integrations. Devs want their agentic IDE to connect seamlessly to Google Drive, Slack, GitHub, databases, and internal systems via MCP without writing custom code for each. They expect an MCP registry for discovering servers, robust authentication and authorization, and the ability to switch out components without rebuilding their entire workflow.

5. Multi-Agent Orchestration

Running multiple agents in parallel is an advanced use case, but one that is becoming increasingly expected. Tools like Claude Squad, Conductor, and Verdent Deck let developers spawn multiple agents working on different tasks simultaneously, each in isolated environments. These elite users want dashboards showing which agents are working on what, the ability to pause, redirect, or terminate agents mid-task, and intelligent conflict resolution when agents work on overlapping code.

Cursor 2.0’s multi-agent orchestration and Antigravity’s parallel agent dispatch suggest where the market is heading, but multi-agent orchestration remains out of reach for less skilled developers. As Gergely Orosz observes, parallel agent work demands skills typically honed by experienced tech leads. This creates an accessibility gap:

So far, the only people I’ve heard are using parallel agents successfully are senior+ engineers.
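One small piece of the orchestration problem, spotting agents that are about to work on overlapping code, can be sketched in a few lines. The agent names and file paths below are hypothetical, and this is an illustration only, not how Claude Squad, Conductor, Verdent Deck, or Cursor handle conflicts.

```python
from itertools import combinations


def find_conflicts(claims: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return the files claimed by more than one agent, keyed by agent pair.

    `claims` maps an agent name to the set of file paths it plans to modify.
    """
    conflicts = {}
    for first, second in combinations(sorted(claims), 2):
        shared = claims[first] & claims[second]
        if shared:
            conflicts[(first, second)] = shared
    return conflicts


# Hypothetical plan: three agents, two of which want the same file.
plan = {
    "refactor-agent": {"src/auth.py", "src/session.py"},
    "test-agent": {"tests/test_auth.py"},
    "docs-agent": {"README.md", "src/auth.py"},
}
conflicts = find_conflicts(plan)
```

Here `conflicts` holds a single entry pairing `docs-agent` with `refactor-agent` over `src/auth.py`. A dashboard could surface that pair before either agent starts, which is cheaper than resolving a merge conflict afterward.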

6. Spec-Driven Development

The specification-based approach adopted by Kiro, Tessl, and GitHub Spec Kit has struck a chord with developers seeking structure. Rather than trusting agents to interpret vague prompts, spec-driven development uses requirements.md, design.md, and tasks.md files that serve as the source of truth for agent behavior by providing a contract that both humans and AI can reference. Claims that we are returning to the days of waterfall aside, developers want their intentions captured in durable artifacts that survive context window limits and session boundaries. They want agents that update specs as code evolves, flag when implementation diverges from design, and use specifications as checkpoints for verification.
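As a rough illustration of using a spec as a verification checkpoint, the sketch below reads a task list and reports what is still open. It assumes tasks.md uses GitHub-style checkboxes (`- [ ]` and `- [x]`), which is an assumption made for this example and not a documented format of Kiro, Tessl, or GitHub Spec Kit.

```python
import re

# Assumed task format: "- [ ] description" or "- [x] description".
TASK_LINE = re.compile(r"^\s*[-*] \[( |x|X)\] (.+)$")


def parse_tasks(markdown: str) -> list[tuple[str, bool]]:
    """Return (description, done) pairs for every checkbox line."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK_LINE.match(line)
        if match:
            done = match.group(1).lower() == "x"
            tasks.append((match.group(2).strip(), done))
    return tasks


def unfinished(markdown: str) -> list[str]:
    """List the tasks an agent has not yet marked complete."""
    return [description for description, done in parse_tasks(markdown) if not done]


# Hypothetical tasks.md content.
sample = """# Tasks
- [x] Add login endpoint
- [ ] Add rate limiting
- [ ] Write integration tests
"""
remaining = unfinished(sample)
```

A check like this could run before an agent opens a pull request, so that work only counts as finished when the spec says so.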

7. Reliability

The reliability complaints of 2025 have been loud. Although Claude boasts >99% uptime, Anthropic’s status page shows near-daily incidents. Windsurf users report latency and crashing during long-running agent sequences. Antigravity’s “model provider overload” errors frustrated early adopters. Developers don’t want impressive demos—they want tools that work consistently under production load. Flow state is sacred, and nothing destroys it faster than waiting for a sluggish response or recovering from a mid-task crash.

8. Human-in-the-Loop Controls

This is a little controversial, but many developers want human-in-the-loop controls to ensure that their agent won’t go off the rails. Sure, vibe coders can’t be bothered, but professional AI developers want fine-grained permissions for what agents can and cannot do autonomously. They want approval gates before destructive actions (rm -rf, database writes, deployments), configurable autonomy levels per task type, and clear audit trails of every agent action. Closely related to human-in-the-loop controls is the emerging discipline of AI agent evals: tests specifically designed to evaluate an autonomous agent’s performance and safety. While parts can be automated, ultimately these assessments must be performed by humans with domain expertise.

The core of this debate is friction versus safety. One illustrative example of what retaining a human-in-the-loop looks like today is Microsoft and Red Hat’s approach to MCP. To appeal to enterprise users, they institute security that requires least-privilege permissions and surfaces all sensitive operations to the user. This least-privilege model is a direct countermeasure to the potential security disaster of a fully autonomous agent.
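The approval gates, least-privilege defaults, and audit trails described above can be sketched as a small policy check. The program lists are hypothetical examples, and this is a simplified illustration, not Microsoft’s, Red Hat’s, or any IDE’s actual permission model. It only decides whether a command may run; it does not execute anything.

```python
import shlex
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


# Hypothetical policy. Anything not listed is denied (least privilege).
ALLOWED_PROGRAMS = {"ls", "cat", "git", "pytest"}
NEEDS_APPROVAL = {"rm", "psql", "kubectl"}
GIT_NEEDS_APPROVAL = {"push", "reset"}
SHELL_METACHARACTERS = set(";&|`$<>")


def classify(command: str) -> Decision:
    """Decide whether a command runs freely, needs a human, or is refused."""
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return Decision.DENY
    if not argv:
        return Decision.DENY
    # Chained or redirected commands are too ambiguous to auto-approve.
    if SHELL_METACHARACTERS & set(command):
        return Decision.ASK
    program = argv[0]
    if program in NEEDS_APPROVAL:
        return Decision.ASK
    if program == "git" and len(argv) > 1 and argv[1] in GIT_NEEDS_APPROVAL:
        return Decision.ASK
    if program in ALLOWED_PROGRAMS:
        return Decision.ALLOW
    return Decision.DENY


def run_with_gate(command: str, approve, audit_log: list) -> bool:
    """Return True if the command may run, recording every decision."""
    decision = classify(command)
    approved = decision is Decision.ALLOW or (
        decision is Decision.ASK and bool(approve(command))
    )
    audit_log.append(
        {"command": command, "decision": decision.value, "approved": approved}
    )
    return approved
```

The `approve` callback is where the human comes in: in a real tool it would be a prompt in the IDE, and in a test it can be a function that always says no. The friction-versus-safety tradeoff lives in how many commands land in the ASK bucket.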

9. Rollbacks

When agents can modify hundreds of files autonomously, the ability to undo becomes critical. As Denis Volkhonskiy, SWE Agent Advocate at Nebius Academy, explains: “In most cases, it is better to roll back: this way you save tokens and have better output with fewer hallucinations.” Volkhonskiy is speaking specifically about Claude Code’s checkpoint system, which automatically saves code state before each change and enables instant rewind via the /rewind command, and which has become a template for what developers expect. The demand for checkpoints (see Kiro, Augment Code, Zencoder) and rollbacks is one we hear often for agentic IDEs. Developers want confidence that they can pursue ambitious, wide-scale tasks knowing they can always return to a known-good state. This isn’t just about git. Checkpoints need to capture conversation context, tool outputs, and intermediate states that traditional version control doesn’t track.
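To show why this goes beyond git, here is a toy sketch of a checkpoint that snapshots conversation context alongside file contents before each edit. All class and method names are hypothetical, everything is held in memory, and this is not how Claude Code or any other tool implements checkpoints.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    label: str
    files: dict
    conversation: list


class Session:
    """In-memory stand-in for an agent session (files plus chat history)."""

    def __init__(self):
        self.files = {}         # path -> file content
        self.conversation = []  # messages and tool outputs, in order
        self._checkpoints = []

    def checkpoint(self, label: str) -> None:
        """Snapshot both code state and conversation state."""
        self._checkpoints.append(
            Checkpoint(label, copy.deepcopy(self.files), copy.deepcopy(self.conversation))
        )

    def apply_edit(self, path: str, content: str, note: str) -> None:
        """Save a checkpoint automatically before every change."""
        self.checkpoint(f"before edit to {path}")
        self.files[path] = content
        self.conversation.append(note)

    def rewind(self, steps: int = 1) -> str:
        """Restore the state saved `steps` edits ago and return its label."""
        if not 1 <= steps <= len(self._checkpoints):
            raise ValueError("no checkpoint that far back")
        target = self._checkpoints[-steps]
        del self._checkpoints[-steps:]
        self.files = copy.deepcopy(target.files)
        self.conversation = copy.deepcopy(target.conversation)
        return target.label
```

Rewinding restores the conversation as well as the files, so the agent resumes from what it knew at that point rather than carrying forward context from the abandoned attempt.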

10. Skills

Simon Willison argues that “Claude Skills are awesome, maybe a bigger deal than MCP,” and you’ll get no argument from RedMonk. Developers want the ability to package reusable workflows into shareable, version-controlled modules. Rather than re-engineering the same prompts repeatedly, developers want to define a code review or security audit skill once and invoke it consistently across projects and teams.

According to a study Anthropic conducted of its own engineers, it is the “repetitive or boring” tasks that engineers are most likely to automate:

“In our survey, on average people said that 44% of Claude-assisted work consisted of tasks they wouldn’t have enjoyed doing themselves.”

Skills are catching on more broadly with OpenAI Codex CLI’s support for skills.md. Developers want their institutional knowledge encoded in skills that new team members can adopt immediately, that enforce consistent standards, and that improve over time. The model of prompts for exploration, skills for repetition, is becoming standard practice.

Looking Ahead to 2026

As I look toward 2026, I’m struck by how the role of the developer continues to evolve. The skills that matter are shifting toward architecture, system design, prompt engineering, and quality judgment. Although we aren’t there yet, successful developers in the coming years will effectively delegate, review, and guide multiple AI agents working in concert.

The companies building these tools would do well to remember what I argued in my 2023 entry to this series, as it remains true today: agentic IDEs exist to serve developers, not replace them. Their entire value proposition depends on making developers more effective, which means these tools must adapt to how developers actually work rather than forcing developers to adapt to them. Breathless predictions of engineering teams rendered obsolete have not materialized, and they won’t. What we’re witnessing instead is an augmentation of developer capabilities, with AI handling more of the mechanical work while humans retain responsibility for judgment, design, and quality.

This year’s 10 things developers want from their agentic IDEs should shape how vendors approach this market. Developer tools live or die by practitioner adoption, and developers are notoriously unforgiving of tools that waste their time, break their flow, or fail to deliver on promises. The companies that treat developer feedback as optional, or that prioritize flashy demos over day-to-day reliability, will find themselves abandoned for competitors who listen.

Disclaimer: AWS, Microsoft/GitHub, IBM/Red Hat, Tessl, and Google are RedMonk clients.
