AI Slop & the Vulnerability Treadmill

It has not been a relaxing few months for software security teams.

In December, React disclosed its first critical CVE: an unauthenticated remote code execution flaw in Server Components. In March, Aqua Security’s Trivy, a widely used security scanning tool, was compromised twice in three weeks through a GitHub Actions misconfiguration. That same month, attackers compromised a maintainer account for the Axios npm package and published backdoored versions containing a cross-platform remote access trojan that silently exfiltrated credentials. In April, Vercel disclosed a security incident that originated with Context AI, a third-party AI tool used by an employee; the compromise gave attackers access to customer environment variables.

Many in the vulnerability space lay the blame for these cascading incidents squarely on AI—and they’re not wrong, though the story is more complicated than “AI bad.” AI-generated code is letting vulnerabilities slip into production at an alarming rate. Researchers at Georgia Tech’s Vibe Security Radar tracked CVEs directly attributable to AI coding tools and found that March 2026 alone produced more than all of 2025 combined. AI tools are also allowing bad actors to infiltrate systems in new and creative ways. The Axios npm compromise wasn’t a brute-force attack—it was “AI-enabled social engineering.” AI allows attackers to mount more elaborate and convincing campaigns against open source maintainers, while simultaneously flooding the ecosystem with code that is outpacing security teams’ ability to keep up.

In my previous post on the AI Slopageddon I covered the contribution quality crisis. AI-generated pull requests are overwhelming maintainers. The social contract between contributors and projects is breaking down. And AI platforms are making it worse.

This piece is a companion. It’s about the state of supply chain security in the age of AI. We look at slop vulnerability reports, a CVE database that may already be too slow to matter, and a software supply chain with security defaults designed for a world that no longer exists.

Meh-trics

_{Rachel Stephens’s laptop boasts a “Meh-trics” sticker.}

At RedMonk, we talk about metrics a lot, and often with suspicion. We’ve watched an entire ecosystem spring up around the monetization of perceived value in software—GitHub stars, contribution graphs, download counts, bounty payouts—proxy after proxy, each one promising to measure something real about the health and quality of a project, and each one an invitation to be gamed. Gaming isn’t new (Goodhart’s Law, amirite?). What’s new is the low cost of doing it well. AI has made it trivially easy to game software industry reward systems without delivering the outcomes they were designed for.

AI has collapsed the effort required to manufacture every signal of credibility the software ecosystem depends on, and the consequences are rippling through every incentive structure the community has built. Bug bounty programs are drowning in AI-generated reports that cost pennies in tokens to produce but hours of expert time to debunk. Salaried internship programs like the Linux Foundation’s Mentorship Program, Google Summer of Code, and Outreachy are grappling with a murkier version of the same problem: maintainers I’ve spoken with are struggling to determine whether participants are actually doing the work or using AI to get their foot in the door without intending to meaningfully contribute afterwards. Even the cachet of being an open-source contributor, formerly a reliable signal on a junior developer’s resume, is losing its value as the cost of faking it approaches zero. AI is hollowing out, from the inside, the systems that once rewarded genuine effort.

What happens when the signals on which we’ve built our ecosystem stop meaning what we thought they meant, and what incentive structures might we build instead? To answer these questions, I looked at the vulnerability space, because that’s where the perverse incentives cut deepest and the money flows most visibly.

The Black, White, and Gray Markets for CVEs

Every vulnerability has a price. The question is who is paying, and for what?

There has always been a black market for exploits, so a white market emerged to balance it out (shout out to Bryan Boreham, Distinguished Engineer at Grafana Labs, for framing it this way in our recent conversation at Monki Gras). The result is a sophisticated, tiered global economy that is willing to pay for exploits.

At the bottom are the outright black market forums where weaponized exploits and stolen credentials trade hands with no pretense of legality. A threat actor claiming affiliation with ShinyHunters posted Vercel’s internal database on BreachForums at an asking price of $2 million; it included customer API keys, environment variables, and portions of source code. When suspected North Korean hackers from UNC1069 compromised a maintainer account for Axios and shipped a remote access trojan to every developer who ran npm install during a three-hour window, that was a state-sponsored actor treating the open-source supply chain as an ATM for the North Korean missile program.

Above these sit the gray market brokers that purportedly sell to governments and intelligence agencies. Crowdfense offers “rewards ranging from USD 10,000 to USD 7 million for full exploit chains or previously unreported capabilities.” A UAE-based startup called Advanced Security Solutions launched in 2025 offering $20 million for “tools that can hack any smartphone.” Zerodium, arguably the most notorious broker in the space, spent years publishing a public price list for zero-days (up to $2.5 million for a full-chain iOS exploit) before quietly going dark in early 2025.

The bug bounty, by contrast, is the white market, or maybe more accurately, the counter-market. This time-tested resource at many companies exists to outbid the alternative. The entire premise is that if you pay researchers enough to disclose responsibly, they won’t sell to Crowdfense or post on BreachForums instead. Last year, HackerOne reported that 83% of surveyed organizations now use bug bounties, with total payouts reaching $81 million across all programs.

But bug bounties have become unsustainable at the exact moment they’re most needed and even, in some cases, legally required. Bug bounty programs are being hit hard by AI output of both kinds, slop and non-slop. To begin with, AI tools are finding all kinds of bugs.

The Arms Race

Security people love to call things an “arms race,” but in the case of AI the metaphor unfortunately fits. AI is the best weapon in both the attackers’ and the defenders’ arsenals, and the balance of power shifts depending entirely on who’s wielding it and whether anyone is paying them.

On the attack side: the cost of generating exploit code from a published CVE has collapsed just as thoroughly as the cost of generating a junk vulnerability report. The prt-scan campaign in March and April 2026 spent six weeks opening hundreds of pull requests against repositories with pull_request_target misconfigurations, rotating through throwaway accounts and using AI-generated, language-appropriate diffs to look like plausible contributions.
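For readers unfamiliar with this class of misconfiguration, here is a minimal sketch of a heuristic check. It is not a reconstruction of the prt-scan tooling. It assumes the risky pattern is a workflow that triggers on pull_request_target and also checks out the pull request’s head commit or branch, which can run untrusted code with the base repository’s permissions. The function name and both sample workflows are hypothetical, and the text matching is deliberately crude; a real audit would parse the YAML.

```python
import re

# Expressions that point a checkout step at the untrusted pull request head.
PR_HEAD_REF = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")


def looks_risky(workflow_text: str) -> bool:
    """Heuristic: True when a workflow uses the pull_request_target trigger
    and also references the pull request's head commit or branch."""
    uses_target_trigger = re.search(r"\bpull_request_target\b", workflow_text) is not None
    checks_out_pr_head = PR_HEAD_REF.search(workflow_text) is not None
    return uses_target_trigger and checks_out_pr_head


# Hypothetical workflow: privileged trigger plus checkout of untrusted code.
RISKY_WORKFLOW = """
on: pull_request_target
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - run: npm test
"""

# Hypothetical workflow: unprivileged trigger, default checkout.
SAFER_WORKFLOW = """
on: pull_request
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm test
"""

if __name__ == "__main__":
    print(looks_risky(RISKY_WORKFLOW))
    print(looks_risky(SAFER_WORKFLOW))
```

A check like this only surfaces candidates for human review. It will miss indirect references and will flag workflows that handle the head ref safely.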

On the defense side: AISLE reported that their AI system discovered all twelve zero-day vulnerabilities announced in OpenSSL’s January 2026 security release, including bugs that had lurked undetected for 25 to 27 years, one of which predated OpenSSL itself, inherited from its 1990s predecessor SSLeay. Meanwhile, Anthropic’s Claude Opus 4.6 found over 500 high-severity zero-day vulnerabilities in well-tested open-source codebases without specialized tooling. These projects had fuzzers running against them for years, but the model found what the fuzzers missed. According to Stanislav Fort, AISLE’s founder and Chief Scientist:

AI is simultaneously collapsing the median (“slop”) and raising the ceiling (real zero-days in critical infrastructure).

Google, for its part, is betting that the answer is platform-level defense at machine speed. At Next ’26, the company unveiled what it’s calling “Agentic Defense”: a cybersecurity platform that merges Google’s Threat Intelligence and Security Operations with Wiz, the cloud security firm it acquired earlier this year. The headline numbers are sobering context for the investment: Google’s own M-Trends 2026 report found that the time between an initial intrusion and the handoff to a secondary threat actor has collapsed from eight hours to 22 seconds over the past three years (p. 55). At that velocity, human-speed triage is a rounding error.

Perhaps most relevant to the supply chain story, Wiz introduced an AI-Bill of Materials (AI-BOM) that automatically inventories every AI framework, model, and IDE extension across an environment: a direct response to the shadow AI problem that made the Vercel breach possible, where a single employee’s use of an unsanctioned AI tool cascaded into a platform-wide compromise.

But the strategic question isn’t whether AI is good or bad for security. It’s the incentive structure. Currently, the incentives reward finding over fixing.

Project Glasswing and the Bounty Crisis

When Anthropic announced Project Glasswing in April, they framed it as a response to a supply chain security crisis that had already arrived. Anthropic committed $100 million in usage credits for its Claude Mythos Preview model, along with $4 million in direct donations to open-source security organizations, explicitly to put frontier AI vulnerability detection into the hands of maintainers who otherwise couldn’t afford it. According to Jim Zemlin, CEO of the Linux Foundation:

I am optimistic but the urgency is real. We are in the most dangerous period, the transition when attackers might gain a significant advantage as the technology ecosystem digests the impact of AI.

Assuming we don’t write off Mythos as an elaborate marketing stunt, the subtext of Project Glasswing is that the white market for vulnerabilities—bug bounties, responsible disclosure, coordinated patching—will lose the economic argument without external incentivization. And the evidence for that subtext is already here.

cURL’s bug bounty program, running since 2019, found 87 confirmed vulnerabilities and paid out over $100,000. It worked until AI collapsed the cost of submitting garbage while leaving the cost of evaluating it unchanged. Daniel Stenberg, founder and lead developer of cURL, was forced to kill the program this January because his team was spending more time debunking AI-generated reports than writing code.

When I interviewed Viktor Petersson, co-founder of Screenly, he described the same dynamic from the other side. His company has received 331 vulnerability reports since launching its program less than six months ago. Thirty-nine were confirmed vulnerabilities. A huge proportion were duplicates. The volume got heavy enough that they had to build custom internal triage tooling and distribute review across multiple teams because no single person could keep up. After debating whether to kill the program, they decided to keep it (“it’s essentially an ongoing free pen test”), but Viktor was blunt about what he’s hearing from peers: more programs are going to be shut down. The teams simply can’t keep up.

The Node.js project tried to address this by imposing a minimum HackerOne Signal score to submit reports, eventually requiring new researchers to show up in the OpenJS Foundation Slack and talk to a human. It’s a reasonable filter, but every gate you build to keep slop out also risks keeping legitimate newcomers away. There is no clean solution here, only trade-offs.

Here is the core economic dysfunction: generating a plausible-sounding vulnerability report now costs pennies in tokens. Evaluating whether it’s real still costs an hour of expert time reproducing the steps, which are probably garbage, but which you can’t write off because what if.
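To make the asymmetry concrete, here is a back-of-the-envelope sketch using the Screenly figures quoted above (331 reports, 39 confirmed). The one-hour review time comes from the estimate in this section. The per-report token cost and the reviewer’s hourly rate are placeholder assumptions, not reported figures, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TriageLoad:
    reports: int             # total reports received
    confirmed: int           # reports that turned out to be real
    hours_per_report: float  # expert time to evaluate one report

    @property
    def confirmation_rate(self) -> float:
        return self.confirmed / self.reports

    @property
    def review_hours(self) -> float:
        return self.reports * self.hours_per_report

    @property
    def hours_per_confirmed(self) -> float:
        return self.review_hours / self.confirmed


def cost_ratio(token_cost_per_report: float, hourly_rate: float,
               hours_per_report: float) -> float:
    """How many times more it costs to assess a report than to generate it."""
    return (hourly_rate * hours_per_report) / token_cost_per_report


if __name__ == "__main__":
    load = TriageLoad(reports=331, confirmed=39, hours_per_report=1.0)
    print(f"confirmation rate: {load.confirmation_rate:.1%}")
    print(f"review hours: {load.review_hours:.0f}")
    print(f"hours per confirmed finding: {load.hours_per_confirmed:.1f}")

    # ASSUMPTIONS: $0.05 of tokens per report, $100/hour for the reviewer.
    print(f"assess-to-generate cost ratio: {cost_ratio(0.05, 100.0, 1.0):,.0f}x")
```

At one hour per report, roughly 12% of reports are confirmed and each confirmed finding costs about eight and a half hours of review. The exact ratio matters less than its order of magnitude: any plausible choice of placeholder values leaves assessment thousands of times more expensive than generation.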

You Can’t Just Shut It Down

_{Grace Hopper’s “First actual case of bug being found”}

Bug bounty programs are becoming unsustainable at the exact moment that reporting vulnerabilities is becoming legally required.

The EU Cyber Resilience Act takes effect in stages, and the first enforced milestone hits September 11, 2026. From that date, all manufacturers of products with digital elements sold into the EU must report actively exploited vulnerabilities to ENISA within 24 hours. You need a vulnerability disclosure program. You need SBOMs. You need continuous monitoring. You need all of this even for legacy products already on the market.

Stenberg could pull the plug on cURL’s bounty because it is a volunteer-maintained project with no fiduciary obligation to a regulator. A company selling software into the EU will not have that luxury come September. They’ll be legally obligated to maintain exactly the kind of intake mechanism that’s currently being firehosed with AI slop, and the penalty for non-compliance can run up to €15 million or 2.5% of global annual revenue, whichever is higher.

And the CVE database itself? Roughly 50,000 CVEs were published in 2025, up 22% from 2024. Yet some security teams no longer rely on CVE lists and the feeds built on them (the NIST CVE database, GitHub’s archived CVE records, vendor-specific security advisories, OSV.dev, OpenCVE, VulnDB, the CISA KEV Catalog), owing in part to AI acceleration. By the time a CVE is published, they assume it is already being exploited in the wild: all an attacker needs to do is point an LLM at the advisory to write an exploit.

Some companies have responded by scanning package repositories like PyPI and npm in near real-time, using pattern analysis on code commits to detect compromises before a CVE is even assigned.
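As a rough illustration of what such pattern analysis might look like, and not a description of any vendor’s actual pipeline, the sketch below compares the package.json manifests of two consecutive releases and flags npm lifecycle scripts (preinstall, install, postinstall) that were added or changed. These scripts run automatically during npm install, which makes a newly introduced one worth a closer look. The function name and sample manifests are hypothetical.

```python
import json

# npm lifecycle scripts that run automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def changed_install_hooks(old_manifest: str, new_manifest: str) -> dict[str, str]:
    """Return install-time scripts that are new or modified in the newer
    package.json, mapped to their new command strings."""
    old_scripts = json.loads(old_manifest).get("scripts", {})
    new_scripts = json.loads(new_manifest).get("scripts", {})
    return {
        hook: new_scripts[hook]
        for hook in INSTALL_HOOKS
        if hook in new_scripts and new_scripts[hook] != old_scripts.get(hook)
    }


# Hypothetical manifests for two consecutive releases of an example package.
PREVIOUS = json.dumps({
    "name": "example-lib",
    "version": "1.4.0",
    "scripts": {"test": "node test.js"},
})
CURRENT = json.dumps({
    "name": "example-lib",
    "version": "1.4.1",
    "scripts": {"test": "node test.js", "postinstall": "node setup.js"},
})

if __name__ == "__main__":
    for hook, command in changed_install_hooks(PREVIOUS, CURRENT).items():
        print(f"review needed: {hook} -> {command}")
```

A single signal like this is noisy on its own, since plenty of legitimate packages use install scripts. The point is that it can be computed the moment a release is published, without waiting for a CVE.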

The implications for the VEX (Vulnerability Exploitability eXchange) and Common Security Advisory Framework (CSAF) infrastructure are uncomfortable. If the CVE is the lagging indicator everyone assumes it is, then the entire system of databases, scanners, and compliance checklists built on top of it is measuring yesterday’s weather. And the CRA is about to mandate that companies build their security programs around exactly these lagging indicators. That’s a perverse incentive of a different sort: regulatory compliance that optimizes for the appearance of security rather than its substance.

Paying for the Fix

The gap between “finding” and “fixing” is where money actually needs to flow, and a few models are trying to make that happen.

Sonar’s Tidelift pays maintainers directly to implement enterprise-grade secure development practices. Their 2024 survey data found that paid maintainers are 55% more likely to implement critical security and maintenance practices than unpaid ones. The mechanism isn’t complicated: pay people to do the work and more of the work gets done. Less clear is how “pay the maintainers” is going to intersect with “now my open source project has a fiduciary obligation under the CRA.”

Germany’s Sovereign Tech Agency has invested over €23 million in 60-plus open-source projects since 2022. Its Resilience program takes a notably different approach than traditional bounties: it reduces technical debt first through engineering contributions, then runs bug bounties, and crucially pays bounties to the maintainers who resolve these reported issues. The design was informed by Dr. Ryan Ellis’s research at Northeastern University, which found that bounties can actually undermine security for under-maintained projects by drawing attention and creating financial burdens the project can’t absorb. A 2025 feasibility study proposes a pan-European Sovereign Tech Fund with a minimum budget of €350 million building on Germany’s model.

Although these models exist and work, they cover only a fraction of the ecosystem, while the CRA is about to mandate vulnerability programs for everyone selling into the EU. Who pays for the assessment burden when the report-to-assessment cost ratio is this broken?

The Treadmill

Here’s where we are. Reports are cheap. Assessments are expensive. Bug bounty programs are simultaneously being killed and legally mandated. The CVE database may already be too slow to matter. The supply chain has consolidated around a platform whose defaults were designed for a different era. And AI is the best tool on both sides of the fight, with the outcome determined not by the technology but by whether anyone has the budget and the organizational will to use it for defense rather than noise generation.

The answer probably starts with flipping the ratio: making assessment as cheap as generation, paying for fixes instead of just finds, and treating supply chain security as the board-level priority it has been pretending to be. Until then, we’re on a treadmill that’s speeding up without an emergency stop button.

Disclaimer: GitHub/Microsoft and Google are RedMonk clients.
