What's New

What changed across AI agents this week.

Model releases with vendor-verified dates. Platform pricing and verdict changes from our daily audit. New editorial coverage. One feed, updated as it happens.

Weekly digest

Get the digest of what changed

Model releases, platform updates, and pricing shifts — summarised weekly. No spam, unsubscribe in one click.

This week

17 changes in the last 7 days.

New article · 2026-06-23

Claude Code vs OpenAI Codex (2026): Which Coding Agent Wins

Claude Code or OpenAI Codex in 2026? Pricing, ChatGPT bundling, agent autonomy, IDE integration, and the specific tasks where each one wins after side-by-side testing.

New article · 2026-06-23

Claude Opus vs Sonnet 4.6 (2026): When Each Model Actually Wins

Claude Opus 4.8 costs 1.6x more than Sonnet 4.6 but isn't always 1.6x better. The 70/20/10 routing pattern, specific task types where each wins, and the cost math at production volume.

Platform update · 2026-06-22

Hermes — pricing

Free and open-source under MIT. You pay only for model API tokens (200+ models accessible through its marketplace integration — Claude, GPT, Gemini, DeepSeek, Kimi, GLM, local models) plus your own hosting. Hosting on a $5-$20/month VPS ha…

Platform update · 2026-06-22

Hermes — verdict

The most technically sophisticated open-source agent harness in 2026. Server-deployed, model-agnostic, and the only platform with a genuine self-improvement loop that compounds over months of use. Right pick when you have technical capacit…

New article · 2026-06-22

Claude Pro vs Max vs API: When Each One Actually Wins

Claude Pro at $20, Max at $100, or pay-per-token API — which one is right for your usage? Cost math at three usage tiers, the crossover points, and the hybrid pattern most serious users run.

New article · 2026-06-22

Hermes vs Aider: when each one is the right pick

Hermes and Aider both run open-source and model-agnostic, but they answer different questions. The decision rule, the cost math, and when each is the right pick.

New article · 2026-06-22

Hermes vs Cline: which open-source agent fits your workflow

Hermes and Cline are open-source and model-agnostic but built for different work. The honest decision rule, cost math, and when each is the right pick.

New article · 2026-06-22

Hermes vs OpenHands: the open-source agent comparison that matters

Hermes and OpenHands are both top-tier open-source agents — but they're built for different jobs. Decision rule, cost math, and when each is the right pick.

New article · 2026-06-22

How to Avoid Hitting Claude Usage Limits (2026 Builder's Guide)

The patterns that let serious builders use Claude 3-5x more without hitting limits. Prompt caching, model routing, subagents, hooks, and when to move work to the API.

Platform update · 2026-06-19

Windsurf — pricing

Free tier with meaningful daily usage — generous enough for individual evaluation, no credit card required. Pro at $15/month unlocks unlimited Supercomplete and higher Cascade usage. Teams at ~$30/user/month adds collaboration and admin co…

Platform update · 2026-06-19

Windsurf — verdict

The fastest AI IDE for developers who want autonomous execution baked into the editor. Cascade and Flows handle multi-file refactors and goal-driven work end-to-end. OpenAI-acquired and actively developed. The right pick when speed and IDE…

New article · 2026-06-17

Evals as PRDs: How AI Teams Are Replacing Specs With Tests

Evals are becoming the new PRD for AI agent development — quantifiable tests that define 'done' for coding agents. The frameworks, the patterns, and the failure modes.

New article · 2026-06-17

How to Create an AI Agent: A Tested Builder's Guide (2026)

How to create an AI agent in 2026: four real paths from no-code to fully custom, with the platform we'd pick for each, time to first agent, and what it actually costs.

New article · 2026-06-17

Loop Engineering: How to Design Self-Prompting AI Agents

Loop engineering: the four patterns that turn AI agents from manual chat tools into autonomous systems. Heartbeats, crons, hooks, and goals, with the platforms that ship each.

New article · 2026-06-17

Multi-Agent AI: When to Use It, When to Skip It, What Actually Works

Multi-agent AI compared honestly: the three patterns that work in production, the four that don't, and the cost math that decides which is right.

Platform update · 2026-06-16

Kilo Code — pricing

Free tier (Kilo Auto — no credit card, no token limit on the free model tier). Paid plans run $20–$50/month for higher-tier model access. BYOK is the recommended path for serious use: connect your own API key to any of 500+ models via Kilo…

Platform update · 2026-06-16

Kilo Code — verdict

The best-priced agentic coding tool for developers who need JetBrains support or want to switch models mid-session. Apache 2.0, BYOK, multi-IDE. The right pick when Cursor's editor lock-in and Claude Code's model lock-in both bother you.

Earlier (last 60 days)

23 prior changes.

Model release · 2026-06-13

GLM 5.2 released

z.ai · $0.98/$3.08 per 1M tokens · 1M context · balanced tier

Model release · 2026-06-12

Kimi K2.7 Code released

Moonshot AI · $0.612/$3.069 per 1M tokens · 262k context · value tier

New article · 2026-06-11

The best AI coding agents in 2026

The 12 AI coding agents builders deploy in 2026: Claude Code, Cursor, Cline, Windsurf, Aider, Amp, and more. Honest verdicts on which to pick for which work.

New article · 2026-06-11

The best AI voice agents in 2026

The four voice AI platforms builders deploy in 2026 — Retell, Vapi, Bland, ElevenLabs. Pricing, latency, languages, and which to pick for which use case.

New article · 2026-06-11

The best no-code AI agent builders in 2026

The no-code AI agent platforms non-technical builders ship with in 2026 — Lindy, Relevance AI, Stack AI, Manus. Pricing, capabilities, which to pick.

Model release · 2026-06-09

Claude Fable 5 released

Anthropic · $10/$50 per 1M tokens · 1M context · frontier tier

Model release · 2026-05-28

Claude Opus 4.8 released

Anthropic · $5/$25 per 1M tokens · 1M context · frontier tier

Model release · 2026-05-28

Claude Opus 4.8 Fast released

Anthropic · $10/$50 per 1M tokens · 1M context · frontier tier

Model release · 2026-05-20

Grok Build 0.1 released

xAI · $1/$2 per 1M tokens · 256k context · value tier

Model release · 2026-05-19

Gemini 3.5 Flash released

Google · $1.5/$9 per 1M tokens · 1M context · balanced tier

New article · 2026-05-17

The ARR framework: which tasks should you actually give to an AI agent?

A short mental model for deciding which tasks belong with an AI agent and which don't. Three letters. Autonomous, Recurring, Reviewable. Skip the rest.

New article · 2026-05-17

Director vs doer: the mindset shift that separates working AI agents from broken ones

Stop prompting. Start directing. The mindset change builders need to make once they move from chatbots to agents, and the practices that come with it.

New article · 2026-05-17

Hermes vs Cursor: a comparison nobody else makes, and why it matters

Hermes and Cursor get compared by people who don't know they're different categories. What each does, why the question matters, and which to pick.

New article · 2026-05-17

How much does it cost to build an AI agent in 2026?

AI agent development costs in 2026: no-code ($30–$300/mo), low-code ($50–$300/mo), custom builds ($2k–$50k first month). Cost-per-task, hidden items.

New article · 2026-05-17

The lethal trifecta: the AI agent security trap nobody warns you about

Three capabilities safe alone, catastrophic combined: private data, internet access, untrusted input. How the AI agent security trap works and how to break it.

New article · 2026-05-03

Self-hosted AI is bigger than you think

Three of the top productivity AI tools by usage are self-hosted and open source. That's not the narrative. Here's the usage data behind it.

New article · 2026-05-01

Why two open-source agents quietly own the productivity category in 2026

Two open-source agents own 95% of productivity tokens on the public model-marketplace leaderboards. Here's why the market concentrated so fast, and what it means for builders.

Model release · 2026-04-30

Grok 4.3 released

xAI · $1.25/$2.5 per 1M tokens · 1M context · frontier tier

New article · 2026-04-29

OpenClaw vs Hermes: which open-source agent should you self-host in 2026?

OpenClaw vs Hermes head to head: the two open-source agents builders actually run. Trade-offs that matter and the usage data behind the choice.

New article · 2026-04-27

Skills vs MCP servers vs subagents: the architectural map for builders

Five concepts in Claude Code overlap with each other. Most explainers stop at definitions. Here's when to actually use which.

Model release · 2026-04-24

GPT-5.5 released

OpenAI · $5/$30 per 1M tokens · 400k context · frontier tier

Model release · 2026-04-24

DeepSeek V4 Flash released

DeepSeek · $0.09/$0.18 per 1M tokens · 128k context · value tier

New article · 2026-04-24

What Claude Skills actually are (and why most people are getting them wrong)

Most builders think Claude Skills are saved prompts. The architecture is different, and it's the reason Skills are actually useful.

How this feed works

Three sources feed this page, all automated. Model releases come from our daily pricing audit cross-referenced with vendor announcement pages. Platform updates come from our platform-change detector, which compares today's review against yesterday's and surfaces changes to pricing or verdict. New articles come from this site's editorial cluster.

Nothing here is hand-curated — the same daily-audit infrastructure that runs the open pricing dataset and the platform reviews also publishes this feed. The point: a single trusted place where you can see what changed without hunting across 27 vendor pages.
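The platform-change detector described above can be pictured as a field-by-field comparison of two daily snapshots. The sketch below is only an illustration of that idea, not this site's actual code: the snapshot shape (a dict keyed by platform name), the `Change` record, the `detect_changes` function, and the sample platform "ExampleAgent" with its values are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One surfaced difference between two daily review snapshots (hypothetical shape)."""
    platform: str
    field: str
    before: str
    after: str


# The feed description names exactly two watched fields: pricing and verdict.
WATCHED_FIELDS = ("pricing", "verdict")


def detect_changes(yesterday: dict, today: dict) -> list[Change]:
    """Return one Change per watched field whose text differs between snapshots.

    A platform missing from yesterday's snapshot is compared against empty
    text, so its non-empty fields are reported as changes. That is a choice
    made for this sketch, not something the source specifies.
    """
    changes = []
    for platform in sorted(today):
        old = yesterday.get(platform, {})
        new = today[platform]
        for field in WATCHED_FIELDS:
            before = old.get(field, "")
            after = new.get(field, "")
            # Ignore differences that are only leading or trailing whitespace.
            if before.strip() != after.strip():
                changes.append(Change(platform, field, before, after))
    return changes


# Made-up sample data for a hypothetical platform.
yesterday = {"ExampleAgent": {"pricing": "Pro at $10/month", "verdict": "Solid pick"}}
today = {"ExampleAgent": {"pricing": "Pro at $15/month", "verdict": "Solid pick"}}

for change in detect_changes(yesterday, today):
    print(f"Platform update: {change.platform}, {change.field} changed")
```

Comparing only the watched fields keeps cosmetic edits elsewhere in a review from generating feed entries, which matches the stated goal of surfacing pricing and verdict changes only.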

