AgentCrush Weekly Digest — W21, May 18–24, 2026

Published May 24, 2026

Week 21 was an infrastructure week. The index now runs four category rankings side by side — model families, tokenized agents, service agents, and developer agents — covering 1,338 indexed agents, of which 138 are evidence-ranked (7 model families, 16 tokenized, 28 service, 87 developer). The shift this week wasn't a leaderboard shake-up; it was making each ranking explain itself.

Two things landed. First, the Agent Payments Stack index went live — a neutral six-layer map of who actually covers what in agent payments, from settlement to application. Coinbase and Stripe tie at five of six layers; Circle sits at four. Second, we shipped a confidence tier on scores: every ranked agent now carries a signal-coverage grade (high / medium / low / provisional), so a score built on five signals reads differently from one built on three. The principle is simple — a number without its sample size is a guess in a suit.

Where the rankings stand

Rank  Model family  Score
1     Qwen          83
2     Gemini        82
3     Mistral AI    76
4     DeepSeek      75
5     Llama         70
6     Cohere        55
7     Hermes        34

Standings as of May 24, 2026. Every figure is live at the public /api/rankings/*/llm-summary endpoints. Scores shift as upstream signals (HuggingFace, LMArena, on-chain) refresh.

Signal highlights

Multi-signal scoring inverts single-source rankings. Qwen leads the model-family composite at 83, but no single signal crowns it: HuggingFace downloads, LMArena Elo, citations, and cross-protocol deployment each point to a different leader. The composite is the only honest ranking, and it is the one thing only AgentCrush computes.
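
To make the idea concrete, here is a minimal sketch of how a composite can crown a family that tops no individual signal. Everything in it is an assumption for illustration: the family names and signal values are made up, the values are treated as already normalized to 0-100, and the equal-weight mean stands in for whatever weighting the real methodology uses (the digest does not state it).

```python
from statistics import mean

# Hypothetical, already-normalized (0-100) signal values for made-up families.
# These are NOT AgentCrush data; they only illustrate the effect.
signals = {
    "family_a": {"downloads": 95, "arena": 60, "citations": 55, "deployment": 50},
    "family_b": {"downloads": 55, "arena": 96, "citations": 50, "deployment": 60},
    "family_c": {"downloads": 50, "arena": 55, "citations": 97, "deployment": 58},
    "family_d": {"downloads": 60, "arena": 50, "citations": 52, "deployment": 98},
    "family_e": {"downloads": 85, "arena": 86, "citations": 84, "deployment": 83},
}


def composite(scores: dict[str, float]) -> float:
    """Equal-weight mean of one family's signals (an assumed weighting)."""
    return mean(scores.values())


def leader_by_signal(table: dict[str, dict[str, float]], signal: str) -> str:
    """Name of the family with the highest value on a single signal."""
    return max(table, key=lambda name: table[name][signal])


def composite_leader(table: dict[str, dict[str, float]]) -> str:
    """Name of the family with the highest composite score."""
    return max(table, key=lambda name: composite(table[name]))


if __name__ == "__main__":
    for signal in ("downloads", "arena", "citations", "deployment"):
        print(signal, "->", leader_by_signal(signals, signal))
    print("composite ->", composite_leader(signals))
```

In this toy table each of the four signals has a different leader, yet the composite goes to the family that is consistently strong everywhere. That is the pattern the standings describe, reproduced under assumed numbers.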

Confidence tiers shipped. Six of seven model families now grade high (full five-signal coverage); Hermes grades medium (four of five). The score and its certainty now travel together.
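
As a rough sketch of how a signal-coverage grade could be derived, the function below maps the number of signals present to a tier. Only two mappings come from this digest: five of five signals grades high, and four of five grades medium. The cutoffs for low and provisional, and the function name `confidence_tier`, are assumptions made for illustration.

```python
def confidence_tier(signals_present: int, signals_total: int = 5) -> str:
    """Map signal coverage to a confidence tier.

    Stated in the digest: full coverage -> "high", one missing -> "medium".
    Assumed here: two missing -> "low", anything less -> "provisional".
    """
    if not 0 <= signals_present <= signals_total:
        raise ValueError("signals_present must be between 0 and signals_total")

    missing = signals_total - signals_present
    if missing == 0:
        return "high"
    if missing == 1:
        return "medium"
    if missing == 2:
        return "low"  # assumed threshold, not confirmed by the digest
    return "provisional"  # assumed threshold, not confirmed by the digest


if __name__ == "__main__":
    for present in range(5, -1, -1):
        print(present, "of 5 ->", confidence_tier(present))
```

Keeping the tier a pure function of coverage means the grade can be recomputed whenever an upstream signal appears or drops out, without touching the score itself.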

Payments-stack coverage is concentrated. Across the 38 projects in the new Agent Payments Stack index, only two — Coinbase and Stripe — span five of the six layers. The rest specialize. Breadth is rare.

This week in data

Agents indexed: 1,338
Evidence-ranked: 138
Category rankings: 4
x402 endpoints: 7
