The methodology

How AgentCrush ranks the agent economy

AgentCrush is the evidence-ranked index of the agent economy. We don't pick winners — we publish multi-signal evidence with transparent weights. Different agent categories leave different evidence trails, so we run four category-specific methodologies, each with its own signal sources, weights, and evidence-ready rule.

Principles

Multi-signal corroboration. No agent is evidence-ranked on a single signal. Each category requires at least 3 of its N signals to be available, AND at least one of those signals must be a capability signal, not just a popularity signal. Downloads and stars are vanity metrics on their own.
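As a rough illustration of the corroboration rule, the sketch below checks "at least 3 available signals, at least one from a required group". It assumes a signal counts as available when its score is not None. The function name and the signal names are hypothetical, not AgentCrush identifiers.

```python
from typing import Mapping, Optional, Set


def is_evidence_ready(
    scores: Mapping[str, Optional[float]],
    required_group: Set[str],
    min_signals: int = 3,
) -> bool:
    """True when enough signals are available and one is in the required group.

    Assumption: a signal is "available" when its score is not None.
    """
    available = {name for name, value in scores.items() if value is not None}
    return len(available) >= min_signals and bool(available & required_group)


# Hypothetical agent with strong popularity but no capability evidence.
popular_only = {"downloads": 90.0, "stars": 85.0, "likes": 70.0, "benchmark": None}
# The same agent once a capability signal has been measured.
corroborated = {**popular_only, "benchmark": 40.0}

print(is_evidence_ready(popular_only, {"benchmark"}))   # False
print(is_evidence_ready(corroborated, {"benchmark"}))   # True
```

The first agent has three available signals but none from the required group, so popularity alone does not qualify it.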

Per-category methodology. A model family leaves HuggingFace downloads and LMArena scores; a tokenized agent leaves on-chain liquidity and holder distribution; a service agent leaves GitHub forks and Agentverse interactions. Running one universal scoring function across all of them would average away the truth.

Methodology travels with data. Every category page publishes its full signal set, weights, formulas, evidence-ready rule, and known limitations. The same methodology is exposed via our MCP server so LLMs querying AgentCrush can correctly explain HOW a ranking was computed — not just what it is.

Honest gaps. Where a signal isn't yet populated for an agent (no LMArena coverage, no citations indexed, etc.), the methodology returns NULL, not 0. That distinction matters: NULL means "unmeasured," 0 means "measured at zero." The composite treats an unmeasured signal as missing rather than as a failing score.
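One way to see why NULL and 0 differ is a weighted mean that skips unmeasured signals. The sketch below renormalizes over the weights of the measured signals. That renormalization is an assumption about how "missing" is handled, not a confirmed description of the AgentCrush composite. The function name and the example scores are hypothetical, and the weights are the Model Families weights from this page.

```python
from typing import Mapping, Optional


def composite(
    scores: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> Optional[float]:
    """Weighted mean over measured signals only; None (NULL) signals are skipped.

    Assumption: weights are renormalized over the measured signals.
    """
    measured = {k: v for k, v in scores.items() if v is not None}
    active_weight = sum(weights[k] for k in measured)
    if active_weight == 0:
        return None  # nothing measured, so the composite is itself unmeasured
    return sum(weights[k] * v for k, v in measured.items()) / active_weight


WEIGHTS = {
    "huggingface": 0.30,
    "lmarena": 0.25,
    "derivatives": 0.20,
    "citations": 0.15,
    "deployment": 0.10,
}

# Hypothetical scores: two signals unmeasured (None), one measured at zero.
with_nulls = {"huggingface": 80, "lmarena": None, "derivatives": 60,
              "citations": 0, "deployment": None}
# The same agent if the unmeasured signals were wrongly stored as 0.
with_zeros = {k: (0 if v is None else v) for k, v in with_nulls.items()}

print(composite(with_nulls, WEIGHTS))
print(composite(with_zeros, WEIGHTS))
```

Under this assumption the first composite is higher than the second, because the two unmeasured signals no longer drag the average toward zero.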

Live coverage

Category | Tracked | Evidence-ranked | Methodology
Model Families | 5 | 5 | v1.4-with-deployment
Tokenized Agents | 16 | 16 | v1.1-tokenized-tvl
Service Agents | 28 | 28 | v1.1-service-forks
Developer Agents | 1,289 | 86 | v2.c-public

135 total evidence-ranked agents across 4 categories.

Model Families

v1.4-with-deployment

Scores model families (Hermes, Llama, Mistral, Qwen, DeepSeek, etc.) on adoption, capability, downstream usage, research impact, and cross-protocol agent-economy deployment.

Signals

HuggingFace: 30% weight
Downloads, likes, recency, breadth, top-model (aggregated by author).
Formula: Weighted basket of 5 sub-scores

LMArena: 25% weight
Bradley-Terry capability score from chat.lmarena.ai.
Formula: LEAST(100, ROUND((MAX(arena_score) − 700) / 8))

HF Derivatives: 20% weight
Fine-tunes / downstream models per base, counted from tags.
Formula: LEAST(100, ROUND(LOG10(SUM(derivatives_count)) × 25))

Paper Citations: 15% weight
Semantic Scholar citation counts on canonical lab papers.
Formula: LEAST(100, ROUND(LOG10(SUM(citation_count)) × 16))

Deployment: 10% weight
Cross-protocol agent-economy mentions across 6 source tables. The moat signal.
Formula: LEAST(100, ROUND(LOG10(SUM(deployment_count)) × 30))
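The capped formulas for the Model Families signals can be sketched in Python as follows. Two details are assumptions because the published formulas do not state them: ROUND is taken to round halves upward, and totals below 1 (where LOG10 is undefined or negative) are scored as 0. All function names are hypothetical.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, halves upward (assumed ROUND semantics)."""
    return math.floor(x + 0.5)


def log_score(total: float, multiplier: float) -> int:
    """LEAST(100, ROUND(LOG10(total) * multiplier)).

    Assumption: totals below 1 score 0, since LOG10 is undefined or negative there.
    """
    if total < 1:
        return 0
    return min(100, round_half_up(math.log10(total) * multiplier))


def lmarena_score(max_arena_score: float) -> int:
    """LEAST(100, ROUND((MAX(arena_score) - 700) / 8))."""
    return min(100, round_half_up((max_arena_score - 700) / 8))


def derivatives_score(derivatives_total: int) -> int:
    return log_score(derivatives_total, 25)


def citations_score(citation_total: int) -> int:
    return log_score(citation_total, 16)


def deployment_score(deployment_total: int) -> int:
    return log_score(deployment_total, 30)


print(lmarena_score(1300))       # 75
print(derivatives_score(1000))   # 75
print(citations_score(10_000))   # 64
print(deployment_score(10_000))  # 100 (capped)
```

The caps are reached at an arena score of 1500, at 10,000 derivatives, and at about 2,200 deployment mentions. The citations score reaches 100 only above roughly 1.6 million citations.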

Evidence-ready rule

3 of 5 signals AND ≥1 capability signal (derivatives, LMArena, citations, or deployment).

Known limitations

Currently 5 seeded model families (Qwen, Gemini, DeepSeek, Llama, Hermes). View covers all model_family agents; seed set is curated.
Citation backfill depends on Semantic Scholar API; some papers may have 0 cites due to S2 indexing delay.
Deployment signal is volume-based — high counts can indicate broad model adoption rather than specific deployment of one variant.

See the model families ranking →

Tokenized Agents

v1.1-tokenized-tvl

Scores tokenized AI agents (Virtuals Protocol, etc.) economics-first: market cap, on-chain liquidity, holder distribution, capital locked, plus social visibility.

Signals

Market Cap: 25% weight
USD market cap, log-scaled.
Formula: LEAST(100, ROUND(LOG10(market_cap_usd) × 12))

Liquidity + Volume: 20% weight
On-chain liquidity (65%) + 24h volume (35%). Anti-honeypot weighting.
Formula: liquidity_score × 0.65 + volume_score × 0.35

Holders: 15% weight
Holder count (55%) + inverse top-10 concentration (45%).
Formula: holders_count_score × 0.55 + (100 − top10_pct) × 0.45

Price Momentum 24h: 10% weight
Bounded around neutral 50. Extreme volatility (>±100%) treated neutral.
Formula: GREATEST(0, LEAST(100, 50 + price_change_pct))

TVL: 15% weight
Total value locked in token contracts. Capital commitment beyond market cap.
Formula: LEAST(100, ROUND(LOG10(tvl_usd) × 14))

Social Visibility: 15% weight
v1.1: binary curated flag. v1.2 will integrate X follower count + Farcaster engagement.
Formula: socially_visible ? 100 : 0
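The tokenized-agent sub-scores can be sketched as below. The momentum function combines the published clamp with the note that moves beyond ±100% are treated as neutral. That combination is one reading of the description, not a confirmed rule. Scoring values below 1 USD as 0 and rounding halves upward are also assumptions, and all function names are hypothetical.

```python
import math


def log_score(value_usd: float, multiplier: float) -> int:
    """LEAST(100, ROUND(LOG10(value) * multiplier)); values below 1 score 0 (assumption)."""
    if value_usd < 1:
        return 0
    return min(100, math.floor(math.log10(value_usd) * multiplier + 0.5))


def market_cap_score(market_cap_usd: float) -> int:
    return log_score(market_cap_usd, 12)


def tvl_score(tvl_usd: float) -> int:
    return log_score(tvl_usd, 14)


def liquidity_volume_score(liquidity_score: float, volume_score: float) -> float:
    """Liquidity is weighted above 24h volume (65% vs 35%)."""
    return liquidity_score * 0.65 + volume_score * 0.35


def holders_score(holders_count_score: float, top10_pct: float) -> float:
    """Lower top-10 concentration raises the score."""
    return holders_count_score * 0.55 + (100 - top10_pct) * 0.45


def momentum_score(price_change_pct: float) -> float:
    """Clamp 50 + change to [0, 100]; moves beyond ±100% score neutral (assumed reading)."""
    if abs(price_change_pct) > 100:
        return 50.0
    return max(0.0, min(100.0, 50 + price_change_pct))


def social_score(socially_visible: bool) -> int:
    return 100 if socially_visible else 0


print(market_cap_score(1e8))   # 96
print(momentum_score(12.5))    # 62.5
print(momentum_score(250))     # 50.0
```

Because the momentum clamp is centered on 50, any 24h gain of 50% or more already scores 100 and any loss of 50% or more scores 0.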

Evidence-ready rule

3 of 6 signals AND ≥1 economic signal (market cap, liquidity, holders, or TVL > 0).

Known limitations

Cross-protocol presence signal tracked but currently unweighted — agent economy hasn't penetrated cross-protocol descriptions enough yet.
Social signal in v1.1 is binary; aixbt is the only socially-flagged agent.
Currently covers Virtuals Protocol agents only (16 promoted). Other tokenized ecosystems not yet integrated.

See the tokenized agents ranking →

Service Agents

v1.1-service-forks

Scores service agents (A2A protocol, Agentverse, x402, ERC-8004) on adoption, source quality, activity recency, protocol breadth, fork engagement.

Signals

Adoption: 25% weight
GitHub stars (A2A) OR Agentverse interactions. Log-scaled. Higher of the two wins.
Formula: GREATEST(stars_log×18, interactions_log×22)

Source Quality: 20% weight
A2A signal_strength (0-100) OR Agentverse rating × 20.
Formula: GREATEST(a2a_signal_strength, ROUND(av_rating × 20))

Activity Recency: 15% weight
Age-decay since most recent push or last-seen. Recent = high score.
Formula: Time-bucketed: 7d→100, 30d→80, 90d→60, 180d→40, 365d→20

Protocol Breadth: 15% weight
Count of declared protocols/topics × 25.
Formula: LEAST(100, COUNT(protocols) × 25)

Forks: 15% weight
GitHub forks, log-scaled. Forks measure active engagement vs passive starring.
Formula: LEAST(100, ROUND(LOG10(forks) × 22))

Discourse / Social: 10% weight
v1.2 will integrate X + Farcaster mention volume for service agents.
Formula: currently NULL (placeholder)
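Three of the service-agent signals are sketched below: adoption, activity recency, and protocol breadth. Several details are assumptions because the page does not state them: stars_log and interactions_log are read as LOG10 of the raw counts, bucket boundaries are inclusive, activity older than 365 days scores 0, and a missing source is ignored when taking the higher value. The published adoption formula has no upper cap, so none is applied here. All function names are hypothetical.

```python
import math
from typing import Optional, Sequence

# (max age in days, score), taken from the published buckets.
RECENCY_BUCKETS = [(7, 100), (30, 80), (90, 60), (180, 40), (365, 20)]


def recency_score(days_since_activity: float) -> int:
    """Time-bucketed decay; boundaries inclusive (assumption)."""
    for max_days, score in RECENCY_BUCKETS:
        if days_since_activity <= max_days:
            return score
    return 0  # assumption: older than 365 days scores 0


def protocol_breadth_score(protocols: Sequence[str]) -> int:
    """LEAST(100, COUNT(protocols) * 25)."""
    return min(100, len(protocols) * 25)


def _scaled_log(count: int, multiplier: float) -> float:
    # Counts below 1 score 0.0 because LOG10 is undefined there (assumption).
    return math.log10(count) * multiplier if count >= 1 else 0.0


def adoption_score(stars: Optional[int], interactions: Optional[int]) -> Optional[float]:
    """Higher of stars_log * 18 and interactions_log * 22; None if neither is measured."""
    candidates = []
    if stars is not None:
        candidates.append(_scaled_log(stars, 18))
    if interactions is not None:
        candidates.append(_scaled_log(interactions, 22))
    return max(candidates) if candidates else None


print(recency_score(45))                         # 60
print(protocol_breadth_score(["a2a", "x402"]))   # 50
print(adoption_score(None, None))                # None
```

Returning None when neither source is measured keeps "unmeasured" distinct from a measured zero, in line with the Honest gaps principle.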

Evidence-ready rule

3 of 6 signals AND ≥1 adoption signal (stars > 0, interactions > 0, or forks > 0).

Known limitations

Currently sources from A2A (28 agents) + Agentverse (0 active in current scrape).
v1.2 will add ERC-8004 registry (29K agents) and Bazaar x402 endpoints (46K) as additional service surfaces.
Cross-protocol presence tracked in cross_protocol_presence but unweighted in v1.1 composite.

See the service agents ranking →

Developer Agents

v2.c-public

Scores developer-tool agents (frameworks, runtimes, dev tools) on GitHub activity, package usage, dependency adoption, ecosystem links, docs, discourse, and trust signals. The universal ranking surface.

Signals

GitHub Activity: dynamic weight
Stars, commits, contributors, recency.
Formula: weighted by active_weight_total

Package Usage: dynamic weight
npm / PyPI download volume.
Formula: log-scaled per ecosystem

Dependency Adoption: dynamic weight
Reverse dependencies: how many other projects depend on this.
Formula: log-scaled count

Docs Quality: dynamic weight
README depth, API docs, examples coverage.
Formula: composite heuristic 0-100

Ecosystem Relationships: dynamic weight
Cross-referenced with other indexed agents.
Formula: graph-distance score

Discourse (HN): dynamic weight
Hacker News story / comment activity.
Formula: log-scaled

Trust Signals: dynamic weight
Registry context, verified claims, identity attestation.
Formula: composite 0-100

Evidence-ready rule

Multi-signal coverage threshold OR top-100 ranked OR single signal ≥ 90 with ≥ 2 corroborating signals > 50.
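The three-way rule above can be sketched as a single predicate. The page does not give the coverage threshold, so it is a required parameter here (min_coverage, a hypothetical name). The sketch also assumes that "corroborating" means other measured signals, and that a second signal at 90 or above counts as corroboration.

```python
from typing import Mapping, Optional


def developer_evidence_ready(
    scores: Mapping[str, Optional[float]],
    rank: Optional[int],
    min_coverage: int,
) -> bool:
    """Coverage threshold OR top-100 rank OR one signal >= 90 with >= 2 others > 50."""
    measured = [v for v in scores.values() if v is not None]

    # Path 1: enough signals measured (threshold value is not published).
    if len(measured) >= min_coverage:
        return True

    # Path 2: ranked in the top 100.
    if rank is not None and rank <= 100:
        return True

    # Path 3: a standout signal plus at least two other signals above 50.
    # The standout itself is above 50, so three signals above 50 are needed.
    has_standout = any(v >= 90 for v in measured)
    above_50 = sum(1 for v in measured if v > 50)
    return has_standout and above_50 >= 3


# Hypothetical agent: one standout signal, two corroborating, rank outside top 100.
agent = {"github": 95.0, "package_usage": 60.0, "docs": 55.0, "trust": None}
print(developer_evidence_ready(agent, rank=500, min_coverage=5))  # True
```

Lowering the docs score to 40 in this example would leave only one corroborating signal, and the agent would no longer qualify through the third path.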

Known limitations

Methodology weights are computed dynamically per agent (active_weight_total) rather than fixed.
Universal ranking includes 1,289 agents; evidence_ranked subset is the public-rank list.

See the developer agents ranking →

For machine consumers

The same methodology is exposed via our MCP server. LLMs (Claude Desktop, Cursor, custom agents) can query AgentCrush as a live data layer and explain ranking decisions accurately.

Endpoint

POST https://www.agentcrush.xyz/api/mcp/v1

Discovery

GET https://www.agentcrush.xyz/.well-known/mcp.json
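A client request to the endpoint might be built as in the sketch below. It assumes the server accepts standard MCP JSON-RPC 2.0 messages over HTTP POST, using the `tools/list` method defined by the MCP specification. The tool names this server exposes, and whether it requires an `initialize` handshake first, are not stated on this page. The code only constructs the request and does not send it.

```python
import json
import urllib.request

MCP_ENDPOINT = "https://www.agentcrush.xyz/api/mcp/v1"


def build_mcp_request(method, params=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST request for the MCP endpoint."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # MCP's HTTP transport expects clients to accept both response types.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Ask the server which tools it exposes (standard MCP method).
request = build_mcp_request("tools/list")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it. The discovery document at `/.well-known/mcp.json` is the place to confirm the supported transport and methods before relying on this shape.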

Full MCP docs →
