# Models & Providers
21st Agents supports multiple AI models from different providers. Use Anthropic Claude natively or connect any model through OpenRouter for maximum flexibility.
## Which runtimes are supported natively?
21st Agents ships with two native runtimes that work out of the box:
### Anthropic · Claude Code

Full-featured agentic runtime with tool use, web access, and sandboxed execution. Models: Claude Opus 4.6, Sonnet 4.6, Haiku 4.5.

### OpenAI · Codex (coming soon)

OpenAI's optimized runtime with multiple reasoning levels. Models: GPT-5.2 Codex, GPT-5.2 Codex (High), GPT-5.1 Codex Mini.
## Can I use models beyond Anthropic and OpenAI?
Yes. Through OpenRouter, you can route your agent to hundreds of models from Google, Meta, DeepSeek, Alibaba, Zhipu AI, Mistral, and other providers — all through a single, unified API. OpenRouter handles authentication, billing, and failover across providers so you don't have to manage multiple API keys.
This is especially useful when you want to:
- Use cost-efficient open-weight models like DeepSeek V3.2 or Llama 4 for high-volume, lower-complexity tasks
- Access the largest context windows (Gemini 3.1 Pro at 1M tokens, Llama 4 Scout at 10M tokens)
- Experiment with specialized models like GLM-5 for agentic workloads or Qwen 3.5 for multilingual support across 119 languages
- Set up automatic fallback chains — if one provider is down, OpenRouter routes to an alternative
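Routing through OpenRouter is a single OpenAI-compatible POST. The sketch below shows a fallback chain using OpenRouter's `models` array; the model IDs are the ones listed in the table further down, and the `OPENROUTER_API_KEY` environment variable is an assumption about how you store your key.

```typescript
// Sketch: route a request through OpenRouter with an automatic fallback chain.
// OpenRouter's `models` array lists alternatives to try, in order, if the
// primary model's provider is unavailable.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the request payload: a cost-efficient primary with an open-weight fallback.
function buildOpenRouterRequest(messages: ChatMessage[]) {
  return {
    model: "deepseek/deepseek-v3.2",
    models: ["deepseek/deepseek-v3.2", "meta-llama/llama-4-maverick"],
    messages,
  };
}

// Sending it is one POST to OpenRouter's OpenAI-compatible endpoint.
export async function send(messages: ChatMessage[]) {
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildOpenRouterRequest(messages)),
  });
  return res.json();
}
```

Because OpenRouter handles auth and billing centrally, swapping the primary model is a one-line change to the payload.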
## How does model switching work?
You can let end users switch between models at runtime without redeploying your agent. Enable “Allow model switching” in the agent config and select which models to expose. This is useful when you want users to choose between faster (cheaper) and more capable (pricier) options.
For example, you might expose Claude Haiku 4.5 for quick questions and Claude Opus 4.6 for deep analysis — or let users pick between DeepSeek V3.2 for cost efficiency and Gemini 3.1 Pro for maximum context.
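A config for that setup might look like the sketch below. The field names (`allowModelSwitching`, `models`, `defaultModel`) are illustrative assumptions, not the actual 21st Agents config schema; the model IDs are the ones from the table below.

```typescript
// Hypothetical agent config sketch -- field names are illustrative,
// not the real 21st Agents schema.
const agentConfig = {
  runtime: "claude-code",
  allowModelSwitching: true, // lets end users switch models at runtime
  models: [
    // A fast/cheap option and a deep-analysis option, as in the example above.
    { id: "anthropic/claude-haiku-4-5-20251001", label: "Quick questions" },
    { id: "anthropic/claude-opus-4-6", label: "Deep analysis" },
  ],
  defaultModel: "anthropic/claude-haiku-4-5-20251001",
};
```

The key point is that only the models you list are exposed to end users, so switching never routes to a model you haven't opted into.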
## Popular models available through 21st Agents
The table below lists popular models you can use. Native runtimes work out of the box; all others are available via OpenRouter integration.
| Provider | Model | Model ID | Context | Best for |
|---|---|---|---|---|
| Anthropic | Claude Opus 4.6 | `anthropic/claude-opus-4-6` | 200K | Top-tier reasoning, complex multi-step tasks, code generation |
| Anthropic | Claude Sonnet 4.6 | `anthropic/claude-sonnet-4-6` | 200K | Best balance of speed, cost, and intelligence |
| Anthropic | Claude Haiku 4.5 | `anthropic/claude-haiku-4-5-20251001` | 200K | Fastest and cheapest, great for simple tasks |
| Google | Gemini 3.1 Pro | `google/gemini-3.1-pro` | 1M | Highest benchmark scores, massive context window |
| Google | Gemini 3.0 Flash | `google/gemini-3.0-flash` | 1M | Fast and affordable with million-token context |
| DeepSeek | DeepSeek V3.2 | `deepseek/deepseek-v3.2` | 128K | Near-frontier performance at 10x lower cost, strong reasoning |
| DeepSeek | DeepSeek R1 | `deepseek/deepseek-r1` | 128K | Advanced chain-of-thought reasoning |
| Meta | Llama 4 Maverick | `meta-llama/llama-4-maverick` | 256K | Open-weight, strong coding and multilingual |
| Meta | Llama 4 Scout | `meta-llama/llama-4-scout` | 10M | Largest context window available, open-weight |
| Alibaba | Qwen 3.5 Max | `qwen/qwen-3.5-max` | 128K | 119 languages, competitive with frontier models |
| Alibaba | Qwen 3.5 Coder | `qwen/qwen-3.5-coder` | 128K | Optimized for code generation and analysis |
| Zhipu AI | GLM-5 | `zhipu/glm-5` | 128K | Top open-weight model, strong agentic and SWE tasks |
| Zhipu AI | GLM-4.7 Flash | `zhipu/glm-4.7-flash` | 128K | Lightweight MoE, efficient for local coding tasks |
| OpenAI | GPT-5.4 | `openai/gpt-5.4` | 128K | Frontier reasoning and analysis |
| OpenAI | GPT-5.2 Codex | `openai/gpt-5.2-codex` | 128K | Optimized for code, multiple reasoning levels |
## How do I choose the right model?
For complex agentic tasks — use Claude Opus 4.6 or GPT-5.4. These models excel at multi-step reasoning, tool use, and long-horizon tasks where accuracy matters more than cost.
For balanced production workloads — Claude Sonnet 4.6 or Gemini 3.1 Pro offer the best price-to-performance ratio. Sonnet is the default for most 21st agents.
For high-volume, cost-sensitive tasks — DeepSeek V3.2 delivers near-frontier performance at roughly 10x lower cost than Claude Opus. Qwen 3.5 and Llama 4 are strong open-weight alternatives.
For massive context windows — Gemini 3.1 Pro (1M tokens) or Llama 4 Scout (10M tokens) can process entire codebases, legal documents, or research corpora in a single call.
For multilingual applications — Qwen 3.5 supports 119 languages with strong performance across all of them. GLM-5 is another strong option for Chinese and multilingual workloads.
For quick, simple tasks — Claude Haiku 4.5 or GLM-4.7 Flash are the fastest and cheapest options while still being highly capable.
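The guidance above can be encoded as a simple lookup for agents that pick a model programmatically. This is a hypothetical helper, not part of the 21st Agents SDK; the model IDs are the OpenRouter-style slugs from the table above.

```typescript
// Hypothetical model-selection helper encoding the guidance above.
type TaskProfile =
  | "complex-agentic"   // multi-step reasoning, tool use, accuracy over cost
  | "balanced"          // production default: best price-to-performance
  | "high-volume"       // cost-sensitive, lower-complexity tasks
  | "massive-context"   // whole codebases or document corpora in one call
  | "multilingual"      // broad language coverage
  | "quick";            // fast, cheap, simple tasks

const MODEL_FOR_TASK: Record<TaskProfile, string> = {
  "complex-agentic": "anthropic/claude-opus-4-6",
  balanced: "anthropic/claude-sonnet-4-6",
  "high-volume": "deepseek/deepseek-v3.2",
  "massive-context": "meta-llama/llama-4-scout", // 10M-token context
  multilingual: "qwen/qwen-3.5-max",
  quick: "anthropic/claude-haiku-4-5-20251001",
};

function pickModel(profile: TaskProfile): string {
  return MODEL_FOR_TASK[profile];
}
```

A lookup like this keeps model choices in one place, so upgrading a tier (say, swapping the balanced default) is a single-line change.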
## How does pricing work with different models?
When using native runtimes (Anthropic Claude Code), pricing is based on your 21st Agents plan. When routing through OpenRouter, you pay OpenRouter's per-token rates for the chosen model — these vary significantly between providers. DeepSeek and Qwen models are among the most affordable, while frontier models like Claude Opus 4.6 and GPT-5.4 cost more per token but deliver higher quality.
You can set a `maxBudgetUsd` cap per agent run to control costs regardless of which model you use.
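As a sketch, the budget cap attaches to an individual run's options. `maxBudgetUsd` is the documented setting; the surrounding option names (`agentId`, `model`) are illustrative assumptions about the run-options shape, not the real SDK.

```typescript
// Hypothetical run options -- maxBudgetUsd is the documented cap;
// the other field names are illustrative.
const runOptions = {
  agentId: "support-bot",
  model: "google/gemini-3.1-pro", // pricier per token, so cap the spend
  maxBudgetUsd: 2.5,              // stop the run once token cost reaches $2.50
};
```

Because the cap is per run rather than per model, it keeps spend predictable even when end users switch to a more expensive model at runtime.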