Agents SDK

Models & Providers

21st Agents supports multiple AI models from different providers. Use Anthropic Claude natively or connect any model through OpenRouter for maximum flexibility.

Which runtimes are supported natively?

21st Agents ships with two native runtimes that work out of the box:

Anthropic · Claude Code

Full-featured agentic runtime with tool use, web access, and sandboxed execution. Models: Claude Opus 4.6, Sonnet 4.6, Haiku 4.5.

OpenAI · Codex (coming soon)

OpenAI's optimized runtime with multiple reasoning levels. Models: GPT-5.2 Codex, GPT-5.2 Codex (High), GPT-5.1 Codex Mini.

Can I use models beyond Anthropic and OpenAI?

Yes. Through OpenRouter, you can route your agent to hundreds of models from Google, Meta, DeepSeek, Alibaba, Zhipu AI, Mistral, and other providers — all through a single, unified API. OpenRouter handles authentication, billing, and failover across providers so you don't have to manage multiple API keys.

This is especially useful when you want to:

  • Use cost-efficient open-weight models like DeepSeek V3.2 or Llama 4 for high-volume, lower-complexity tasks
  • Access the largest context windows (Gemini 3.1 Pro at 1M tokens, Llama 4 Scout at 10M tokens)
  • Experiment with specialized models like GLM-5 for agentic workloads or Qwen 3.5 for multilingual support across 119 languages
  • Set up automatic fallback chains — if one provider is down, OpenRouter routes to an alternative
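As a rough sketch of what routing through OpenRouter looks like, the snippet below calls OpenRouter's OpenAI-compatible chat-completions endpoint directly, using model IDs from the table on this page. The `models` array is OpenRouter's fallback mechanism: if the primary model is unavailable, the request is routed to the listed alternatives in order. The prompt and model choices here are illustrative, not a 21st Agents API.

```typescript
// Build an OpenRouter chat-completions request with a fallback chain.
// Model IDs match the table on this page; `models` lists fallbacks in order.
function buildRequest(prompt: string) {
  return {
    model: "deepseek/deepseek-v3.2",          // primary, cost-efficient choice
    models: [
      "meta-llama/llama-4-maverick",          // first fallback
      "qwen/qwen-3.5-max",                    // second fallback
    ],
    messages: [{ role: "user", content: prompt }],
  };
}

async function run(): Promise<void> {
  const key = process.env.OPENROUTER_API_KEY;
  if (!key) return; // no key configured; skip the network call
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${key}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildRequest("Summarize this document in one line.")),
  });
  const data = await res.json();
  console.log(data.choices?.[0]?.message?.content);
}

run();
```

Because OpenRouter speaks the OpenAI wire format, switching providers is just a change to the `model` string; authentication and billing stay with the single OpenRouter key.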

How does model switching work?

You can let end users switch between models at runtime without redeploying your agent. Enable “Allow model switching” in the agent config and select which models to expose. This is useful when you want users to choose between faster (cheaper) and more capable (pricier) options.

For example, you might expose Claude Haiku 4.5 for quick questions and Claude Opus 4.6 for deep analysis — or let users pick between DeepSeek V3.2 for cost efficiency and Gemini 3.1 Pro for maximum context.
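A config for that setup might look like the sketch below. The field names (`allowModelSwitching`, `switchableModels`, and so on) are illustrative assumptions, not the documented schema; check the agent config reference for the exact keys.

```typescript
// Hypothetical agent config sketch (field names are illustrative):
// default to the fast, cheap model, and let end users switch to the
// more capable one at runtime without a redeploy.
const agentConfig = {
  runtime: "claude-code",
  model: "anthropic/claude-haiku-4-5-20251001", // default: quick questions
  allowModelSwitching: true,                    // assumed flag name
  switchableModels: [
    "anthropic/claude-haiku-4-5-20251001",      // fast and cheap
    "anthropic/claude-opus-4-6",                // deep analysis
  ],
};
```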

Popular models available through 21st Agents

The table below lists popular models you can use. Native runtimes work out of the box; all others are available via OpenRouter integration.

| Provider | Model | Model ID | Context | Best for |
| --- | --- | --- | --- | --- |
| Anthropic | Claude Opus 4.6 | anthropic/claude-opus-4-6 | 200K | Top-tier reasoning, complex multi-step tasks, code generation |
| | Claude Sonnet 4.6 | anthropic/claude-sonnet-4-6 | 200K | Best balance of speed, cost, and intelligence |
| | Claude Haiku 4.5 | anthropic/claude-haiku-4-5-20251001 | 200K | Fastest and cheapest, great for simple tasks |
| Google | Gemini 3.1 Pro | google/gemini-3.1-pro | 1M | Highest benchmark scores, massive context window |
| | Gemini 3.0 Flash | google/gemini-3.0-flash | 1M | Fast and affordable with million-token context |
| DeepSeek | DeepSeek V3.2 | deepseek/deepseek-v3.2 | 128K | Near-frontier performance at 10x lower cost, strong reasoning |
| | DeepSeek R1 | deepseek/deepseek-r1 | 128K | Advanced chain-of-thought reasoning |
| Meta | Llama 4 Maverick | meta-llama/llama-4-maverick | 256K | Open-weight, strong coding and multilingual |
| | Llama 4 Scout | meta-llama/llama-4-scout | 10M | Largest context window available, open-weight |
| Alibaba | Qwen 3.5 Max | qwen/qwen-3.5-max | 128K | 119 languages, competitive with frontier models |
| | Qwen 3.5 Coder | qwen/qwen-3.5-coder | 128K | Optimized for code generation and analysis |
| Zhipu AI | GLM-5 | zhipu/glm-5 | 128K | Top open-weight model, strong agentic and SWE tasks |
| | GLM-4.7 Flash | zhipu/glm-4.7-flash | 128K | Lightweight MoE, efficient for local coding tasks |
| OpenAI | GPT-5.4 | openai/gpt-5.4 | 128K | Frontier reasoning and analysis |
| | GPT-5.2 Codex | openai/gpt-5.2-codex | 128K | Optimized for code, multiple reasoning levels |

How do I choose the right model?

For complex agentic tasks — use Claude Opus 4.6 or GPT-5.4. These models excel at multi-step reasoning, tool use, and long-horizon tasks where accuracy matters more than cost.

For balanced production workloads — Claude Sonnet 4.6 or Gemini 3.1 Pro offer the best price-to-performance ratio. Sonnet is the default for most 21st agents.

For high-volume, cost-sensitive tasks — DeepSeek V3.2 delivers near-frontier performance at roughly 10x lower cost than Claude Opus. Qwen 3.5 and Llama 4 are strong open-weight alternatives.

For massive context windows — Gemini 3.1 Pro (1M tokens) or Llama 4 Scout (10M tokens) can process entire codebases, legal documents, or research corpora in a single call.

For multilingual applications — Qwen 3.5 supports 119 languages with strong performance across all of them. GLM-5 is another strong option for Chinese and multilingual workloads.

For quick, simple tasks — Claude Haiku 4.5 or GLM-4.7 Flash are the fastest and cheapest options while still being highly capable.
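The guidance above can be encoded as a simple lookup, which is handy when one agent serves several task types. The task labels below are our own shorthand; the model IDs come from the table on this page.

```typescript
// Map a task profile to a recommended model ID, following the
// selection guidance on this page. Task labels are illustrative.
type Task =
  | "complex-agentic"
  | "balanced"
  | "high-volume"
  | "long-context"
  | "multilingual"
  | "quick";

function pickModel(task: Task): string {
  const recommendations: Record<Task, string> = {
    "complex-agentic": "anthropic/claude-opus-4-6",
    "balanced":        "anthropic/claude-sonnet-4-6",
    "high-volume":     "deepseek/deepseek-v3.2",
    "long-context":    "google/gemini-3.1-pro",
    "multilingual":    "qwen/qwen-3.5-max",
    "quick":           "anthropic/claude-haiku-4-5-20251001",
  };
  return recommendations[task];
}
```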

How does pricing work with different models?

When using native runtimes (Anthropic Claude Code), pricing is based on your 21st Agents plan. When routing through OpenRouter, you pay OpenRouter's per-token rates for the chosen model — these vary significantly between providers. DeepSeek and Qwen models are among the most affordable, while frontier models like Claude Opus 4.6 and GPT-5.4 cost more per token but deliver higher quality.

You can set a maxBudgetUsd per agent run to control costs regardless of which model you use.
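Conceptually, a cap like maxBudgetUsd behaves like the sketch below: each call's cost is accumulated, and the run stops once the next call would exceed the cap. This is an illustration of the behavior, not the SDK's internal implementation.

```typescript
// Conceptual sketch of a per-run budget cap: accumulate each call's
// cost and refuse any call that would push the run past the cap.
class RunBudget {
  private spentUsd = 0;

  constructor(private maxBudgetUsd: number) {}

  // Record a call's cost; throws if the run would exceed the cap.
  charge(costUsd: number): void {
    if (this.spentUsd + costUsd > this.maxBudgetUsd) {
      throw new Error(
        `budget exceeded: run capped at ${this.maxBudgetUsd} USD`,
      );
    }
    this.spentUsd += costUsd;
  }

  get remainingUsd(): number {
    return this.maxBudgetUsd - this.spentUsd;
  }
}
```

Because the cap applies per run, a cheap model simply fits more calls under the same budget than an expensive one.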

Ready to try a different model?

Check the Agents customization guide to configure your runtime and model, or visit OpenRouter models to browse all available models with live pricing.