Guide

Learn more about how to use AI models with TypingMind

Filtered by: openrouter
ByteDance Seed: Seed 1.6 Flash via OpenRouter

Access ByteDance Seed: Seed 1.6 Flash via OpenRouter

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of up to 16k tokens.

3 min read
ByteDance Seed: Seed 1.6 via OpenRouter

Access ByteDance Seed: Seed 1.6 via OpenRouter

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.

3 min read
MiniMax: MiniMax M2.1 via OpenRouter

Access MiniMax: MiniMax M2.1 via OpenRouter

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency. Compared to its predecessor, M2.1 delivers cleaner, more concise outputs and faster perceived response times. It shows leading multilingual coding performance across major systems and application languages, achieving 49.4% on Multi-SWE-Bench and 72.5% on SWE-Bench Multilingual, and serves as a versatile agent “brain” for IDEs, coding tools, and general-purpose assistance. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in our [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).
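
As a rough illustration of that recommendation, here is a minimal sketch of preserving reasoning between turns against OpenRouter's chat completions endpoint. The model slug and the presence of a `reasoning_details` field on the returned assistant message are assumptions based on the linked docs; the key point is to append the assistant message back into the conversation unmodified.

```typescript
// Sketch: preserving reasoning between turns with MiniMax M2.1 on OpenRouter.
// Assumptions: the model slug below and the `reasoning_details` field on the
// returned assistant message (see the linked OpenRouter reasoning docs).
const API_URL = "https://openrouter.ai/api/v1/chat/completions";
const MODEL = "minimax/minimax-m2.1"; // assumed slug; confirm on the model page

async function chat(messages: object[]): Promise<any> {
  const res = await fetch(API_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: MODEL, messages }),
  });
  const data = await res.json();
  return data.choices[0].message; // includes reasoning_details when present
}

async function main() {
  const messages: object[] = [
    { role: "user", content: "Refactor this function to be async-safe: ..." },
  ];
  // Turn 1: push the assistant message back exactly as returned, so its
  // reasoning_details travel with the next request instead of being dropped.
  messages.push(await chat(messages));
  // Turn 2: the model can now build on its own preserved reasoning.
  messages.push({ role: "user", content: "Now add unit tests for it." });
  console.log((await chat(messages)).content);
}

main().catch(console.error);
```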

3 min read
Z.AI: GLM 4.7 via OpenRouter

Access Z.AI: GLM 4.7 via OpenRouter

GLM-4.7 is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

3 min read
Google: Gemini 3 Flash Preview via OpenRouter

Access Google: Gemini 3 Flash Preview via OpenRouter

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. It delivers near-Pro-level reasoning and tool-use performance with substantially lower latency than larger Gemini variants, making it well suited for interactive development, long-running agent loops, and collaborative coding tasks. Compared to Gemini 2.5 Flash, it provides broad quality improvements across reasoning, multimodal understanding, and reliability. The model supports a 1M-token context window and multimodal inputs including text, images, audio, video, and PDFs, with text output. It includes configurable reasoning via thinking levels (minimal, low, medium, high), structured output, tool use, and automatic context caching. Gemini 3 Flash Preview is optimized for users who want strong reasoning and agentic behavior without the cost or latency of full-scale frontier models.

3 min read
Mistral: Mistral Small Creative via OpenRouter

Access Mistral: Mistral Small Creative via OpenRouter

Mistral Small Creative is an experimental small model designed for creative writing, narrative generation, roleplay and character-driven dialogue, general-purpose instruction following, and conversational agents.

3 min read
AllenAI: Olmo 3.1 32B Think via OpenRouter

Access AllenAI: Olmo 3.1 32B Think via OpenRouter

Olmo 3.1 32B Think is a large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following. Building on the Olmo 3 series, version 3.1 delivers refined reasoning behavior and stronger performance across demanding evaluations and nuanced conversational tasks. Developed by Ai2 under the Apache 2.0 license, Olmo 3.1 32B Think continues the Olmo initiative’s commitment to openness, providing full transparency across model weights, code, and training methodology.

3 min read
Xiaomi: MiMo-V2-Flash (free) via OpenRouter

Access Xiaomi: MiMo-V2-Flash (free) via OpenRouter

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting a hybrid attention architecture. MiMo-V2-Flash supports a hybrid-thinking toggle and a 256K context window, and excels at reasoning, coding, and agent scenarios. On SWE-bench Verified and SWE-bench Multilingual, MiMo-V2-Flash ranks as the #1 open-source model globally, delivering performance comparable to Claude Sonnet 4.5 while costing only about 3.5% as much. Note: when integrating with agentic tools such as Claude Code, Cline, or Roo Code, **turn off reasoning mode** for the best and fastest performance, as this model is deeply optimized for that scenario. Users can control the reasoning behaviour with the `enabled` boolean of the `reasoning` parameter. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config).
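
A minimal sketch of that toggle against OpenRouter's chat completions endpoint, with reasoning disabled as recommended for agentic tools; the model slug is an assumption, and the `reasoning.enabled` flag follows the linked OpenRouter docs.

```typescript
// Sketch: calling MiMo-V2-Flash with reasoning mode turned off, as recommended
// for agentic coding tools. The model slug is an assumption; confirm it on the
// OpenRouter model page.
const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "xiaomi/mimo-v2-flash:free", // assumed slug
    reasoning: { enabled: false },      // skip deep thinking for faster agentic runs
    messages: [
      { role: "user", content: "Summarize the failing tests in this CI log: ..." },
    ],
  }),
});
console.log((await res.json()).choices[0].message.content);
```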

3 min read
NVIDIA: Nemotron 3 Nano 30B A3B (free) via OpenRouter

Access NVIDIA: Nemotron 3 Nano 30B A3B (free) via OpenRouter

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model with the highest compute efficiency and accuracy for developers building specialized agentic AI systems. The model is fully open, with open weights, datasets, and recipes, so developers can easily customize, optimize, and deploy it on their own infrastructure for maximum privacy and security. Note: For the free endpoint, all prompts and outputs are logged to improve the provider's models, products, and services. Please do not upload any personal, confidential, or otherwise sensitive information. This is for trial use only; do not use it for production or business-critical systems.

3 min read
NVIDIA: Nemotron 3 Nano 30B A3B via OpenRouter

Access NVIDIA: Nemotron 3 Nano 30B A3B via OpenRouter

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model with the highest compute efficiency and accuracy for developers building specialized agentic AI systems. The model is fully open, with open weights, datasets, and recipes, so developers can easily customize, optimize, and deploy it on their own infrastructure for maximum privacy and security. Note: For the free endpoint, all prompts and outputs are logged to improve the provider's models, products, and services. Please do not upload any personal, confidential, or otherwise sensitive information. This is for trial use only; do not use it for production or business-critical systems.

3 min read
OpenAI: GPT-5.2 Chat via OpenRouter

Access OpenAI: GPT-5.2 Chat via OpenRouter

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on harder queries, improving accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is warmer and more conversational by default, with better instruction following and more stable short-form reasoning. GPT-5.2 Chat is designed for high-throughput, interactive workloads where responsiveness and consistency matter more than deep deliberation.

3 min read
OpenAI: GPT-5.2 Pro via OpenRouter

Access OpenAI: GPT-5.2 Pro via OpenRouter

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long-context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reduced hallucination and sycophancy, along with better performance in coding, writing, and health-related tasks.

3 min read
OpenAI: GPT-5.2 via OpenRouter

Access OpenAI: GPT-5.2 via OpenRouter

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically, responding quickly to simple queries while spending more depth on complex tasks. Built for broad task coverage, GPT-5.2 delivers consistent gains across math, coding, science, and tool-calling workloads, with more coherent long-form answers and improved tool-use reliability.

3 min read
Mistral: Devstral 2 2512 (free) via OpenRouter

Access Mistral: Devstral 2 2512 (free) via OpenRouter

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports exploring codebases and orchestrating changes across multiple files while maintaining architecture-level context. It tracks framework dependencies, detects failures, and retries with corrections—solving challenges like bug fixing and modernizing legacy systems. The model can be fine-tuned to prioritize specific languages or optimize for large enterprise codebases. It is available under a modified MIT license.

3 min read
Mistral: Devstral 2 2512 via OpenRouter

Access Mistral: Devstral 2 2512 via OpenRouter

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports exploring codebases and orchestrating changes across multiple files while maintaining architecture-level context. It tracks framework dependencies, detects failures, and retries with corrections—solving challenges like bug fixing and modernizing legacy systems. The model can be fine-tuned to prioritize specific languages or optimize for large enterprise codebases. It is available under a modified MIT license.

3 min read
Relace: Relace Search via OpenRouter

Access Relace: Relace Search via OpenRouter

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return the files relevant to the user's request. In contrast to RAG, relace-search performs agentic multi-step reasoning to produce highly precise results 4x faster than any frontier model. It is designed to serve as a subagent that passes its findings to an "oracle" coding agent, which orchestrates and performs the rest of the coding task. To use relace-search, you need to build an appropriate agent harness and parse the response for relevant information to hand off to the oracle. Read more about it in the [Relace documentation](https://docs.relace.ai/docs/fast-agentic-search/agent).

3 min read
Z.AI: GLM 4.6V via OpenRouter

Access Z.AI: GLM 4.6V via OpenRouter

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts and charts directly as visual inputs, and integrates native multimodal function calling to connect perception with downstream tool execution. The model also enables interleaved image-text generation and UI reconstruction workflows, including screenshot-to-HTML synthesis and iterative visual editing.

3 min read
Nex AGI: DeepSeek V3.1 Nex N1 (free) via OpenRouter

Access Nex AGI: DeepSeek V3.1 Nex N1 (free) via OpenRouter

DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model designed to highlight agent autonomy, tool use, and real-world productivity. Nex-N1 demonstrates competitive performance across all evaluation scenarios, showing particularly strong results in practical coding and HTML generation tasks.

3 min read
EssentialAI: Rnj 1 Instruct via OpenRouter

Access EssentialAI: Rnj 1 Instruct via OpenRouter

Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential AI and trained from scratch with a focus on programming, math, and scientific reasoning. The model demonstrates strong performance across multiple programming languages, tool-use workflows, and agentic execution environments (e.g., mini-SWE-agent).

3 min read
Body Builder (beta) via OpenRouter

Access Body Builder (beta) via OpenRouter

Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder will construct the appropriate API calls. Example: "count to 10 using gemini and opus." This is useful for creating multi-model requests, custom model routers, or programmatic generation of API calls from human descriptions. **BETA NOTICE**: Body Builder is in beta, and currently free. Pricing and functionality may change in the future.

3 min read
OpenAI: GPT-5.1-Codex-Max via OpenRouter

Access OpenAI: GPT-5.1-Codex-Max via OpenRouter

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated version of the 5.1 reasoning stack and trained on agentic workflows spanning software engineering, mathematics, and research. GPT-5.1-Codex-Max delivers faster performance, improved reasoning, and higher token efficiency across the development lifecycle.

3 min read
Amazon: Nova 2 Lite via OpenRouter

Access Amazon: Nova 2 Lite via OpenRouter

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrates standout capabilities in processing documents, extracting information from videos, generating code, providing accurate grounded answers, and automating multi-step agentic workflows.

3 min read
Mistral: Ministral 3 14B 2512 via OpenRouter

Access Mistral: Ministral 3 14B 2512 via OpenRouter

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. It is a powerful and efficient language model with vision capabilities.

3 min read
Mistral: Ministral 3 8B 2512 via OpenRouter

Access Mistral: Ministral 3 8B 2512 via OpenRouter

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

3 min read
Mistral: Ministral 3 3B 2512 via OpenRouter

Access Mistral: Ministral 3 3B 2512 via OpenRouter

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

3 min read
Mistral: Mistral Large 3 2512 via OpenRouter

Access Mistral: Mistral Large 3 2512 via OpenRouter

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and released under the Apache 2.0 license.

3 min read
Arcee AI: Trinity Mini (free) via OpenRouter

Access Arcee AI: Trinity Mini (free) via OpenRouter

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. It is engineered for efficient reasoning over long contexts (131K tokens), with robust function calling and multi-step agent workflows.

3 min read
Arcee AI: Trinity Mini via OpenRouter

Access Arcee AI: Trinity Mini via OpenRouter

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. It is engineered for efficient reasoning over long contexts (131K tokens), with robust function calling and multi-step agent workflows.

3 min read
DeepSeek: DeepSeek V3.2 Speciale via OpenRouter

Access DeepSeek: DeepSeek V3.2 Speciale via OpenRouter

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA) for efficient long-context processing, then scales post-training reinforcement learning to push capability beyond the base model. Reported evaluations place Speciale ahead of GPT-5 on difficult reasoning workloads, with proficiency comparable to Gemini-3.0-Pro, while retaining strong coding and tool-use reliability. Like V3.2, it benefits from a large-scale agentic task synthesis pipeline that improves compliance and generalization in interactive environments.

3 min read
DeepSeek: DeepSeek V3.2 via OpenRouter

Access DeepSeek: DeepSeek V3.2 via OpenRouter

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that reduces training and inference cost while preserving quality in long-context scenarios. A scalable reinforcement learning post-training framework further improves reasoning, with reported performance in the GPT-5 class, and the model has demonstrated gold-medal results on the 2025 IMO and IOI. V3.2 also uses a large-scale agentic task synthesis pipeline to better integrate reasoning into tool-use settings, boosting compliance and generalization in interactive environments. Users can control the reasoning behaviour with the `enabled` boolean of the `reasoning` parameter. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config).

3 min read
Prime Intellect: INTELLECT-3 via OpenRouter

Access Prime Intellect: INTELLECT-3 via OpenRouter

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math, code, science, and general reasoning, consistently outperforming many larger frontier models. Designed for strong multi-step problem solving, it maintains high accuracy on structured tasks while remaining efficient at inference thanks to its MoE architecture.

3 min read
TNG: R1T Chimera (free) via OpenRouter

Access TNG: R1T Chimera (free) via OpenRouter

TNG-R1T-Chimera is an experimental LLM with a penchant for creative storytelling and character interaction. It is a derivative of the original TNG/DeepSeek-R1T-Chimera released in April 2025 and is available exclusively via Chutes and OpenRouter. Characteristics and improvements include: we think it has a creative and pleasant personality; it has a preliminary EQ-Bench3 value of about 1305; it is quite a bit more intelligent than the original, albeit slightly slower; it is much more think-token consistent, i.e. reasoning and answer blocks are properly delineated; and tool calling is much improved. TNG Tech, the model authors, ask that users follow the careful guidelines that Microsoft created for its "MAI-DS-R1" DeepSeek-based model. These guidelines are available on Hugging Face (https://huggingface.co/microsoft/MAI-DS-R1).

3 min read
TNG: R1T Chimera via OpenRouter

Access TNG: R1T Chimera via OpenRouter

TNG-R1T-Chimera is an experimental LLM with a penchant for creative storytelling and character interaction. It is a derivative of the original TNG/DeepSeek-R1T-Chimera released in April 2025 and is available exclusively via Chutes and OpenRouter. Characteristics and improvements include: we think it has a creative and pleasant personality; it has a preliminary EQ-Bench3 value of about 1305; it is quite a bit more intelligent than the original, albeit slightly slower; it is much more think-token consistent, i.e. reasoning and answer blocks are properly delineated; and tool calling is much improved. TNG Tech, the model authors, ask that users follow the careful guidelines that Microsoft created for its "MAI-DS-R1" DeepSeek-based model. These guidelines are available on Hugging Face (https://huggingface.co/microsoft/MAI-DS-R1).

3 min read
Anthropic: Claude Opus 4.5 via OpenRouter

Access Anthropic: Claude Opus 4.5 via OpenRouter

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competitive performance across real-world coding and reasoning benchmarks, and improved robustness to prompt injection. The model is designed to operate efficiently across varied effort levels, enabling developers to trade off speed, depth, and token usage depending on task requirements. It comes with a new parameter to control token efficiency, which can be accessed using the OpenRouter Verbosity parameter with low, medium, or high. Opus 4.5 supports advanced tool use, extended context management, and coordinated multi-agent setups, making it well-suited for autonomous research, debugging, multi-step planning, and spreadsheet/browser manipulation. It delivers substantial gains in structured reasoning, execution reliability, and alignment compared to prior Opus generations, while reducing token overhead and improving performance on long-running tasks.
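
A hedged sketch of what dialing that token-efficiency control down could look like through OpenRouter's chat completions endpoint; the model slug and the exact request shape are assumptions based on the description above, so verify them against the model page and API reference before relying on them.

```typescript
// Sketch: asking Claude Opus 4.5 for a more token-efficient answer via the
// verbosity control described above ("low" | "medium" | "high").
// The model slug and request shape are assumptions.
const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "anthropic/claude-opus-4.5", // assumed slug
    verbosity: "low",                   // trade output length for token efficiency
    messages: [
      { role: "user", content: "What is the likely root cause of this stack trace? ..." },
    ],
  }),
});
console.log((await res.json()).choices[0].message.content);
```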

3 min read
AllenAI: Olmo 3 32B Think via OpenRouter

Access AllenAI: Olmo 3 32B Think via OpenRouter

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong performance on demanding evaluation tasks and highly nuanced conversational reasoning. Developed by Ai2 under the Apache 2.0 license, Olmo 3 32B Think embodies the Olmo initiative’s commitment to openness, offering full transparency across weights, code and training methodology.

3 min read
AllenAI: Olmo 3 7B Instruct via OpenRouter

Access AllenAI: Olmo 3 7B Instruct via OpenRouter

Olmo 3 7B Instruct is a supervised instruction-fine-tuned variant of the Olmo 3 7B base model, optimized for instruction-following, question-answering, and natural conversational dialogue. By leveraging high-quality instruction data and an open training pipeline, it delivers strong performance across everyday NLP tasks while remaining accessible and easy to integrate. Developed by Ai2 under the Apache 2.0 license, the model offers a transparent, community-friendly option for instruction-driven applications.

3 min read
AllenAI: Olmo 3 7B Think via OpenRouter

Access AllenAI: Olmo 3 7B Think via OpenRouter

Olmo 3 7B Think is a research-oriented language model in the Olmo family designed for advanced reasoning and instruction-driven tasks. It excels at multi-step problem solving, logical inference, and maintaining coherent conversational context. Developed by Ai2 under the Apache 2.0 license, Olmo 3 7B Think supports transparent, fully open experimentation and provides a lightweight yet capable foundation for academic research and practical NLP workflows.

3 min read
Google: Nano Banana Pro (Gemini 3 Pro Image Preview) via OpenRouter

Access Google: Nano Banana Pro (Gemini 3 Pro Image Preview) via OpenRouter

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis. The model generates context-rich graphics, from infographics and diagrams to cinematic composites, and can incorporate real-time information via Search grounding. It offers industry-leading text rendering in images (including long passages and multilingual layouts), consistent multi-image blending, and accurate identity preservation across up to five subjects. Nano Banana Pro adds fine-grained creative controls such as localized edits, lighting and focus adjustments, camera transformations, and support for 2K/4K outputs and flexible aspect ratios. It is designed for professional-grade design, product visualization, storyboarding, and complex multi-element compositions while remaining efficient for general image creation workflows.

3 min read
xAI: Grok 4.1 Fast via OpenRouter

Access xAI: Grok 4.1 Fast via OpenRouter

Grok 4.1 Fast is xAI's best agentic tool-calling model, shining in real-world use cases like customer support and deep research. It has a 2M-token context window. Reasoning can be enabled or disabled using the `enabled` boolean of the `reasoning` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens).

3 min read
Google: Gemini 3 Pro Preview via OpenRouter

Access Google: Gemini 3 Pro Preview via OpenRouter

Gemini 3 Pro is Google’s flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window. Reasoning details must be preserved when using multi-turn tool calling; see the docs here: https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks. It delivers state-of-the-art benchmark results in general reasoning, STEM problem solving, factual QA, and multimodal understanding, including leading scores on LMArena, GPQA Diamond, MathArena Apex, MMMU-Pro, and Video-MMMU. Interactions emphasize depth and interpretability: the model is designed to infer intent with minimal prompting and produce direct, insight-focused responses. Built for advanced development and agentic workflows, Gemini 3 Pro provides robust tool-calling, long-horizon planning stability, and strong zero-shot generation for complex UI, visualization, and coding tasks. It excels at agentic coding (SWE-Bench Verified, Terminal-Bench 2.0), multimodal analysis, and structured long-form tasks such as research synthesis, planning, and interactive learning experiences. Suitable applications include autonomous agents, coding assistants, multimodal analytics, scientific reasoning, and high-context information processing.

3 min read
Deep Cogito: Cogito v2.1 671B via OpenRouter

Access Deep Cogito: Cogito v2.1 671B via OpenRouter

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching performance of frontier closed and open models. This model is trained using self play with reinforcement learning to reach state-of-the-art performance on multiple categories (instruction following, coding, longer queries and creative writing). This advanced system demonstrates significant progress toward scalable superintelligence through policy improvement.

3 min read
OpenAI: GPT-5.1 via OpenRouter

Access OpenAI: GPT-5.1 via OpenRouter

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. It uses adaptive reasoning to allocate computation dynamically, responding quickly to simple queries while spending more depth on complex tasks. The model produces clearer, more grounded explanations with reduced jargon, making it easier to follow even on technical or multi-step problems. Built for broad task coverage, GPT-5.1 delivers consistent gains across math, coding, and structured analysis workloads, with more coherent long-form answers and improved tool-use reliability. It also features refined conversational alignment, enabling warmer, more intuitive responses without compromising precision. GPT-5.1 serves as the primary full-capability successor to GPT-5.

3 min read
OpenAI: GPT-5.1 Chat via OpenRouter

Access OpenAI: GPT-5.1 Chat via OpenRouter

GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on harder queries, improving accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is warmer and more conversational by default, with better instruction following and more stable short-form reasoning. GPT-5.1 Chat is designed for high-throughput, interactive workloads where responsiveness and consistency matter more than deep deliberation.

3 min read
OpenAI: GPT-5.1-Codex via OpenRouter

Access OpenAI: GPT-5.1-Codex via OpenRouter

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter; read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level). Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically, providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.
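
For the `reasoning.effort` knob mentioned above, a minimal sketch against OpenRouter's chat completions endpoint might look like the following; the model slug is an assumption, so confirm it on the model page.

```typescript
// Sketch: raising reasoning effort for GPT-5.1-Codex via the `reasoning.effort`
// parameter mentioned above. The model slug is an assumption.
const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "openai/gpt-5.1-codex",  // assumed slug
    reasoning: { effort: "high" },  // spend more deliberation on a large refactor
    messages: [
      { role: "user", content: "Plan a migration of this module from callbacks to async/await: ..." },
    ],
  }),
});
console.log((await res.json()).choices[0].message.content);
```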

3 min read
OpenAI: GPT-5.1-Codex-Mini via OpenRouter

Access OpenAI: GPT-5.1-Codex-Mini via OpenRouter

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex.

3 min read
Kwaipilot: KAT-Coder-Pro V1 (free) via OpenRouter

Access Kwaipilot: KAT-Coder-Pro V1 (free) via OpenRouter

KAT-Coder-Pro V1 is KwaiKAT's most advanced agentic coding model in the KAT-Coder series. Designed specifically for agentic coding tasks, it excels in real-world software engineering scenarios, achieving a 73.4% solve rate on the SWE-Bench Verified benchmark. The model has been optimized for tool-use capability, multi-turn interaction, instruction following, generalization, and comprehensive capabilities through a multi-stage training process, including mid-training, supervised fine-tuning (SFT), reinforcement fine-tuning (RFT), and scalable agentic RL.

3 min read
Kwaipilot: KAT-Coder-Pro V1 via OpenRouter

Access Kwaipilot: KAT-Coder-Pro V1 via OpenRouter

KAT-Coder-Pro V1 is KwaiKAT's most advanced agentic coding model in the KAT-Coder series. Designed specifically for agentic coding tasks, it excels in real-world software engineering scenarios, achieving a 73.4% solve rate on the SWE-Bench Verified benchmark. The model has been optimized for tool-use capability, multi-turn interaction, instruction following, generalization, and comprehensive capabilities through a multi-stage training process, including mid-training, supervised fine-tuning (SFT), reinforcement fine-tuning (RFT), and scalable agentic RL.

3 min read
MoonshotAI: Kimi K2 Thinking via OpenRouter

Access MoonshotAI: Kimi K2 Thinking via OpenRouter

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports a 256K-token context window. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves step-by-step reasoning with tool use, enabling autonomous research, coding, and writing that can persist for hundreds of sequential actions without drift. It sets new open-source benchmarks on HLE, BrowseComp, SWE-Multilingual, and LiveCodeBench, while maintaining stable multi-agent behavior through 200–300 tool calls. With MuonClip optimization on top of its large-scale MoE architecture, it combines strong reasoning depth with high inference efficiency for demanding agentic and analytical tasks.

3 min read
Amazon: Nova Premier 1.0 via OpenRouter

Access Amazon: Nova Premier 1.0 via OpenRouter

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

3 min read
Perplexity: Sonar Pro Search via OpenRouter

Access Perplexity: Sonar Pro Search via OpenRouter

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system, designed for deeper reasoning and analysis. Pricing is based on tokens plus $18 per thousand requests. This model powers the Pro Search mode on the Perplexity platform. Sonar Pro Search adds autonomous, multi-step reasoning to Sonar Pro: instead of a single query-and-synthesis pass, it plans and executes entire research workflows using tools.
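At that rate, the agentic search adds a flat $0.018 per request on top of the usual token charges.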

3 min read
Mistral: Voxtral Small 24B 2507 via OpenRouter

Access Mistral: Voxtral Small 24B 2507 via OpenRouter

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio is priced at $100 per million seconds.
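At the stated rate, that works out to $0.0001 per second, or roughly $0.36 per hour of audio input.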

3 min read