Know if it has been released. Instantly.

PiunikaWeb reported Grok 4.3 beta appearing in the SuperGrok Heavy model picker without an x.ai/news announcement; xAI had not published a model card or API pricing at the time of the first sightings.

Llama 4 Behemoth

by Meta

TBD

MAI Voice 2 Flash

by Microsoft AI

TBD

Recent Releases

Latest versions that shipped

Gemma DiffusionGemma

by Google

Jun 10

Google's open-weight model family built from Gemini research, for developers who need efficient on-device and server-side models.

Release summary: Experimental open Gemma model that generates text via diffusion instead of autoregressive decoding. 26B MoE (3.8B active) built on Gemma 4 and Gemini Diffusion research, released under Apache 2.0. Delivers up to 4x faster token generation on dedicated GPUs (1000+ tok/s on H100, 700+ on RTX 5090) by drafting 256-token blocks in parallel with bi-directional attention. Fits in 18GB VRAM when quantized. Best for speed-critical local workflows like in-line editing, code infilling, and rapid iteration; standard Gemma 4 remains recommended for maximum output quality.

Claude Fable 5

by Anthropic

Jun 9

Anthropic's Mythos-class Claude model with safeguards tuned for general availability, sitting above Opus in capability.

Release summary: Anthropic's first generally available Mythos-class model: state-of-the-art on software engineering, knowledge work, vision, and scientific research, with the largest lead over prior Claude models on long, complex tasks. Ships with safety classifiers that route some cybersecurity, biology, and distillation queries to Opus 4.8 (triggering in under 5% of sessions on average). Available on the Claude API as claude-fable-5, Claude apps, and major cloud platforms at $10/M input and $50/M output.

Claude Mythos 5

by Anthropic

Jun 9

Anthropic's Mythos-class frontier Claude models with reduced safeguards for trusted partners in cybersecurity and life sciences research.

Release summary: Same underlying model as Claude Fable 5 with cyber safeguards lifted for vetted partners. Strongest cybersecurity capabilities of any model in the world at launch, deployed through Project Glasswing as an upgrade to Mythos Preview. Restricted to Glasswing partners initially, with a broader trusted access program planned. API pricing at $10/M input and $50/M output.

MAI Thinking 1

by Microsoft AI

Jun 2

Microsoft AI's flagship reasoning models for math, coding, and enterprise deployment, built from scratch without third-party distillation.

Release summary: Microsoft AI's flagship MoE reasoning model (~35B active, ~1T total parameters) for math, coding, and enterprise workloads. Competitive with Claude Opus 4.6 on SWE-Bench Pro at a smaller inference footprint, preferred to Sonnet 4.6 in blind Surge human evals, with built-in safety guardrails and copyright protection. Available in Microsoft Foundry private preview.

MAI Code 1 Flash

by Microsoft AI

Jun 2

Inference-efficient agentic coding models integrated with GitHub Copilot, VS Code, and the Microsoft stack.

Release summary: 5B-parameter inference-efficient agentic coding model custom-trained for GitHub Copilot and VS Code. Plans and reasons through multi-step coding tasks, supports broad language ecosystems, and is positioned as comparable to Claude Haiku at lower cost. Rolling out in GitHub Copilot in VS Code.

MAI Image 2.5

by Microsoft AI

Jun 2

Text-to-image and image-editing models for photorealistic, design-ready visuals with precise edit control.

Release summary: Text-to-image and image-editing model for photorealistic, design-ready output with fine-grained edit control, reliable text rendering, and branding or product workflows. Includes an ultra-efficient Flash variant; Microsoft reports Arena scores surpassing Nano Banana Pro. Available via MAI Playground and Microsoft Foundry.

MAI Transcribe 1.5

by Microsoft AI

Jun 2

Speech-to-text models for accurate, domain-aware transcription across dozens of languages and noisy audio.

Release summary: Speech-to-text model with 4.9% average WER on FLEURS across 43 languages (automatic detection), contextual biasing for domain terminology, and ~5.7x lower latency than cited competitors. Outperforms Scribe v2, Whisper-large-v3, GPT-4o-Transcribe, and Gemini 3.1 Flash on many language benchmarks. Priced at $0.36 per hour via Azure Speech / Foundry.

MAI Voice 2

by Microsoft AI

Jun 2

Text-to-speech models for expressive, low-latency speech with multilingual voice matching and long-form stability.

Release summary: Multilingual text-to-speech with 15 languages, instant voice matching from short reference clips, expressive emotion control, and stable long-form output for audiobooks, podcasts, and lectures. Built-in guardrails require authorized, consented voices. Priced at $0.22 per 1M characters via MAI Playground and Azure Speech.

Microsoft Frontier Tuning 1

by Microsoft AI

Jun 2

Private reinforcement-learning tuning for MAI models on your workflows and data, deployable in Microsoft Foundry or Copilot.

Release summary: Microsoft's private reinforcement-learning tuning service for MAI models on customer workflows, data, and M365 context. Models train in your environment with self-serve, developer, or co-create paths; early adopters report up to 10x efficiency versus general frontier models (e.g. Excel-tuned MAI vs GPT 5.4). Includes a Mayo Clinic healthcare co-development partnership.

Mellum 2

by JetBrains

Jun 1

JetBrains' focal models for software engineering: code completion, routing, RAG, and agentic workflows with open weights.

Release summary: 12B-parameter Mixture-of-Experts focal model (2.5B active per token) for routing, RAG, sub-agents, and private deployments. Ships open from day one with base, instruct, and thinking checkpoints under Apache 2.0.

Claude Opus 4.8

by Anthropic

May 28

Anthropic's most capable model for complex reasoning, coding, and agentic tasks.

Release summary: Upgrade to Anthropic's Opus class with stronger performance across coding, agentic tasks, and professional work, plus improved consistency for long-running tasks. Adds effort control in claude.ai and Cowork, dynamic workflows in Claude Code (research preview) for large parallel subagent runs, and a fast mode at 2.5x speed priced at one-third of the rate on prior Opus models. Same standard API pricing as Opus 4.7 ($5/M input, $25/M output). Available via the Claude API as claude-opus-4-8.

Gemini 3.5 Flash

by Google

May 19

Google's multimodal AI model series built by DeepMind.

Release summary: Gemini 3.5 Flash is Google's agentic and coding-focused Flash model: frontier-class scores on Terminal-Bench 2.1 (76.2%), GDPval-AA, and MCP Atlas at roughly 4x the output tokens per second of other frontier models.

Qwen 3.7-Max

by Alibaba

May 19

Alibaba's Qwen frontier models for agents, coding, and long-horizon automation.

Release summary: Qwen3.7-Max agent foundation with 1M-token context, long-horizon tool use, and frontier coding and office automation scores.

Composer 2.5

by Cursor

May 18

Cursor's agentic coding model that powers Composer in the IDE, built for long-horizon programming tasks.

Release summary: Cursor's updated frontier coding model for Composer, trained with scaled RL, harder synthetic tasks, and targeted textual feedback. Strong gains over Composer 2 on Terminal-Bench 2.0 (69.3%), SWE-bench Multilingual (79.8%), and CursorBench v3.1 harder tasks (63.2%). Same standard pricing as Composer 2 ($0.50/M input, $2.50/M output); a faster default variant costs more ($3.00/M input, $15.00/M output). Double usage for the first week after launch.

Mistral Medium 3.5

by Mistral

Apr 29

Mistral's flagship dense models for instruction, reasoning, and agentic workloads.

Release summary: 128B dense Mistral flagship with 256K context, multimodal inputs, and configurable reasoning_effort for chat versus agentic coding.

DeepSeek V4-Pro

by DeepSeek

Apr 24

DeepSeek's advanced AI model known for coding capabilities.

Release summary: DeepSeek V4-Pro flagship with frontier reasoning, long-context coding, and competitive scores on agentic engineering benchmarks.

GPT 5.5

by OpenAI

Apr 23

OpenAI's flagship large language model series.

Release summary: OpenAI's smartest general-purpose frontier model for agentic coding, computer use, and knowledge work. State-of-the-art on Terminal-Bench 2.0 (82.7%), SWE-Bench Pro (58.6%), Expert-SWE (73.1%), GDPval (84.9%), OSWorld-Verified (78.7%), and BrowseComp (84.4%). Matches GPT-5.4 per-token latency while using fewer tokens on Codex tasks. Rolling out in ChatGPT and Codex; API pricing at $5/M input and $30/M output with 1M context.

Kimi K2.6

by Moonshot AI

Apr 21

Moonshot AI's open-weight model series for coding, agents, and long-horizon tasks.

Release summary: Open-weight coding and agent model from Moonshot AI. Reports SOTA among open models on Humanity's Last Exam with tools (54.0%) and SWE-Bench Pro (58.6%), plus strong scores on SWE-bench Multilingual (76.7%), BrowseComp (83.2%), Toolathlon (50.0%), CharXiv with Python (86.7%), and MathVision with Python (93.2%). Positioned for GPT-5.4-class coding at much lower cost than closed frontier APIs, with long-horizon runs (4,000+ tool calls, 12+ hours), large agent swarms (300 parallel sub-agents, 4,000 steps per run), and multimodal front-end capabilities. Available on kimi.com (chat and agent mode); Kimi Code at kimi.com/code targets production workflows.

MiniMax M2 M2.7

by MiniMax

Mar 18

MiniMax's agentic coding and reasoning model family.

Release summary: MiniMax-M2.7 improves end-to-end project delivery, log analysis, and complex agent harnesses, with ~56% on SWE-Pro and strong Terminal-Bench 2 scores among open models.

MiniMax Text M2.7

by MiniMax

Mar 18

MiniMax's general-purpose text models for chat and tool use.

Release summary: MiniMax M2.7 text API model focused on software engineering, office suites, and complex agent skills with strong SWE-Pro and GDPval-AA scores.

Mistral Small 4

by Mistral

Mar 16

Efficient Mistral models tuned for fast inference and everyday assistant tasks.

Release summary: Efficient Mistral Small 4 models for low-latency assistants, on-device style deployments, and cost-sensitive APIs.

Claude Sonnet 4.6

by Anthropic

Feb 17

Anthropic's balanced model offering strong performance at a lower cost than Opus.

Release summary: Full upgrade across coding, computer use, long-context reasoning, agent planning, and knowledge work. Features a 1M-token context window (beta) and significant improvements in consistency and instruction following. Users prefer it to Sonnet 4.5 roughly 70% of the time and to Opus 4.5 59% of the time, approaching Opus-level intelligence at $3/$15 per million tokens.

GLM 5

by Zhipu AI

Feb 11

Zhipu AI's open-weight large language model series.

Release summary: Zhipu AI's open-weight 744B-parameter MoE model (40B active per token) trained on 28.5T tokens. First open model to score 50+ on the Artificial Analysis Intelligence Index v4.0, outperforming Gemini 3.0 Pro and GPT-5.2 across multiple benchmarks. Released under the MIT License.

Devstral 2

by Mistral

Dec 9

Mistral's coding-focused model line for agents, IDEs, and software engineering tasks.

Release summary: Devstral 2 coding model for IDE agents and the Mistral Vibe CLI with stronger repository-scale refactors than Devstral 1.

Grok 4.1 Fast

by xAI

Nov 19

xAI's frontier Grok model family for reasoning, coding, and real-time knowledge.

Release summary: Faster, lower-latency Grok 4.1 variant for real-time assistants and high-throughput API workloads.

Claude Haiku 4.5

by Anthropic

Oct 15

Anthropic's fastest Claude model line, tuned for low latency and high throughput on everyday tasks.

Release summary: Haiku 4.5 targets near-frontier coding and agent quality with a much smaller compute footprint than Sonnet-class models.

Llama 4 Scout

by Meta

Apr 5

Meta's open Llama multimodal model family for developers and researchers.

Release summary: Meta Llama 4 Scout multimodal MoE model (17B active) with native image and video understanding and a 10M-token context window.
