The AI Tasting Menu

Every Model. Every Drop.
One Menu.

A living tracker of the top LLM providers — who shipped what, what changed, and what it means for your stack. Updated as models drop. No fluff, just the good stuff.

🍽 On the Menu

Nine providers. Dozens of models. One place to track them all.

Hot
OpenAI
GPT‑5 · GPT‑5.2 · GPT‑5.3‑Codex · Instant · Thinking · Pro

The kitchen that started the fire. GPT‑5.2 dropped December 11, 2025 as three courses: Instant for quick bites, Thinking for the slow braise, and Pro for the full degustation. By OpenAI's reported numbers, it beats human experts on 70.9% of professional tasks at 11x the speed and under 1% of the cost. 400K context window. 100% on AIME 2025 math. August 2025 knowledge cutoff across all variants. Then GPT‑5.3‑Codex landed February 5, 2026, billed as the most capable agentic coding model yet and the first model instrumental in creating itself. Still the model everyone benchmarks against, and still the kitchen where all the heat comes from.

Latest drop → GPT‑5.3‑Codex
Hot
Anthropic
Claude Opus 4.6 · Sonnet 4.5 · Haiku 4.5

The safety-first chef who also happens to be a virtuoso. Claude Opus 4.6 arrived February 2026 as the most intelligent model in the Claude lineup, excelling at coding, planning, debugging large codebases, and financial analysis. Adaptive thinking with effort controls from low to max, context compaction for marathon sessions, and the kind of careful reasoning that reads your entire codebase before touching a single line. Sonnet 4.5 holds the coding crown for speed-to-quality ratio. Haiku 4.5 is the amuse-bouche: fast, cheap, and surprisingly sharp. The kitchen that measures twice and cuts once.

Latest drop → Claude Opus 4.6
Hot
Google
Gemini 3 · Pro · Flash · Deep Think

Google's most capable model yet, and it arrived like a five-alarm brunch rush. Gemini 3 Pro launched November 18, 2025 at #1 on LMArena (1501 Elo), with PhD-level reasoning, 1M-token context, and native multimodal understanding across text, images, audio, and video. Flash is the espresso shot: next-gen intelligence at lightning speed, now the default for all users. Deep Think is the sommelier's pairing: an extended-reasoning mode for Ultra subscribers that scores 93.8% on GPQA Diamond. Quietly powering Search, Workspace, Chrome, and Android while loudly winning benchmarks. Google Antigravity dropped alongside it as a full agentic development platform.

Latest drop → Gemini 3 Pro
Hot
Meta
Llama 4 · Maverick · Scout · (Behemoth in training)

The open-source heavyweight that sets the whole buffet free. Llama 4 landed April 5, 2025 as Meta's first Mixture-of-Experts architecture, natively multimodal from day one. Scout offers a 10M-token context window and fits on a single H100 GPU. Maverick runs 128 experts across 400B parameters with a 1M context window, beating GPT-4o and Gemini 2.0 Flash on multimodal benchmarks at a fraction of the cost. Both models speak 12 languages. Behemoth (288B active, ~2T total) is still training in the kitchen, previewed as one of the smartest LLMs in the world. Free to download, fine-tune, and deploy. The potluck that feeds everyone.

Latest drop → Llama 4 Maverick
Hot
DeepSeek
DeepSeek‑V3 · R1 · V3.1 · V3.2

The dark horse that showed up to the Michelin dinner in sneakers and got a star. DeepSeek R1 exploded in January 2025, matching OpenAI o1 on reasoning at a fraction of the compute. R1-0528 followed in May with 685B parameters, approaching o3 and Gemini 2.5 Pro performance while boosting AIME 2025 accuracy from 70% to 87.5%. V3.1 arrived August 2025 as a hybrid that switches between chain-of-thought reasoning and direct answers in a single model. MIT-licensed, open-source, and trained on 2,048 GPUs while competitors burn through 16,000+. The $294K training run that made $100M budgets sweat.

Latest drop → DeepSeek V3.2
Hot
Mistral
Mistral Large 3 · Ministral 3B · 8B · 14B

Europe's answer to Silicon Valley, and the answer is "hold my croissant." Mistral 3 launched December 2, 2025 — all Apache 2.0, all open. Large 3 is a 675B-parameter MoE beast with 41B active parameters and a 256K context window, trained from scratch on 3,000 H200 GPUs. Ranks #2 among open-source non-reasoning models on LMArena. The Ministral 3 family (3B, 8B, 14B) covers everything from edge devices to enterprise — each available as base, instruct, and reasoning variants. Multilingual across 40+ languages. Runs on everything from Blackwell NVL72 racks to Jetson robots to RTX laptops. The boulangerie that delivers at every scale.

Latest drop → Mistral Large 3
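
Tasting note on the math: for a Mixture-of-Experts model, "active parameters" is the slice of the network that actually runs for each token. A minimal sketch of that arithmetic, using only the Large 3 figures quoted above (the function name is ours, purely illustrative):

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Share of a MoE model's parameters used per token (both in billions)."""
    if total_params_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_params_b / total_params_b


# Figures from the Mistral Large 3 blurb: 41B active out of 675B total.
share = active_fraction(41, 675)
print(f"Active per token: {share:.1%} of the full model")
```

That works out to about 6% of the weights doing the cooking on any given token, which is why a 675B model can serve at closer to the compute cost of a much smaller dense one. Memory is another story: all 675B still have to be loaded.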
Hot
xAI
Grok 4.1 · Fast · Thinking

The wildcard chef who traded the white coat for a leather jacket — and still cooks better than half the kitchen. Grok 4.1 dropped November 17, 2025 and immediately seized #1 on LMArena at 1483 Elo in Thinking mode — a 31-point margin over the nearest non-xAI model. Its non-thinking mode alone outscores every other model's full reasoning config. Trained on 100x more compute than predecessors using the Colossus supercomputer. Real-time X/Twitter data integration, tool-calling agents, and an emotional intelligence score that tops EQ-Bench3. Oh, and SpaceX just acquired xAI on February 2, 2026. The kitchen just got rocket fuel.

Latest drop → Grok 4.1
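
Tasting note on the math: what does a 31-point Elo margin actually buy? A rough sketch, assuming the leaderboard's scores behave like classic Elo ratings (logistic curve, base 10, scale 400). That assumption and the function name are ours, not something LMArena or xAI states here:

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected head-to-head win rate for the higher-rated side.

    Assumes the standard Elo logistic curve; rating_gap is the
    leader's rating minus the opponent's rating.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


margin = 31  # the lead quoted in the Grok 4.1 blurb
win_rate = elo_expected_score(margin)
print(f"A {margin}-point lead implies an expected win rate of about {win_rate:.0%}")
```

Under that assumption, a 31-point lead means winning roughly 54 out of 100 head-to-head votes. A real edge, but closer to a coin flip with a thumb on the scale than a knockout.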
Hot
Perplexity
pplx-api · Sonar · Sonar Pro · Sonar Reasoning Pro · Deep Research

Search meets LLM — the Google alternative for people who want answers, not ten blue links. Sonar, built on Llama 3.3 70B, leads the SimpleQA factuality benchmark with an F-score of 0.858. Sonar Pro handles multi-step research queries with double the citations. Reasoning Pro adds chain-of-thought for complex analysis. Deep Research automates exhaustive multi-source investigations. Runs at 1,200 tokens/second on Cerebras inference. The Comet browser shipped as a native AI-first web experience. Real-time citations, grounded answers, and the sous chef that always checks its sources.

Latest drop → Sonar Pro
Hot
Midjourney
V7 · Niji 7 · Video

The image king upgraded the entire palace. V7 launched April 4, 2025 as a complete architectural rebuild — CEO David Holz called it "a totally different architecture." Smarter prompt understanding, photorealistic textures, and personalization profiles that learn your aesthetic from 200 image ratings. Draft Mode generates at 10x speed and half cost. Video generation arrived June 2025 for 5–21 second clips. Niji 7 dropped January 9, 2026 with a major coherency boost for anime and illustration. Voice-driven prompting lets you talk your way to art. Not an LLM, but the model every creative team reaches for first.

Latest drop → V7
· · ·

🍳 How We Taste

Our not-so-secret recipe for reviewing AI models. Three simple ingredients.

🔬
We Run the Models

Real prompts, real tasks, real results. No synthetic benchmarks. We test the way you actually use these tools — coding, writing, reasoning, and creating.

📡
We Track the Drops

Every model release, pricing change, and API update — tracked in real time. When something ships, you'll know what it means within 24 hours.

🚫
We Skip the Hype

No breathless "this changes everything" takes. Just clear, honest breakdowns of what's actually useful and what's just a press release.

· · ·

🔥 Latest Drops

The freshest AI model news, straight from the kitchen. Updated as things ship.

📬 Never Miss a Drop

Get the latest AI model releases, comparisons, and tasting notes delivered straight to your inbox. No spam, just the good stuff.

AI Company Tracker

What the Big AI Companies Are Cooking Up.

