On June 28, Elon Musk announced that Grok 4.5 — xAI's newest model — is in private beta at SpaceX and Tesla. The specs he shared are real news: it's the first model built on xAI's new V9 foundation, at roughly 1.5 trillion parameters — about three times the size of the architecture behind the Grok that answers questions on X today — with coding data from Cursor folded into supplemental training. Then came the line built for headlines: early evals show performance “close to, perhaps exceeding Opus.”
Here's what you can independently verify about that claim: nothing. There's no system card, no public benchmark, no pricing, and no release date. The evidence is internal evaluations at two companies Musk controls, plus one early-access developer calling it “similar to Opus”: an anecdote, not a leaderboard. The model anyone can actually buy is still Grok 4.3, public since April 30 at $1.25 per million input tokens and $2.50 per million output tokens, with a 1-million-token context window.
Grok 4.5 joins a crowded genre this summer: the announced-but-unavailable frontier model. OpenAI's GPT-5.6 family is gated behind a government-visible preview of roughly 20 vetted partners. Meta's “Watermelon” exists as a claim from a closed briefing. And Grok 5 — the ~6-trillion-parameter flagship training on Colossus 2 in Memphis — has slipped from late 2025 to Q1, to Q2, and is now a Q3 hope at best. The gap between “exists” and “available” is where the marketing lives.
The tell is the training data
Strip out the Opus talk and one detail carries real signal: Cursor. xAI supplemented Grok 4.5's training with data from the most popular AI coding environment, a deliberate shot at the coding market where Anthropic and OpenAI earn their margins — and where prices are already collapsing. Pair that with the deployment strategy: SpaceX's aerospace workflows and Tesla's vehicle software are live testbeds harder than any benchmark. Dogfooding at industrial scale is a structural advantage no leaderboard captures. Musk says from-scratch models will now ship monthly through year-end — a cadence no other lab has publicly committed to.
Our take: An unbenchmarked model doesn't beat anything; it can only out-tweet it. Vendor evals flatter the vendor, so “perhaps exceeding Opus” is an aspiration until a third party can reproduce the scores. But don't dismiss the setup: a 3x-scaled foundation, Cursor-grade coding data, and two industrial companies serving as a test harness add up to a serious effort. If Grok 4.5 ships publicly anywhere near its claims at anything like Grok 4.3's prices, the coding-model price war gets a third front, and that's the part that hits your API bill.
What to watch
- Release timing. xAI's pattern is to launch on SuperGrok and X first, with the API following within days. If the beta follows form, public access lands in weeks, but xAI has missed every Grok 5 window it set.
- Independent numbers. A system card, LMArena, or SWE-bench results would turn the Opus claim into something checkable. Until then it's unverified.
- Pricing. Grok 4.3 sits at $1.25 per million input tokens and $2.50 per million output tokens, far below frontier rivals. Where 4.5 lands decides whether this is a price event or just a press event.
- The monthly cadence. A new from-scratch model every month through Q4 is the boldest shipping promise in AI. Watch whether it survives contact with reality.
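
To put the pricing point in concrete terms, here is a minimal sketch of the arithmetic behind an API bill at Grok 4.3's published rates. The workload figures (200 million input tokens and 40 million output tokens a month) and the function name are hypothetical, chosen only for illustration. Grok 4.5 pricing has not been announced, so swap in its rates once they exist.

```python
# Published Grok 4.3 rates quoted in this article, in USD per million tokens.
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 2.50


def monthly_cost_usd(input_tokens, output_tokens,
                     input_rate=INPUT_USD_PER_MILLION,
                     output_rate=OUTPUT_USD_PER_MILLION):
    """Return the bill in USD for a given token volume at per-million rates."""
    input_cost = input_tokens / 1_000_000 * input_rate
    output_cost = output_tokens / 1_000_000 * output_rate
    return input_cost + output_cost


# Hypothetical workload (an assumption, not a figure from xAI):
# 200M input tokens and 40M output tokens per month.
cost = monthly_cost_usd(200_000_000, 40_000_000)
print(f"${cost:,.2f}")  # prints $350.00
```

Passing a different `input_rate` and `output_rate` lets you compare the same workload against any other model's published prices.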
