What Is GPT-5.4: New Features & What Changed

Hey fellow AI tinkerers — if you're the kind of person who watches model releases the way others watch sports scores, this one dropped fast. GPT-5.4 went live on March 5, 2026, literally two days after GPT-5.3 Instant. I've been tracking the pre-release signals since late February (yes, those Codex pull request leaks were real), and now that it's out, here's what actually changed — and what it means for how you work.

I'm Hanks. I test AI tools inside real workflows, not demos. That's the only lens worth using here.


What Is GPT-5.4?

GPT-5.4 is OpenAI's latest frontier model, officially described as their "most capable and efficient frontier model for professional work." It's not a standalone model family — it's the newest iteration inside the GPT-5 series, which OpenAI has been iterating on rapidly since the original GPT-5 launched in August 2025.

Version context: where it sits in the GPT-5 family

The GPT-5 line has moved fast. Here's the rough timeline:

| Version | Released | Primary Role |
|---|---|---|
| GPT-5 | Aug 2025 | Original flagship, replaced GPT-4o as default |
| GPT-5.1 | Sept 2025 | Warmer tone, less sycophantic |
| GPT-5.2 | Late 2025 | Long-context reasoning, spreadsheet/slide generation |
| GPT-5.3 Instant | Mar 3, 2026 | Faster everyday conversations, fewer overrefusals |
| GPT-5.4 | Mar 5, 2026 | Coding + reasoning + computer use unified |

GPT-5.4 is the first model to bring together frontier coding (absorbed from GPT-5.3-Codex), deep reasoning (from the Thinking series), and native computer use into a single package. That consolidation is what justifies the version bump from 5.3 to 5.4, rather than another Instant patch.

When it was released and how to access it

GPT-5.4 rolled out on March 5, 2026 across ChatGPT, the API, and Codex. Per the official ChatGPT release notes:

  • GPT-5.4 Thinking — Available to ChatGPT Plus ($20/mo), Team, and Pro users. Replaces GPT-5.2 Thinking in the model picker.
  • GPT-5.4 Pro — ChatGPT Pro ($200/mo) and Enterprise plans only.
  • Free users — Get GPT-5.4 only when auto-routed by the system. No manual model selection.
  • API + Codex — Available now with up to 1M token context. Listed on OpenRouter at $2.50/1M input and $20.00/1M output (direct OpenAI API billing may vary by tier).

GPT-5.2 Thinking stays available as a legacy option for paid users until June 5, 2026.


What's New in GPT-5.4

Key capability changes vs previous version

Three things stand out:

  1. Native computer use — first time in a general-purpose model

This is genuinely new. GPT-5.4 is OpenAI's first general-purpose model with native computer-use capabilities baked in, not added as an external layer. In Codex and the API, the model can operate a computer, move across applications, and run multi-step workflows without human hand-holding between steps. This is the architecture shift that makes agents actually useful rather than impressive in demos.
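To make "multi-step workflows without hand-holding" concrete, here's a purely illustrative agent loop in Python. None of this is OpenAI's actual computer-use API — `plan_next_step` and `execute` are invented stand-ins for the model call and the OS action, just to show the shape of the pattern: plan, act, repeat, with no human approval gate between steps.

```python
# Illustrative only: a stub agent loop showing the shape of native
# computer use. plan_next_step / execute are invented stand-ins,
# not real OpenAI APIs.

def plan_next_step(goal, history):
    """Pretend model call: pick the next action toward the goal."""
    steps = ["open_spreadsheet", "extract_totals", "paste_into_report"]
    done = len(history)
    return steps[done] if done < len(steps) else None

def execute(action):
    """Pretend OS action: in reality this clicks, types, or runs apps."""
    return f"{action}: ok"

def run_agent(goal):
    history = []
    while (action := plan_next_step(goal, history)) is not None:
        history.append(execute(action))  # no human approval between steps
    return history

print(run_agent("monthly report"))
```

The point of "native" is that this loop lives inside the model's own tool use rather than in your orchestration code.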

  2. 1M token context window in the API

The context ceiling jumped to one million tokens. Important caveat: OpenAI charges double per token once input exceeds 272K tokens, so that ceiling comes with a real cost cliff. Budget accordingly before you start pumping in entire codebases.
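Here's what that cost cliff looks like in practice, using the OpenRouter-listed rate of $2.50/1M input tokens from above. The assumption that tokens beyond 272K bill at exactly double that rate is my reading of "charges double" — check your own billing tier before trusting these numbers.

```python
# Sketch of the input-cost cliff: $2.50/1M input tokens up to 272K,
# then (assumed) double that rate beyond the threshold.

BASE_RATE = 2.50 / 1_000_000   # $ per input token up to the cliff
CLIFF = 272_000                # threshold where the doubled rate kicks in

def input_cost(tokens: int) -> float:
    cheap = min(tokens, CLIFF)
    pricey = max(tokens - CLIFF, 0)
    return cheap * BASE_RATE + pricey * BASE_RATE * 2

print(f"${input_cost(200_000):.2f}")    # under the cliff: $0.50
print(f"${input_cost(1_000_000):.2f}")  # full 1M context:  $4.32
```

Filling the full window costs nearly 9x a 200K-token call, not the 5x you'd expect from token count alone.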

  3. Tool Search for agentic workflows

A new API feature that helps agents find and use the right tools across large ecosystems of connectors — without pre-defining every tool call upfront. For developers building multi-tool pipelines, this is the kind of thing that quietly replaces a lot of boilerplate prompt engineering.
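OpenAI hasn't published the exact shape of Tool Search in detail I can verify, so here's a guess at the boilerplate it replaces: instead of hard-coding every tool definition into the prompt, the agent queries a registry and surfaces only the relevant tools. The registry, tool names, and matching logic below are all invented for illustration — real retrieval would be semantic, not keyword overlap.

```python
# Hypothetical sketch of the manual tool-selection boilerplate that a
# feature like Tool Search would replace. Tool names and matching
# logic are invented for illustration.

TOOL_REGISTRY = {
    "create_invoice": "billing: create a draft invoice for a customer",
    "query_crm":      "sales: look up a customer record in the CRM",
    "send_slack":     "messaging: post a message to a Slack channel",
    "run_sql":        "data: execute a read-only SQL query",
}

def search_tools(task: str, top_k: int = 2) -> list[str]:
    """Naive word-overlap match standing in for semantic retrieval."""
    words = set(task.lower().split())
    scored = [
        (len(words & set(desc.split())), name)
        for name, desc in TOOL_REGISTRY.items()
    ]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]

print(search_tools("look up a customer and create an invoice"))
```

With dozens of connectors, letting the model do this lookup itself is what removes the pre-defined tool-call plumbing.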

Hallucination numbers (from OpenAI)

Individual claims are 33% less likely to be false compared to GPT-5.2. Full responses are 18% less likely to contain any errors. On their internal knowledge-work benchmark, GPT-5.4 matched or exceeded industry professionals in 83% of comparisons across 44 occupations. These are OpenAI's own numbers, so treat them as a directional signal, not gospel.

Speed, context window, and pricing changes

The efficiency story is real. GPT-5.4 uses significantly fewer tokens to solve problems than GPT-5.2 — up to 47% fewer on some tasks. That matters for API cost calculations even though the output price per token is higher.

For ChatGPT users: GPT-5.4 Thinking now gives you an upfront reasoning plan before it commits to the full output. You can course-correct mid-response. That's a genuine workflow change — less waiting, more steering.

What stayed the same

  • Context windows in ChatGPT (non-API) are unchanged from GPT-5.2 Thinking
  • GPT-5.3 Instant remains the default for everyday conversational queries — GPT-5.4 doesn't replace it
  • The router system still runs underneath everything in ChatGPT; GPT-5.4 isn't the only model you'll hit

GPT-5.4 vs GPT-5 and GPT-4o — Quick Comparison

| Model | Speed | Context (API) | Best For | ChatGPT Access |
|---|---|---|---|---|
| GPT-4o | Fast | 128K | Multimodal, general tasks | Free + Plus (legacy) |
| GPT-5 | Moderate | 128K | Broad capability, coding, math | All users |
| GPT-5.2 Thinking | Slower | 256K | Long-doc analysis, spreadsheets | Plus+ (legacy until June 2026) |
| GPT-5.3 Instant | Very fast | 128K | Everyday conversation, Q&A | All ChatGPT users |
| GPT-5.4 Thinking | Fast | 1M (API) | Coding, agents, professional work | Plus, Team, Pro |
| GPT-5.4 Pro | Slower | 1M (API) | Maximum quality, complex tasks | Pro + Enterprise |

GPT-4o isn't going anywhere immediately. OpenAI has no current deprecation plans for GPT-4o in the API. If you're cost-sensitive and don't need reasoning or computer use, it remains a solid option.


Who Should Actually Use GPT-5.4

Use cases where the upgrade matters

GPT-5.4 was optimized for a specific user profile. If you're in one of these, the upgrade is worth testing:

Developers building agents — Tool Search + native computer use + 1M token context in a single model is new. If you're orchestrating multi-step workflows or building on Codex, this is the model to benchmark against your current setup.

Finance and legal work — OpenAI's internal investment banking benchmark jumped from 43.7% (GPT-5) to 88% (GPT-5.4 Thinking). On Mercor's APEX-Agents benchmark for law and finance professional skills, GPT-5.4 led the field. For long-horizon deliverables — financial models, contract analysis, investor memos — these numbers are worth taking seriously.

Anyone currently using Thinking mode — The upfront plan feature changes the interaction model in a way that saves real time. You're not waiting for a full output to discover the model went in the wrong direction.

Who's better off staying on an earlier model

If your use case is primarily conversational Q&A, quick drafts, or moderate-length documents, GPT-5.3 Instant is faster and more than capable. The GPT-5.4 improvements are concentrated in professional and agentic work — you won't feel them in a normal ChatGPT conversation.

API users: do the math carefully. GPT-5.4 at $20.00/1M output tokens costs double GPT-5 at $10.00/1M. The token efficiency gains are real, but they don't fully offset that gap across all task types.
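Here's the back-of-envelope version of that math, using the article's figures: $10 vs $20 per 1M output tokens, and up to 47% fewer tokens on GPT-5.4. The 47% is OpenAI's best case, so the task sizes and the "modest" scenario below are my hypothetical inputs, not measurements.

```python
# Back-of-envelope output-cost comparison using the article's figures:
# GPT-5 at $10/1M output vs GPT-5.4 at $20/1M, with GPT-5.4 using
# fewer tokens per task. 47% fewer is OpenAI's best case, not typical.

def output_cost(tokens: int, rate_per_m: float) -> float:
    return tokens / 1_000_000 * rate_per_m

TASK_TOKENS = 10_000  # hypothetical task size on GPT-5

gpt5   = output_cost(TASK_TOKENS, 10.00)
best   = output_cost(int(TASK_TOKENS * 0.53), 20.00)  # 47% fewer tokens
modest = output_cost(int(TASK_TOKENS * 0.80), 20.00)  # 20% fewer tokens

print(f"GPT-5:            ${gpt5:.3f}")    # $0.100
print(f"GPT-5.4 (best):   ${best:.3f}")    # $0.106 -- still slightly pricier
print(f"GPT-5.4 (modest): ${modest:.3f}")  # $0.160
```

Even at the best-case 47% reduction, GPT-5.4 comes out marginally more expensive per task on output alone — so the upgrade has to pay for itself in quality or capability, not raw cost.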


Common Questions & Misconceptions

Is GPT-5.4 free?

Partially. Free ChatGPT users can be auto-routed to GPT-5.4 responses, but can't select it manually. Choosing GPT-5.4 Thinking yourself requires at minimum a Plus subscription ($20/month).

Does it replace GPT-5?

No. GPT-5 remains available in the API with no announced deprecation timeline. GPT-5.4 specifically replaces GPT-5.2 Thinking in the ChatGPT model picker for Plus+ users. Earlier models aren't going away — at least not yet.

What happened to GPT-5.3?

GPT-5.3 existed as two things: GPT-5.3-Codex (specialized coding model) and GPT-5.3 Instant (launched March 3, 2026 for everyday chat). GPT-5.4 absorbs the coding capabilities of 5.3-Codex and layers reasoning and computer use on top. GPT-5.3 Instant and GPT-5.4 occupy different lanes — they're not competing for the same slot.


Verdict

GPT-5.4 is a meaningful upgrade if your work involves agents, coding, or high-stakes professional documents. The native computer use, Tool Search, and 1M token context consolidate what previously required juggling multiple models. The 47% token efficiency gains on some tasks also improve the economics compared to earlier reasoning models.

For everyday ChatGPT use, the delta over GPT-5.3 Instant is small. Most users won't notice a difference in standard Q&A or writing tasks.

The version to watch closely is GPT-5.4 Pro. If OpenAI's benchmark claims on complex professional work hold up outside their own test environment, that's a real competitive move against Claude and Gemini in enterprise workflows.


GPT-5.4 is the first OpenAI model that can actually operate software and run multi-step tasks like an agent. The interesting question isn’t whether the model is smarter — it’s whether those capabilities turn into something you can actually execute in daily work.

At Macaron, we built our personal AI agent for exactly this layer: turning a conversation into structured actions, tools, and repeatable workflows. If you want to see how an AI agent behaves outside a demo, start with a real task and try Macaron free at macaron.im.

Related Articles:

What Is GPT-5.3 Codex? A Practical Introduction for Developers (2026)

How to Use GPT-5.3 Codex for Long-Running Coding Tasks

How Developers Use GPT-5.3 Codex as a Coding Agent

When NOT to Use GPT-5.3 Codex (And What to Use Instead)

GPT-5.3 Codex vs Claude Opus 4.6: A Neutral "Choose-by-Task" Guide (No Rankings)

Hey, I’m Hanks — a workflow tinkerer and AI tool obsessive with over a decade of hands-on experience in automation, SaaS, and content creation. I spend my days testing tools so you don’t have to, breaking down complex processes into simple, actionable steps, and digging into the numbers behind “what actually works.”

Apply to become Macaron's first friends