Six weeks. That is how long it took OpenAI to go from GPT-5.4 to GPT-5.5. I know because I just used it in Codex to build a one-page website with a flag that moves using real physics simulation, and the thing just worked. No babysitting. No 15 rounds of back-and-forth. I described what I wanted, it planned the build, handled the implementation, and handed me something I could actually use. That experience tells you more about where AI is heading than any benchmark table.
GPT-5.5 dropped on April 23, 2026. If you blinked, you might have missed it, because OpenAI has been shipping at a pace that makes it hard to keep up. This is not the third upgrade of the year. By my count, it is closer to the sixth model in the GPT-5 family since the original launched in August 2025. The release cadence itself is the story, as much as the model.
What GPT-5.5 Actually Is

Let me clear something up first. GPT-5.5 is not just another patch on GPT-5. OpenAI describes it as the first fully retrained base model since GPT-4.5, meaning the architecture, training data, and objectives were reworked from the ground up. Every GPT-5.x model between them, versions 5.1 through 5.4, was a post-training iteration sitting on top of the same base. GPT-5.5 is different at a structural level.
The codename internally was “Spud,” which is either endearing or completely at odds with how seriously everyone is treating this release. The public positioning is “a new class of intelligence for real work,” and for once that kind of language is not entirely hollow.
What makes GPT-5.5 different is not raw intelligence in the traditional sense. It is the shift from reactive model to proactive agent. Previous models were very good at answering questions when you asked them precisely. GPT-5.5 is built to handle what OpenAI calls “messy, multi-part tasks.” You give it a vague goal, it figures out the plan, uses whatever tools it needs, checks its own work along the way, and keeps going until the task is finished. Greg Brockman, OpenAI’s president, put it plainly during the launch briefing: “What is really special about this model is how much more it can do with less guidance.”
That matches my experience in Codex exactly.
I Built an Interactive Cloth Lab with One Prompt
I wanted a self-contained interactive physics demo: one HTML file, no external libraries, raw WebGL rendering, a cloth flag with the August Wheel branding, and a full control panel with sliders for turbulence, stiffness, and sheen. The kind of thing that would have taken a developer a full day to build from scratch.
I first came across this idea from a tweet by @chetaslua, who posted a paper physics website they built with GPT-5.5 in Codex, one shot, with wind effects and full physics interaction. The reaction was instant: HOLLLLY SHIIIIIT. I saw it and immediately thought, let me try this myself and see how repeatable it actually is. So I built my own version, the Interactive Cloth Lab, with my own prompt and my own brief. Here is exactly what I used.
PROMPT
Create a complete, self-contained interactive demo as exactly one HTML file. Build a polished one-page “Interactive Cloth Lab” website with a dark cinematic UI, inspired by a high-end physics showcase. The page should feature a large wind-blown cloth flag/banner with text mapped onto it. The cloth should feel physical, dimensional, and satisfying to watch.
Large text on the cloth: AUGUST WHEEL. Smaller text below it: By Augustine Osei. Additional text below that: made with GPT 5.5. Footer text: Interactive Cloth Lab. Follow my blog for AI and automation content on augustwheel.com.
Technical rules: Put all HTML, CSS, and JavaScript in one file named index.html. Do not use React, JSX, npm, bundlers, imports, modules, or external dependencies. Use only standard browser APIs. Prefer raw WebGL for the cloth rendering. Programmatically generate the cloth texture with an offscreen canvas. The demo must run by opening index.html directly in a browser.
Visual layout: Use a dark, glassy lab interface. Add a top bar with brand title, nav headings, FPS and point counters. Nav headings should include Showcase, Physics, Texture, and Controls. Add a vertical wind control on the left. Add a floating simulation panel with sliders for turbulence, stiffness, and sheen. Add a bottom control bar with reset, gust, and pulse controls. Keep the UI responsive for desktop and mobile.
Cloth physics: Implement custom mass-spring or Verlet-style cloth physics manually in JavaScript. Represent the banner as a subdivided rectangular grid of particles. Pin the full left edge so it behaves like a smooth pole line. Connect particles with structural, bend, and diagonal constraints. Apply wind, turbulence, gravity, damping, and pulse pressure. Add pointer interaction so users can grab and drag folds in the cloth.
Quality bar: The result should feel impressive enough to share publicly. The UI should feel intentional and polished, not like a bare demo.
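To make the "custom mass-spring or Verlet-style cloth physics" part of the prompt concrete, here is a minimal sketch of the technique in plain JavaScript. This is not the code GPT-5.5 generated; the function names (createCloth, step) and the constants (gravity, damping, iteration count) are my own illustrative choices. The generated version in the repo adds bend and diagonal constraints, turbulence, and pointer interaction on top of the same core loop.

```javascript
// Minimal Verlet cloth: each particle stores its current and previous
// positions, so velocity is implicit in the difference between them.
function createCloth(cols, rows, spacing) {
  const points = [];
  const constraints = [];
  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      points.push({
        x: x * spacing, y: y * spacing,   // current position
        px: x * spacing, py: y * spacing, // previous position
        pinned: x === 0,                  // pin the full left edge (the "pole")
      });
    }
  }
  const idx = (x, y) => y * cols + x;
  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      // structural constraints to the right and below neighbours
      if (x < cols - 1) constraints.push({ a: idx(x, y), b: idx(x + 1, y), rest: spacing });
      if (y < rows - 1) constraints.push({ a: idx(x, y), b: idx(x, y + 1), rest: spacing });
    }
  }
  return { points, constraints };
}

function step(cloth, dt, wind, iterations = 4) {
  const gravity = 980;  // px/s^2, arbitrary demo scale
  const damping = 0.99;
  for (const p of cloth.points) {
    if (p.pinned) continue;
    // Verlet integration: next = pos + (pos - prev) * damping + accel * dt^2
    const vx = (p.x - p.px) * damping;
    const vy = (p.y - p.py) * damping;
    p.px = p.x; p.py = p.y;
    p.x += vx + wind.x * dt * dt;
    p.y += vy + (gravity + wind.y) * dt * dt;
  }
  // Relax the distance constraints a few times per frame so the
  // grid of particles behaves like woven fabric instead of jelly.
  for (let i = 0; i < iterations; i++) {
    for (const c of cloth.constraints) {
      const a = cloth.points[c.a], b = cloth.points[c.b];
      const dx = b.x - a.x, dy = b.y - a.y;
      const dist = Math.hypot(dx, dy) || 1e-9;
      const diff = (dist - c.rest) / dist;
      if (!a.pinned) { a.x += dx * diff * 0.5; a.y += dy * diff * 0.5; }
      if (!b.pinned) { b.x -= dx * diff * 0.5; b.y -= dy * diff * 0.5; }
    }
  }
}
```

Call step once per requestAnimationFrame tick and feed the particle positions into the WebGL vertex buffer; that is essentially the whole simulation loop the prompt asks for.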
GPT-5.5 in Codex took that prompt and delivered. The cloth ripples, the text deforms naturally with the fabric, and the control panel actually works. You can see the full code and play with it yourself in the public repo at github.com/AugustOsei/ClothLab, and you can view it live here.
What struck me was not just the output but the process. The model did not ask me to clarify anything. It read the constraints, respected all of them, and produced something that felt intentional rather than generated. The “no external dependencies, raw WebGL only” constraint is the kind of thing that would trip up a less capable model. GPT-5.5 handled it cleanly.
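Part of what makes that constraint tricky is the texture: with no image assets allowed, the flag's surface has to be generated in code. In the demo this is done by drawing text onto an offscreen canvas; the same idea can be shown dependency-free as a raw RGBA pixel buffer, which is exactly the format WebGL's gl.texImage2D accepts. The function below is my own illustrative sketch, not the generated code, and the colour constants are arbitrary.

```javascript
// Generate a dark fabric-like texture as raw RGBA bytes. In the real demo
// an offscreen canvas draws the AUGUST WHEEL text on top of a base like this
// before the pixels are uploaded to the GPU with gl.texImage2D.
function makeFlagTexture(width, height) {
  const data = new Uint8Array(width * height * 4);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      // subtle horizontal "weave" stripes every fourth row
      const weave = (y % 4 === 0) ? 18 : 0;
      data[i]     = 20 + weave; // R
      data[i + 1] = 24 + weave; // G
      data[i + 2] = 38 + weave; // B
      data[i + 3] = 255;        // A, fully opaque
    }
  }
  return data;
}
```

Because the buffer is plain bytes, the approach needs nothing beyond standard browser APIs, which is why the "no external dependencies" rule is satisfiable at all.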
How It Compares to Claude Opus 4.7 and Kimi K2.6
Late April 2026 has produced one of the most competitive weeks in AI history, which is a sentence I am aware sounds like hype but is genuinely true. Claude Opus 4.7 launched on April 16, Kimi K2.6 from Moonshot AI arrived on April 20, and GPT-5.5 launched on April 23. Three serious models in eight days.
On the benchmark side, the picture is more nuanced than any one lab wants you to believe. GPT-5.5 leads Claude Opus 4.7 on agentic and computer-use benchmarks. It scores 82.7% on Terminal-Bench 2.0 compared to Opus 4.7’s 69.4%, and 78.7% on OSWorld-Verified versus 78.0%. Those are the kinds of tasks where you want the model to do things autonomously on a computer rather than just answer a question.
Opus 4.7 fights back on coding precision. It leads on SWE-bench Pro (64.3% to 58.6%), which is the benchmark closest to actually fixing bugs in real codebases. It also leads on most reasoning-heavy evaluations. The short version is that GPT-5.5 is better at doing things autonomously while Opus 4.7 is more precise when the task is complex code work requiring careful thought. Both ship with 1 million token context windows. Neither is a clear overall winner.
Then there is Kimi K2.6, which is the wildcard most people are sleeping on. Moonshot AI’s model is open-source and free, and it holds its own against both closed-source flagships on coding benchmarks. It scored 80.2% on SWE-bench Verified, ran a 13-hour autonomous coding session making over 1,000 tool calls without falling apart, and can coordinate up to 300 sub-agents working in parallel. For someone who does not want to pay per token, Kimi K2.6 is a genuinely serious option that deserves more attention than it is getting in Western AI coverage.
What the Release Cadence Is Actually Telling Us

Here is what I keep thinking about. OpenAI went from GPT-5 to GPT-5.5 in roughly eight months. That includes six distinct models or major variants. The company is openly in what sources described as a “Code Red” state, watching Anthropic’s annual revenue sprint from $9 billion toward $30 billion while losing ground in enterprise. GPT-5.5 is not just a better model. It is a strategic statement.
The framing has also shifted in a way worth noticing. Previous OpenAI launches led with accuracy scores and reasoning benchmarks. GPT-5.5 leads with outcomes. “It completes the task.” “It uses your tools.” “It does not need you to babysit it.” That is a fundamentally different product story, aimed at a different customer, and it puts pressure on a category that Anthropic and Google have not fully locked down yet.
TechCrunch reported that Brockman described GPT-5.5 as another step toward a unified “super app” merging ChatGPT, Codex, and an AI browser into a single session. If that lands, the model is not just competing with Claude. It is competing with how people use computers.
As for more releases coming this year, OpenAI has been clear that the pace is not slowing. Jakub Pachocki, OpenAI’s chief scientist, told reporters they expect “significant improvements in the short term, extremely significant improvements in the medium term.” Buckle up.
What This Means If You Are Just Trying to Use AI
If you are not a developer and you use ChatGPT occasionally, GPT-5.5 will mostly feel like a smoother version of what you already use. Tasks that used to require multiple follow-up prompts may just complete properly the first time. The change is less dramatic at the casual use level.
If you are doing any kind of building or automation, the story changes. GPT-5.5 in Codex is genuinely impressive for multi-step projects where you want the model to hold context and keep working without you managing every decision. My flag website was a small example, but the same principle applies to building tools, automating workflows, and tackling projects that previously required significant back-and-forth.
The practical advice right now is simple. Try GPT-5.5 in Codex if you have a paid OpenAI plan. Try Claude Opus 4.7 for anything that requires careful reasoning or precise code work. If cost is a factor and you are comfortable with open-source tools, Kimi K2.6 deserves a serious look.
Frequently Asked Questions
Is GPT-5.5 better than Claude Opus 4.7? It depends on what you are trying to do. GPT-5.5 leads on agentic tasks and computer use, making it the stronger choice for autonomous, multi-step workflows. Claude Opus 4.7 leads on coding precision and reasoning-heavy benchmarks. The smarter move is to use each for what it does best rather than picking one and sticking to it permanently.
How many AI models has OpenAI released in 2026? Since January 2026, OpenAI has shipped GPT-5.3 (February), GPT-5.3-Codex (February), GPT-5.4 (March), and GPT-5.5 (April), plus several mini and variant releases. The original GPT-5 launched in August 2025. The release cadence has been roughly one major model every four to six weeks.
What is Kimi K2.6 and why should I care about it? Kimi K2.6 is an open-source AI model from Moonshot AI, a Chinese AI lab, released April 20, 2026. It is free to use and competitive with GPT-5.5 and Claude Opus 4.7 on several coding benchmarks. It can run autonomous agent tasks for hours and coordinate hundreds of sub-agents. For anyone paying attention to what is happening outside of Anthropic and OpenAI, Kimi is worth watching closely.
The Takeaway
We are in a period where the pace of AI development is itself the product announcement. GPT-5.5 is a genuinely capable model with a meaningful architectural upgrade. The agentic shift it represents is real, and my Codex experience building that physics site was not magic, but it was impressive. At the same time, Claude Opus 4.7 is still a serious competitor on precision tasks, and Kimi K2.6 is quietly making a strong case for open-source in a space that the big labs would prefer you forgot about.
The smartest thing you can do right now is stay curious, test the tools yourself, and resist the urge to crown a winner. The standings are changing every few weeks.



