Overfit by Karl Mayer · Precision that misses the point.

We Had Blackboards Once

Two researchers in a 1970s Carnegie Mellon laboratory pointing at a blackboard covered in a modern multi-agent system architecture diagram.
The diagram is anachronistic. The problem isn't. Pawn to queen four.

In 1976, a team at Carnegie Mellon built one of the first systems that could recognize continuous human speech. Not isolated words. Not a small, curated vocabulary. Actual speech — messy, ambiguous, full of noise and human variation.

They called it Hearsay‑II. And the challenge it tackled has a name: an ill‑structured problem. In these problems, evidence arrives unevenly, the solution path isn't known in advance, and any architecture that assumes a clean sequence of steps is guaranteed to fail.

The blackboard model didn't fail. It got stranded. When connectionism took hold in the late 1980s and neural networks began outperforming symbolic AI on benchmark after benchmark, the entire paradigm went with it — not because the blackboard model was wrong, but because the field moved and never looked back. We didn't abandon it because it stopped working. We abandoned it because something else worked better on different problems. The distinction matters now, because we're building systems that face the same class of problems Hearsay‑II was designed for, and we're mostly not using the approach that worked.
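The model behind Hearsay‑II is simple enough to sketch. A shared blackboard holds hypotheses at several levels of abstraction; independent knowledge sources watch it and contribute whenever they can, with no fixed pipeline. A minimal sketch in Python (the levels, sources, and confidence values here are invented for illustration, not Hearsay‑II's actual ones):

```python
class Blackboard:
    """Shared workspace: independent knowledge sources read it and post to it."""
    def __init__(self):
        self.hypotheses = []  # tuples of (level, content, confidence)

    def post(self, level, content, confidence):
        self.hypotheses.append((level, content, confidence))

    def at_level(self, level):
        return [h for h in self.hypotheses if h[0] == level]

def segmenter(board):
    """Knowledge source: raw signal -> syllable hypotheses."""
    if board.at_level("signal") and not board.at_level("syllable"):
        board.post("syllable", "heh-lo", 0.6)
        return True
    return False

def word_matcher(board):
    """Knowledge source: syllable hypotheses -> word hypotheses."""
    if board.at_level("syllable") and not board.at_level("word"):
        board.post("word", "hello", 0.8)
        return True
    return False

def control_loop(board, sources):
    """Opportunistic control: fire any source that can contribute, until none can.
    There is no fixed sequence; the order of `sources` doesn't matter."""
    progress = True
    while progress:
        progress = any(src(board) for src in sources)

board = Blackboard()
board.post("signal", "raw audio frames", 1.0)
control_loop(board, [word_matcher, segmenter])  # deliberately "wrong" order
print(board.at_level("word"))
```

Because control is opportunistic rather than sequential, evidence can arrive in any order and the system still converges, which is exactly the property an ill‑structured problem demands.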



The AI Wrote the Demo

Grand Canyon rendered as a lidar point cloud, canyon walls and floor emerging from fog.
AI-simulated lidar scan of the Grand Canyon. Waymo uses real lidar to drive through fog.

The demo works until it doesn’t, and what comes next is fog — hiding either a speed bump or a cliff you only discover on impact.

A self-driving car handles that uncertainty with lidar: it emits its own pulse, reads the return signal, and builds a picture of what’s ahead. These ten questions work the same way. Paste them into your AI after the demo works, before you ship.



The Demo Is Not the Product

The Grand Canyon obscured by fog, the valley invisible from the rim.
Fog inversion at the Grand Canyon. The valley is exactly as deep as it was yesterday.

The demo is impressive. Everyone in the room knows it. An agent that reasons, retrieves, decides, and acts — built in a weekend. The applause is real. So is the illusion.

AI didn't just democratize expertise. AI democratized the appearance of building.



Chess Instincts. Poker Stakes.

A chess knight resting on a deck of playing cards.
Eight moves. All of them wrong.

You know that moment in Scrabble when your perfect word vanishes because someone played right across your spot? You had a plan. Now you don't. You find another one.

For an AI agent, that moment doesn't exist. The plan just keeps running — grandmasters at a poker table, certain the game is about to make sense.

The good news is it's a design problem, not a model problem.
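One concrete shape that design fix can take: give every plan step a precondition, check it against the current world before executing, and replan from the current state the moment a check fails. A minimal sketch in Python (the `Step` structure and toy world are hypothetical, not any particular agent framework):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    name: str
    precondition: Callable[[dict], bool]  # is this step still valid right now?
    effect: Callable[[dict], dict]        # how the step changes the world

def execute(plan, state, replan, max_replans=3):
    """Run a plan, re-checking each step against the *current* world.
    A failed precondition means the plan is stale: discard it and replan."""
    replans = 0
    while plan:
        step = plan[0]
        if not step.precondition(state):
            if replans >= max_replans:
                raise RuntimeError("world keeps shifting; giving up")
            replans += 1
            plan = replan(state)  # the Scrabble moment: find another line
            continue
        plan = plan[1:]
        state = step.effect(state)
    return state

# Toy world: the agent wants to place a piece, but the slot has been taken.
place = Step("place", lambda s: s["slot_free"], lambda s: {**s, "placed": True})

def replan(state):
    # A fresh plan from the current state: free the slot first, then place.
    free = Step("free_slot", lambda s: True, lambda s: {**s, "slot_free": True})
    return [free, place]

final = execute([place], {"slot_free": False, "placed": False}, replan)
print(final)  # the agent noticed the board changed and found another route
```

The model never has to be "smart enough to notice" mid-generation; the loop around it notices, which is why this is a design problem.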



Can You Read It?

Green Matrix-style digital rain symbolizing machine output versus human understanding.
“I don't even see the code.”

For as long as software has existed, we have measured the bottleneck: output. Story points. Velocity. Pull requests merged. Tickets closed. Each one a new vocabulary for the same instinct — count what gets produced, because production is what you can see.

IBM set an early example in the 1960s by counting K-LOC (thousands of lines of code). The assumption was simple: more code meant more work, more value, more progress. Never mind that the best engineers wrote less. Never mind that every line added was a line someone would have to read, debug, and maintain forever. Output was visible. Quality was not.

Now comes the next iteration: tokens in, tokens out. A reasoning model generates ten thousand lines before lunch. Management sees the number (billions of tokens consumed, hurrah!) and feels progress. The dashboard is very green.

But here's what the dashboard doesn't show: whether anyone actually understands what was produced.



The Last Generation of Explicit Logic

A human computer at work at the Langley Research Center (then part of the NACA, NASA's predecessor) in 1952, using a microscope to read data from film while a Friden calculating machine sits beside her.
NASA / Image L-74768

Before computers were machines, they were people — hired to execute, not to reason.

Through the 1930s, 40s, and 50s, rooms full of women sat at desks performing calculations by hand. Mathematicians, many of them — and brilliant ones. The constraint wasn't their capability. It was the job. At NASA, at Los Alamos, at the Bureau of Standards. Katherine Johnson computed orbital trajectories. Dorothy Vaughan managed entire teams of them. They were called computers. That was the job title, and the job description was simple and absolute: receive a specification, execute it precisely, return the result. No judgment, no interpretation, no deviation. They were valued for exactly one quality: the ability to suppress their own reasoning in service of perfect fidelity to the specification.

When the machines arrived, they inherited the job description wholesale. Alan Turing defined the digital computer as a machine intended to carry out any operation a human computer could perform.



Hello World!

Hello World! on IBM BASIC
60,894 bytes free. Plenty of room to think.

Originally written in 2022 and revised in 2026. The assumptions have been updated. The curiosity hasn't.

Every technology choice, every framework, every diagram is a compressed representation of reality — architects build abstractions that other people have to live inside.

Architecture externalizes assumptions. The job is deciding which assumptions to externalize, and whether they're still true.

This blog lives in that question.