Dear Readers,

The AI race still has an oddly comforting myth: just scale models up, pour in more data, and AGI will eventually “click.” Today’s featured story argues that’s the wrong mental model - because bigger context isn’t real memory, more GPUs collide with power and infrastructure limits, and synthetic data can quietly rot the training distribution. The interesting part isn’t that models are getting better (they are); it’s where they still break: long-horizon reliability, durable learning, robust retrieval, and the kind of planning that survives messy real-world feedback instead of collapsing into prompt theater.

Then we zoom out to 2026: less a single “AGI moment,” more an arms race in agent stacks - memory + tools + verification + security - where progress shows up as longer time horizons and fewer faceplants, not flashier demos. Expect the biggest “breakthroughs” to look boring on a keynote slide (efficiency, stability, auditability), while regulation - especially in Europe - accelerates the split between toy agents and enterprise-grade systems. If that sounds incremental, it isn’t: trusted semi-autonomous AI coworkers would still reshape how work gets done, how products get built, and who captures the value. Keep reading - because the real story this year is what happens when AI stops being impressive and starts being dependable. This is a newly written study on a topic I have covered before.

All the best,

Why AGI Still Isn’t “Just Scale It More”

A few years ago, it was easy to tell a clean story about progress in AI: make models bigger, feed them more data, give them more compute, and they’ll get better, often in remarkably predictable ways. That story is still partly true. On broad metrics, frontier models often improve steadily with scale, which is exactly why the industry has been able to plan multi-billion-dollar training runs with confidence. Anthropic captured the weird tension neatly: model development is “predictable” in aggregate, yet “unpredictable” in the details - capabilities and failures can appear suddenly, and specific behaviors can be hard to foresee.

“Namely, these generative models have an unusual combination of predictable loss on a broad training distribution (as embodied in their "scaling laws"), and unpredictable specific capabilities, inputs, and outputs. We believe that the high-level predictability and appearance of useful capabilities drives rapid development of such models, while the unpredictable qualities make it difficult to anticipate the consequences of model deployment.”

— Anthropic
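To make “predictable loss” concrete: scaling-law studies typically fit a power law of the form L(N) = a · N^(−b) to loss measured at several model sizes, then extrapolate. Here is a minimal sketch of that fit using invented numbers - the parameter counts and loss values below are hypothetical, not Anthropic’s data:

```python
import numpy as np

# Hypothetical (made-up) eval-loss measurements at increasing model sizes.
# Scaling-law papers fit curves like L(N) = a * N**(-b) to data of this shape.
params = np.array([1e7, 1e8, 1e9, 1e10])   # model parameter counts (invented)
loss   = np.array([4.2, 3.1, 2.3, 1.7])    # observed loss at each size (invented)

# A power law is linear in log-log space: log L = log a - b * log N,
# so a least-squares line fit recovers the exponent.
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
a, b = np.exp(intercept), -slope

def predicted_loss(n_params: float) -> float:
    """Extrapolate loss for a larger model under the fitted power law."""
    return a * n_params ** (-b)

print(f"fitted exponent b: {b:.3f}")
print(f"predicted loss at 1e11 params: {predicted_loss(1e11):.2f}")
```

The smooth line in log-log space is exactly the “high-level predictability” the quote describes; what the fit cannot tell you is which specific capabilities or failure modes appear at the next point on the curve.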

But “better at benchmarks” is not the same thing as general intelligence. AGI is supposed to be robust across domains, learn over time, keep goals coherent, act competently in the world, and not fall apart when conditions shift. Even defining what counts as AGI is controversial, which is why a recent attempt by Dan Hendrycks and collaborators tries to operationalize AGI as “matching the cognitive versatility and proficiency of a well-educated adult.” That definition is provocative for a reason: it forces us to confront missing pieces in today’s systems rather than being dazzled by fluent conversation.

So the real question isn’t “Are models impressive?” They are. The question is: What are the hardest missing capabilities - and which constraints might make them difficult to obtain by scaling alone?

This essay follows one guiding thread: If AGI is a stable, long-horizon problem-solver in the real world, what still breaks when we try to build that today? By the end, we’ll have a grounded answer: the biggest open problems cluster around memory and context, compute/power and efficiency, data and feedback loops, architecture and reasoning, agent reliability, and alignment + evaluation - and progress probably requires co-designing these pieces, not optimizing them separately.


Subscribe to Superintel+ to read the rest.

Become a paying subscriber of Superintel+ to get access to this post and other subscriber-only content.
