Dear Readers,
What is really driving AI forward right now – larger models or better evidence? Today's common thread: the “last mile.” OpenAI is acquiring Statsig and building the flight recorder directly into its product pipeline: hypotheses, controlled rollouts, and hard evidence instead of gut feeling. This shifts the competition from “Who can do it?” to “Who can show it – week after week – without user chaos?” And that's where it gets exciting: once measurement becomes the operating system, the leadership, deception, and team dynamics of agents take on new relevance.
We dissect what the Statsig deal means operationally and why teams that experiment cleanly ship faster and more consistently. Also in this issue: a fresh social intelligence stress test, the Werewolf benchmark with role Elo – who manipulates, who resists? And a major labor market analysis: junior jobs are shrinking where companies are introducing generative AI, while senior profiles are growing – with consequences for career ladders and training. On top of that: “AI progress vs. forecasts” in a reality check, news bits (including Mistral, NVIDIA), our graph of the day, and a quick question for you. If you want to understand how AI is maturing from demo to reliable infrastructure, read on.
In Today’s Issue:
OpenAI just bought the keys to Silicon Valley's best product lab.
Can AI play Werewolf?
We are years ahead of schedule on the path to superintelligent AI.
And more AI goodness…
All the best,

OpenAI acquires Statsig
The Takeaway
👉 OpenAI acquires Statsig for approximately $1.1 billion and brings founder Vijaye Raji onto the team as CTO of Applications.
👉 With Statsig's experimentation tools, OpenAI aims to make the leap from model research to stable product features.
👉 The deal shows that the “last mile” – i.e., testing, rollout, and user feedback – is becoming just as strategically important as model breakthroughs.
👉 This sends a signal to the AI industry: speed alone is not enough; sustainable product quality requires systematic experimentation.
Perhaps the most important deal of this AI season: OpenAI is acquiring Statsig, the experimentation platform for A/B testing, feature flags, and product analytics—the engine room where ideas mature into robust features. The key point: this is not just a tool purchase. Statsig founder Vijaye Raji is joining OpenAI as CTO of Applications. The message is clear: research alone is not enough – speed in iteration and evidence is becoming a competitive advantage.

What does that mean operationally? If ChatGPT & Co. ship new features weekly in the future, they will need clean hypotheses, measurement logic, and rollouts without user chaos. This is exactly what Statsig has been delivering for years – now natively in the OpenAI pipeline. Raji reports to Fidji Simo; the deal (approximately $1.1 billion in stock) is subject to regulatory approval. For dev teams, this is like switching from gut feeling to flight recorders: less “we believe,” more “we know.”
Why it matters: OpenAI is professionalizing the last mile—from model demo to robust product feature. Anyone building AI products must now consider experimentation a core competency, not an add-on.
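For readers who want to see what "clean hypotheses, measurement logic, and rollouts" can look like in practice, here is a minimal sketch. It is not Statsig's or OpenAI's implementation; the function names, the hashing scheme, and the sample numbers are all assumptions chosen for illustration. It shows two common building blocks: a deterministic percentage rollout and a two-proportion z-score to compare a control group with a treatment group.

```python
import hashlib
import math


def in_rollout(user_id: str, feature: str, percent: float) -> bool:
    """Deterministically assign a user to a rollout (hypothetical helper).

    The same user and feature always map to the same bucket, so a user
    does not flip between variants from one request to the next.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10_000  # bucket in 0..9999
    return bucket < percent * 100  # percent=5.0 -> buckets 0..499


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z-score for the difference in conversion rate (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


# Illustrative numbers only: 100/1000 conversions in control, 130/1000 in treatment.
z = two_proportion_z(100, 1000, 130, 1000)
print(f"z = {z:.2f}")
```

A z-score above roughly 1.96 is the usual threshold for a two-sided test at the 5% level; real experimentation platforms layer sequential testing, guardrail metrics, and variance reduction on top of this.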
Ad
Marketing ideas for marketers who hate boring
The best marketing ideas come from marketers who live it.
That’s what this newsletter delivers.
The Marketing Millennials is a look inside what’s working right now for other marketers. No theory. No fluff. Just real insights and ideas you can actually use—from marketers who’ve been there, done that, and are sharing the playbook.
Every newsletter is written by Daniel Murray, a marketer obsessed with what goes into great marketing. Expect fresh takes, hot topics, and the kind of stuff you’ll want to steal for your next campaign.
Because marketing shouldn’t feel like guesswork. And you shouldn’t have to dig for the good stuff.
In The News
“The 45% drop from 9,000 to 5,000 comes as the company’s own AI platform, Agentforce, now handles around half of all customer conversations (around 1.5 million interactions) – some human workers remain to handle the rest.”
— Chubby♨️ (@kimmonismus)
7:31 AM • Sep 3, 2025
NVIDIA Releases New Model Checkpoint
NVIDIA has now also open-sourced the intermediate 12B parameter checkpoint for its Nemotron Nano 2 model, which was the version distilled down to the final 9B release.
Le Chat Gets a Major Upgrade
Mistral has announced a major update for Le Chat, introducing over 20 new connectors powered by MCP and a fully controllable memory to make it one of the most connected and relevant AI assistants available.
Graph of the Day
Data centers are now clearly NVIDIA's largest source of revenue. As the biggest beneficiary of the AI boom, the company has multiplied its sales in recent years, and there is no end in sight.

Werewolf Benchmark
The Werewolf Benchmark tests the social intelligence of LLMs in live multiplayer scenarios: 7 strong models play 210 games of Werewolf (10 matches for each of the 21 model pairs). What's new: a role-conditioned Elo rating that separates manipulation (as a wolf) from resistance (as a villager), plus instrumented vote-swing analyses and a strict protocol (mayoral election, regulated speech turns). Result: GPT-5 clearly leads; the other models show role-dependent strengths and weaknesses. Relevance: a more practical measurement of leadership, bluffing, and group dynamics, which matters for agent teams, moderation, and risk assessment.
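To make the idea of a role-conditioned Elo concrete, here is a small sketch. It is an assumption about how such a rating could work, not the benchmark's published method: each model carries two separate ratings, one as wolf and one as villager, and a game only updates the rating for the role each side actually played. The K-factor, the starting rating, and the model names are placeholders.

```python
from dataclasses import dataclass, field
from itertools import combinations

K = 32          # assumed update step, not taken from the benchmark
START = 1500.0  # assumed starting rating


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation that side A beats side B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


@dataclass
class RoleElo:
    wolf: dict = field(default_factory=dict)      # manipulation rating
    villager: dict = field(default_factory=dict)  # resistance rating

    def record(self, wolf_model: str, villager_model: str, wolves_won: bool) -> None:
        r_w = self.wolf.get(wolf_model, START)
        r_v = self.villager.get(villager_model, START)
        delta = K * ((1.0 if wolves_won else 0.0) - expected_score(r_w, r_v))
        # Only the role each model played in this game is updated.
        self.wolf[wolf_model] = r_w + delta
        self.villager[villager_model] = r_v - delta


# Schedule size from the text: 7 models, 10 matches per pair.
models = [f"model_{i}" for i in range(7)]  # placeholder names
pairs = list(combinations(models, 2))
total_games = len(pairs) * 10

elo = RoleElo()
elo.record("model_0", "model_1", wolves_won=True)
print(len(pairs), total_games, elo.wolf["model_0"], elo.villager["model_1"])
```

Keeping the two ratings apart is what lets a leaderboard show a model that bluffs well as a wolf but is easily misled as a villager, instead of averaging both into one number.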
Junior employment is declining significantly in companies introducing AI
The paper examines whether generative AI primarily affects younger workers. Based on resumes and job advertisements covering 62 million US employees in 285,000 companies (2015–2025), the authors identify AI adoption through postings for “AI integrators.” Starting in Q1/2023, junior employment in companies introducing AI declines significantly (primarily due to fewer hires), while senior jobs increase; the junior decline is particularly pronounced in wholesale and retail. Graduates of mid-level colleges are most affected. Conclusion: career ladders and entry paths are eroding, with consequences for training, recruitment, and inequality.
AI progress is well ahead of expectations from a few years ago
“We can now say pretty definitively that AI progress is well ahead of expectations from a few years ago. In 2022, the Forecasting Research Institute had super forecasters & experts to predict AI progress. They gave a 2.3% & 8.6% probability of an AI Math Olympiad gold by 2025…” — Ethan Mollick

Get Your AI Research Seen by 200,000+ People
Have groundbreaking AI research? We’re inviting researchers to submit their work to be featured in Superintelligence, the leading AI newsletter with 200k+ readers. If you’ve published a relevant paper on arXiv.org, email the link to [email protected] with the subject line “Research Submission”. If selected, we will contact you for a potential feature.
Question of the Day
Are you surprised at how quickly AI is developing?
Tweet of the Day
We can now say pretty definitively that AI progress is well ahead of expectations from a few years ago.
In 2022, the Forecasting Research Institute had super forecasters & experts to predict AI progress. They gave a 2.3% & 8.6% probability of an AI Math Olympiad gold by 2025…
— Ethan Mollick (@emollick)
12:46 PM • Sep 2, 2025
OpenAI scientist Noam Brown emphasizes how far earlier assumptions about AI development diverge from actual progress, especially for general-purpose models.
Sponsored By Vireel.com
Vireel is the easiest way to get thousands or even millions of eyeballs on your product. Generate 100's of ads from proven formulas in minutes. It’s like having an army of influencers in your pocket, starting at just $3 per viral video.
Rumours, Leaks, and Dustups
gpt-image-1-high-fidelity is now in LMarena.
— Kol Tregaskes (@koltregaskes)
6:04 AM • Sep 3, 2025
GPT-Image-1-high fidelity spotted in LM Arena
Aider leaderboard has been updated with @OpenAI GPT-5 scores
— Mark Kretschmann (@mark_k)
6:13 AM • Sep 3, 2025
GPT-5 is much smarter than o3-pro and much more affordable at the same time.
Yannic Kilcher says AGI is not coming. A highly debatable thesis.