Dear Readers,

Welcome to the first DeepDive!

From now on, every Saturday we will be publishing an in-depth analysis of a technological breakthrough that is changing our world - clearly explained, critically contextualized and with an eye on the big picture.

This week it's about AlphaEvolve, a system from Google DeepMind that is more than just another AI model. AlphaEvolve does not merely reproduce existing code - it develops completely new algorithms that amaze even experienced mathematicians. In a historic breakthrough, it beat a record that had stood since 1969: multiplying 4×4 matrices with fewer scalar multiplications than ever before.

But this is just the beginning. AlphaEvolve combines two language models with an evolutionary search process that generates, tests and improves thousands of code variants - like a digital researcher that independently formulates and tests hypotheses. The result: advances in mathematics, more efficient chips, faster AI training routines and millions saved in data centers.

What does it mean when artificial intelligence begins to actively develop new knowledge - beyond human examples? In this DeepDive, we explore this question.

All the best,

AlphaEvolve: Google’s AI Agent Just Discovered a New Math Algorithm

The TLDR
Google DeepMind has unveiled AlphaEvolve, a groundbreaking AI coding agent that blends Gemini models with evolutionary algorithms to autonomously discover and optimize code. By breaking a decades-old matrix multiplication record, AlphaEvolve marks a turning point—where AI doesn’t just assist with code but pioneers new scientific knowledge.

Today, we’re announcing AlphaEvolve, an evolutionary coding agent powered by large language models for general-purpose algorithm discovery and optimization. AlphaEvolve pairs the creative problem-solving capabilities of our Gemini models with automated evaluators that verify answers, and uses an evolutionary framework to improve upon the most promising ideas. (Google DeepMind)

After 56 years, Strassen's record has finally fallen - this is how one DeepMind researcher described the moment when AlphaEvolve presented a new algorithm for multiplying 4×4 matrices with only 48 instead of 49 scalar multiplications, a record that had seemed unbreakable since 1969. This anecdote is more than a mathematical footnote: it signals a paradigm shift in which AI no longer just reproduces existing code but actively opens up new knowledge.

In May 2025, Google DeepMind officially presented the “Gemini-powered coding agent” AlphaEvolve. The system combines two large language models (Gemini Flash and Gemini Pro) with a genetic optimization process that automatically generates, executes and evaluates thousands of code mutations. AlphaEvolve thus operates like a digital research assistant that independently generates hypotheses in the form of code and tests them experimentally.

At first glance, this seems like a logical continuation of earlier DeepMind milestones (AlphaGo, AlphaFold, AlphaTensor). However, the scope now extends far beyond individual special-purpose problems: from data center scheduling and chip design to open questions in fundamental mathematics, AlphaEvolve already delivers usable results. This analysis therefore focuses on the key question of whether AlphaEvolve represents an essential breakthrough for science and technology - and what future scenarios follow from it.

From AlphaZero to AlphaEvolve - A Qualitative Leap

Earlier systems such as AlphaZero learned through reinforcement learning, but always within the narrow confines of a game environment. AlphaEvolve replaces this game environment with an evolutionary code search space: a parent program is mutated, its "children" compete on an evaluation metric, and the winners in turn become parents of the next generation. In principle, the method can be applied to any problem for which an automatic quality function exists.

  • LLM ensemble: Gemini Flash provides breadth, Gemini Pro depth.

  • Prompt Sampler: Packs code, tests and mutation tips into an input prompt.

  • Evaluators: Run candidates and measure runtime, memory requirements or correctness, for example.

  • Genetic algorithm: Selects the best variants for the next cycle.

This setup allows not only the generation but also the evolution of entire code bases - a decisive difference from one-off, "shot-in-the-dark" approaches such as AlphaCode.
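
To make the interplay of these components more tangible, here is a minimal Python sketch of such an evolutionary code-search loop. It is purely illustrative: the helper names (build_prompt, propose_diff, evaluate) are assumptions chosen for this example, not AlphaEvolve's actual interfaces.

    import random

    def evolve(seed_program, evaluate, build_prompt, propose_diff, generations=1000, population=50):
        # The program database starts with the human-written seed program.
        database = [(seed_program, evaluate(seed_program))]

        for _ in range(generations):
            # Tournament selection: favour high-scoring parents, keep some diversity.
            parent, _ = max(random.sample(database, k=min(5, len(database))),
                            key=lambda entry: entry[1])

            # Ask the LLM ensemble for a small edit (diff) instead of a whole new program.
            prompt = build_prompt(parent, inspirations=[p for p, _ in database[:3]])
            diff = propose_diff(prompt)              # e.g. {"search": "...", "replace": "..."}
            child = apply_diff(parent, diff)

            # The automated evaluator scores the child program.
            database.append((child, evaluate(child)))

            # Survival of the fittest: keep only the best programs.
            database.sort(key=lambda entry: entry[1], reverse=True)
            database = database[:population]

        return database[0]   # best program found and its score

    def apply_diff(source, diff):
        # Replace only the first exact match so the edit stays local and reviewable.
        if diff["search"] not in source:
            raise ValueError("diff does not match the parent program")
        return source.replace(diff["search"], diff["replace"], 1)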

How Does AlphaEvolve Work?

AlphaEvolve is based on DeepMind's powerful Gemini LLMs, which act as a kind of creative engine. Unlike previous systems, AlphaEvolve uses an ensemble of two language models: Gemini Flash (a particularly fast and efficient model) and Gemini Pro (an extremely powerful model). Gemini Flash generates a wide variety of ideas and code suggestions in a short time, while Gemini Pro, with greater depth and "thinking ability", refines these suggestions and delivers higher-quality approaches. Together, the two complement each other to explore a variety of approaches and suggest helpful, informed improvements. These models produce suggestions in the form of code that represent potential algorithmic solutions to the given problem.
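
A rough way to picture this division of labour is the breadth-then-depth sketch below. The generate method and model objects are hypothetical placeholders, not Google's API:

    def propose_candidates(prompt, fast_model, strong_model, n_drafts=20, n_refined=3):
        # Breadth: the fast model produces many diverse drafts cheaply.
        drafts = [fast_model.generate(prompt, temperature=1.0) for _ in range(n_drafts)]

        # Depth: the stronger model spends more compute refining a few promising drafts.
        refined = [strong_model.generate(prompt + "\n\nImprove this draft:\n" + draft)
                   for draft in drafts[:n_refined]]

        # Both streams feed into the evaluation stage.
        return drafts + refined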

Sequence of evolution-based learning: AlphaEvolve works iteratively and uses principles of evolutionary algorithms (also known as genetic algorithms). The following diagram illustrates the process schematically: First, a human scientist or engineer provides the system with a start program - i.e. an initial (possibly simple or inefficient) algorithm that at least solves the task correctly - and an evaluation function. This evaluation function (a separate program or automated test) checks whether a proposed solution is correct and measures its quality (e.g. runtime, accuracy, resource consumption).

Fig. 1: Schematic representation of the AlphaEvolve architecture and the evolution-based development process. 

An existing program ("parent program"), together with inspirations from the program database, is converted into an input prompt for the LLM ensemble (red) by a prompt sampler (blue). The language model then generates a change proposal (diff), which is applied to the program to obtain a modified child program. This is executed by the evaluator (yellow) and assessed according to defined metrics. The new program and its result are then saved in the program database (green). A genetic algorithm ensures that the top-rated programs are preferred as the basis for the next iterations (survival of the fittest). This distributed controller loop is repeated until no further improvements are found.
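
As a concrete, hypothetical illustration of the two inputs a user supplies - a correct but naive start program and an automated evaluation function - consider this toy sorting task (the scoring scheme is an assumption chosen for simplicity, not taken from the paper):

    import time

    def start_program(items):
        # Naive seed solution: a correct but slow bubble sort.
        data = list(items)
        for i in range(len(data)):
            for j in range(len(data) - 1 - i):
                if data[j] > data[j + 1]:
                    data[j], data[j + 1] = data[j + 1], data[j]
        return data

    def evaluate(candidate_fn):
        # Correctness gate first, then reward shorter runtime.
        test_input = list(range(2000, 0, -1))
        start = time.perf_counter()
        result = candidate_fn(test_input)
        elapsed = time.perf_counter() - start
        if result != sorted(test_input):
            return float("-inf")      # incorrect programs are discarded
        return -elapsed               # faster programs score higher

Every child program proposed in the loop is run through such an evaluator; only variants that are both correct and measurably better survive into the next generation.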

“AlphaEvolve demonstrates the potential for AI to come up with completely novel ideas through continual experimentation and evaluation. DeepMind and other AI companies hope that AI agents will gradually learn to exhibit more general ingenuity in many areas, perhaps eventually generating ingenious solutions to a business problem or novel insights when given a particular problem.” (Wired)

In contrast to AlphaZero, which learned from scratch, AlphaEvolve builds on pre-trained language models. The Gemini models already bring enormous knowledge from programming and text corpora and can therefore make suggestions that contain complex programming logic. This combination of innate knowledge (the LLMs) and active trial and error (evolution) allows AlphaEvolve to find solutions beyond mere human examples - it can actually discover something genuinely new and correct that was not explicitly in the training data.

Concrete Breakthroughs

  1. Mathematics - The aforementioned 48-multiplication algorithm marks the first advance over Strassen's 1969 construction in 56 years (a short arithmetic note after this list shows why 48 versus 49 matters). In addition, AlphaEvolve improved the state of the art for 14 matrix multiplication settings and established a new lower bound of 593 spheres for the kissing number problem in eleven dimensions.

    “In internal tests, AlphaEvolve improved the state of the art on 14 matrix multiplication benchmarks, including a long-standing open problem from 1969, and made advances in over 20% of the 50+ open mathematical problems it was tested on. One of the most impressive? It found a new lower bound for the kissing number in 11 dimensions—a brain-twisting geometry puzzle that’s stumped mathematicians for centuries.” (Chris McKay, Maginative)

  2. Data Center Scheduling - In Google's Borg cluster, a heuristic discovered by AlphaEvolve continuously recovers 0.7% of previously "stranded" computing capacity - a saving in the millions.

  3. Chip Design - A Verilog patch removed superfluous bits in a TPU circuit without any loss of functionality.

  4. AI Training - By dividing large matrix multiplications into smarter chunks, AlphaEvolve shortened the training of large Gemini models by 1%; it also accelerated a FlashAttention kernel by up to 32.5%.
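
For context on the matrix record in point 1: Strassen's 1969 method multiplies two 2×2 matrices with 7 scalar multiplications instead of the naive 8. Applied recursively, a 4×4 matrix is treated as a 2×2 matrix of 2×2 blocks, giving 7 block products of 7 scalar multiplications each, i.e. 7 × 7 = 49. AlphaEvolve's new decomposition, which works over complex-valued matrices, needs only 48 - a single saved multiplication, but the first improvement on this bound in 56 years.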

AlphaEvolve’s Impact on the Economy and Industry

In business and industry, particularly in the IT sector, AlphaEvolve has the potential to increase productivity and efficiency in many areas. As the system is geared towards algorithmic optimization, it can be used directly to improve software, hardware and operating processes - with some impressive results even at this early stage.

One immediate field of application is the IT infrastructure of large companies. Google itself has already tested AlphaEvolve intensively in its own infrastructure. One outstanding example is data center management: AlphaEvolve discovered a simple but extremely effective scheduling heuristic for Google's cluster management system (Borg). This scheduling method ensures that "stranded" resources (such as free CPUs on nodes that have run out of memory) are better utilized. In this way, 0.7% of Google's worldwide computing capacity has been continuously recovered:

A seemingly small percentage, but one that makes a huge difference in Google's gigantic infrastructure. 0.7% more usable computing time globally means millions in savings and better utilization without additional hardware. Even more remarkable: the solution found by AlphaEvolve is implemented as simple code that human engineers can easily understand and maintain. The interpretability of the AI solution (readable, well-structured code instead of a black box) facilitated the transfer to production operation and increased confidence in such AI-generated optimizations.
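
To make the idea of "stranded" resources concrete, here is a toy placement heuristic in Python. It is entirely illustrative and not Google's Borg heuristic: it simply prefers machines on which a task would leave CPU and memory utilization balanced, so that neither resource sits idle while the other is exhausted.

    def placement_score(machine, task):
        # A machine that cannot fit the task is ruled out entirely.
        free_cpu = machine["cpu_total"] - machine["cpu_used"]
        free_mem = machine["mem_total"] - machine["mem_used"]
        if task["cpu"] > free_cpu or task["mem"] > free_mem:
            return float("-inf")
        # Prefer placements that leave CPU and memory headroom balanced,
        # reducing the chance of stranded capacity on this machine.
        cpu_left = (free_cpu - task["cpu"]) / machine["cpu_total"]
        mem_left = (free_mem - task["mem"]) / machine["mem_total"]
        return -abs(cpu_left - mem_left)

    def place(task, machines):
        # Pick the machine with the best score for this task.
        return max(machines, key=lambda m: placement_score(m, task))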

AlphaEvolve was also able to create added value in hardware design. One example is the optimization of Verilog code (a hardware description language) for Google's TPU chips. Here, the AI suggested omitting unnecessary bits in a highly optimized arithmetic circuit to make the chip more efficient. Importantly, each proposed hardware change had to pass rigorous verification tests to ensure that the functionality of the chip was not compromised. For the industry, this could mean that future hardware development is significantly accelerated and supported automatically - from microchip optimization to the layout of complex circuits.
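
The kind of check behind "no loss of functionality" can be pictured with a toy example (illustrative only, not the actual TPU verification flow): if a circuit multiplies two 8-bit unsigned values, the result always fits in 16 bits, so any wider result register carries superfluous bits - and an exhaustive test can prove it.

    def verify_product_width(width_in=8, width_out=16):
        # Exhaustively confirm that no product of two width_in-bit unsigned values
        # ever sets a bit above width_out, so wider result bits are superfluous.
        limit = 1 << width_in
        mask_above = ~((1 << width_out) - 1)
        for a in range(limit):
            for b in range(limit):
                if (a * b) & mask_above:
                    return False
        return True

    assert verify_product_width()   # an 8-bit x 8-bit product always fits in 16 bits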

Another major area of application is software and system optimization. For example, AlphaEvolve found a way to make a key computational module in the AI model training pipeline more efficient. Specifically, it found a smarter way to split large matrix multiplications into chunks, which made a particular kernel (computation routine) in the Gemini architecture 23% faster. This improvement reduced the overall training time of large Gemini models by about 1%. One percent does not sound like much, but when training huge AI models (which often takes weeks and consumes a lot of computing power), this is a significant saving in time and energy. Google speaks of considerable savings through increased efficiency. The example also shows that AlphaEvolve can even improve the systems on which it runs - an interesting feedback loop in which the AI accelerates its own infrastructure. In addition, AlphaEvolve also optimized code at very low levels, such as GPU instructions: for example, the FlashAttention algorithm (important for Transformer models) was accelerated by up to 32.5%.
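
The "splitting into chunks" described above is essentially a tiling decision. The NumPy sketch below is purely illustrative - it is not the Gemini kernel - but it shows the kind of block structure whose tile sizes such an optimization tunes for a given accelerator:

    import numpy as np

    def blocked_matmul(A, B, tile=128):
        # Multiply A @ B block by block; each small block product fits in fast
        # on-chip memory, and choosing the tile size well is what makes it fast.
        n, k = A.shape
        k2, m = B.shape
        assert k == k2, "inner dimensions must match"
        C = np.zeros((n, m), dtype=A.dtype)
        for i in range(0, n, tile):
            for j in range(0, m, tile):
                for p in range(0, k, tile):
                    C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
        return C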

To summarize: AlphaEvolve can act as a multiplier for efficiency and innovation in industry. From cloud infrastructure to hardware and application software - wherever optimization problems play a role, such an AI agent could find better solutions, reduce costs and open up new opportunities. At the same time, businesses and the world of work will have to adapt so that the collaboration between human experts and AI systems remains productive.

Chubby’s Opinion Corner

AlphaEvolve is not just another milestone - it is the beginning of a new chapter. For the first time, we are seeing how an AI system not only processes existing knowledge, but actively generates new knowledge - autonomously, comprehensibly, reproducibly. The actual breakthrough lies less in individual records than in the structure: a system that improves itself through variation and evaluation creates the basis for a positive feedback effect - a cycle in which models optimize and accelerate each other.

We are therefore at the beginning of a fundamental change. The speed at which AI continues to develop is no longer linear - but evolutionary.

How'd We Do?

Please let us know what you think! Also feel free to just reply to this email with suggestions (we read everything you send us)!
