Youssef Ait Alama

Blog


Most people's mental model of AI goes something like this: you show it a ton of examples, it learns a pattern, it applies that pattern to new inputs. That's roughly right for a lot of what gets called "machine learning." But there's a different kind of task that's gotten less attention: can you build a system that doesn't just apply algorithms, but actually invents them?

This distinction sounds philosophical, but it has real technical teeth. Let me try to make it concrete.

Running vs. Inventing

When you ask a neural network to sort a list, you're asking it to apply a known algorithm -- or more precisely, to learn a mapping that approximates what a sorting algorithm would produce. Given enough examples of sorted lists, it can get pretty good at this. But it's not inventing mergesort. It's not reasoning about why mergesort is better than bubble sort for large inputs. It's pattern-matching to produce sorted-looking outputs.
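To see what the network is *not* doing, here's the invented artifact itself. Mergesort's O(n log n) bound follows from its divide-and-conquer structure -- each recursion level does linear merging work across O(log n) levels -- which is a fact about the procedure, not about any particular input:

```python
def mergesort(xs):
    """Sort a list by recursively splitting and merging.

    The cost argument: each level of recursion does O(n) total
    merging work, and halving gives O(log n) levels, so O(n log n).
    """
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = mergesort(xs[:mid]), mergesort(xs[mid:])
    # Merge two sorted halves into one sorted list.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]
```

A trained network can approximate this input-output mapping arbitrarily well without ever representing the recurrence that makes the procedure correct and fast.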

Inventing an algorithm is different. It means finding a procedure -- a sequence of operations -- that solves a class of problems, that generalizes correctly, and that ideally does something new rather than just recombining what's already been seen. AlphaGo playing Go is impressive. A system that discovers a new game-theoretic insight about Go would be something else entirely.

Where this shows up in practice

Two examples make this vivid.

FunSearch (DeepMind, 2023) is one of the cleaner demonstrations of what algorithm discovery can look like. The system uses a large language model to write programs, evaluates them against a mathematical objective, and iteratively evolves the population of programs toward better solutions. It discovered new lower bounds for the cap set problem -- a result in combinatorics that human mathematicians hadn't found. That's not pattern matching. That's finding a novel procedure that works.
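The evolve-evaluate loop at FunSearch's core can be sketched in a few lines. This is a toy, not the real system: the objective here is a made-up curve-fitting score, and `mutate` is a random stand-in for the LLM's proposal step, which in FunSearch rewrites whole programs:

```python
import random

def evaluate(program, xs):
    """Score a candidate program string; higher is better.
    Toy objective: negative error against a hidden target
    (a stand-in for FunSearch's exact mathematical objective)."""
    try:
        f = eval("lambda x: " + program)
        return -sum(abs(f(x) - (x * x + 1)) for x in xs)
    except Exception:
        return float("-inf")  # broken programs score worst

def mutate(program):
    """Stand-in for the LLM step: FunSearch asks a language model
    to rewrite promising programs; here we just append a random edit."""
    op = random.choice(["+ 1", "- 1", "* x", "+ x"])
    return f"({program}) {op}"

def evolve(seed="x", generations=200, pop_size=10):
    """Iteratively propose, score, and select -- the FunSearch shape."""
    xs = range(-5, 6)
    population = [seed]
    for _ in range(generations):
        parent = max(population, key=lambda p: evaluate(p, xs))
        population.append(mutate(parent))
        # Selection pressure: keep only the best candidates.
        population = sorted(population, key=lambda p: evaluate(p, xs),
                            reverse=True)[:pop_size]
    return max(population, key=lambda p: evaluate(p, xs))
```

The design point is that all the "knowledge" lives in the evaluator: because the objective is exact, the loop can safely trust anything that scores well -- which is also why the approach is confined to domains where such an objective can be written down.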

AlphaTensor (DeepMind, 2022) did something similar for matrix multiplication. It framed the problem of finding fast matrix multiplication algorithms as a game, trained a reinforcement learning agent to play it, and found new algorithms that beat Strassen's algorithm in specific cases. Again: not applying a known method, but discovering a new one.
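Strassen's result is the prototype of what AlphaTensor searches for: a decomposition that multiplies two 2x2 matrices with 7 scalar multiplications instead of the naive 8. A minimal sketch of that classic decomposition, for concreteness (this is Strassen's 1969 scheme, not one of AlphaTensor's discoveries):

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 multiplications (naive: 8).

    Applied recursively to blocks, this gives O(n^2.81) instead of
    O(n^3) -- the kind of low-rank decomposition AlphaTensor's agent
    searches for in larger, structured cases.
    """
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)
    return [[p5 + p4 - p2 + p6, p1 + p2],
            [p3 + p4, p1 + p5 - p3 - p7]]
```

The game AlphaTensor plays is, in essence: find a set of products like `p1..p7` -- as few as possible -- whose linear combinations reconstruct every entry of the result.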

Both are impressive. Both also have serious limitations. FunSearch works in narrow mathematical domains where you can write an exact objective. AlphaTensor works because matrix multiplication has a beautifully structured search space. Generalizing these approaches to arbitrary algorithm discovery problems is still an open question.

The gap that remains

Here's what's striking: if you take a recent benchmark called AlgoTune (NeurIPS 2025) and throw frontier LLMs at it, they mostly fail to produce genuine algorithmic improvements. They can restructure code, apply known optimizations, and occasionally stumble onto something clever -- but they don't discover. They recombine.

I think part of the reason is that current systems don't have a good internal model of why an algorithm works, only that it works on the examples they've seen. If you know that a sorting algorithm is correct because of an invariant it maintains at each step, you have a handle on the problem structure that pure example-based learning doesn't give you.
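What "knowing the invariant" buys you can be made literal in code. A minimal sketch with insertion sort: the `assert` states the invariant that carries the correctness proof, and it holds for *every* input, not just the examples you happened to test:

```python
def insertion_sort(xs):
    """Sort a list; the invariant comment is the correctness argument.

    Invariant: after the i-th outer iteration, the prefix xs[:i+1]
    is sorted. Since i reaches len(xs) - 1, the whole list ends sorted.
    """
    xs = list(xs)
    for i in range(1, len(xs)):
        # Bubble xs[i] left until the prefix is sorted again.
        j = i
        while j > 0 and xs[j - 1] > xs[j]:
            xs[j - 1], xs[j] = xs[j], xs[j - 1]
            j -= 1
        # The invariant, checked explicitly.
        assert all(xs[k] <= xs[k + 1] for k in range(i))
    return xs
```

A system trained purely on (unsorted, sorted) pairs has no representation of that invariant at all -- which is one way to state why it can interpolate but not invent.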

That's the gap I find interesting. Whether you can close it by giving a learning system a richer formal description of the task -- not just inputs and outputs, but something that encodes what the task actually means -- is the question I'm working on. I don't have a clean answer yet. But I think it's the right question.

If you're working on something adjacent to this, or you just want to argue about whether LLMs can actually discover anything, feel free to reach out.