Sakana AI Fugu Ultra: It's an Orchestrator, Not a Frontier Model

Llion Jones helped write the paper that made transformers the default architecture for every serious AI lab on the planet. Last October he went on stage at TED AI and told the audience he is absolutely sick of them. His company then shipped Fugu Ultra and published benchmark charts showing it tied with Anthropic's best model.

What Fugu Ultra actually is: a routing system that sends your query to GPT-5, Claude Sonnet 4, Gemini 2.5 Pro, or DeepSeek-R1 depending on what the task requires. Sakana calls this "autonomous model orchestration." The GPQA-Diamond numbers in the announcement are the scores of those underlying models plus the router. Sakana did not train a base model. They trained a 7B coordinator that decides which frontier model handles which subtask.
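To make "routing" concrete, here is a minimal sketch of the idea, not Sakana's implementation. The backend names come from the announcement as described above; the task categories, the keyword lists, and the `classify` and `route` helpers are hypothetical, and the keyword match stands in for whatever the trained 7B coordinator actually does.

```python
# Hypothetical mapping from task category to backend model.
# Model names are from the article; the categories are illustrative.
ROUTES = {
    "code": "Claude Sonnet 4",
    "math": "DeepSeek-R1",
    "multimodal": "Gemini 2.5 Pro",
}
DEFAULT_BACKEND = "GPT-5"

# A crude keyword classifier standing in for the learned coordinator.
KEYWORDS = {
    "code": ("function", "bug", "compile"),
    "math": ("prove", "integral", "equation"),
    "multimodal": ("image", "chart", "video"),
}


def classify(query: str) -> str:
    """Return the first category whose keywords appear in the query."""
    lowered = query.lower()
    for category, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return category
    return "general"


def route(query: str) -> str:
    """Pick a backend name for the query, falling back to the default."""
    return ROUTES.get(classify(query), DEFAULT_BACKEND)
```

The point of the sketch is where the intelligence lives: `route` only chooses a name. Every answer still comes from whichever backend that name refers to.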

The Reddit community figured this out in the first comment. "It is an orchestrator of models, not a base model." 107 upvotes. The follow-up was more specific: "those GPQA numbers are really benchmarking the underlying models plus the router, not anything Sakana trained from scratch."

This is not a new pattern for Sakana. The same community flagged a prior misleading announcement last year. The shorthand that has stuck: "altman maxxing." Build something that routes to other people's work, publish benchmark charts, let the headline do the rest.

The architecture itself is not trivial. Fugu's Conductor is a 7B model trained with reinforcement learning to write collaborative workflows in natural language, assign subtasks to specific models, and pass context between them. The claimed advantage is that heterogeneous models cross-checking each other break the self-correction loops a single model gets stuck in. That is a real engineering problem and a reasonable approach to it. It is also entirely downstream of GPT-5 and Claude Sonnet 4 doing the actual reasoning.
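A rough sketch of that workflow shape, under stated assumptions: the `Step` type, `run_workflow`, and the stub backends are all hypothetical, and the plan here is hard-coded where the Conductor would generate it. The stubs replace real model calls so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Backend = Callable[[str], str]  # takes a prompt, returns a completion


@dataclass
class Step:
    model: str        # which backend handles this subtask
    instruction: str  # natural-language description of the subtask


def run_workflow(
    steps: List[Step], backends: Dict[str, Backend], question: str
) -> List[Tuple[str, str]]:
    """Run steps in order, feeding each step's output to the next as context."""
    context = question
    transcript = []
    for step in steps:
        prompt = f"{step.instruction}\n\nContext:\n{context}"
        output = backends[step.model](prompt)
        transcript.append((step.model, output))
        context = output  # the next model sees the previous model's work
    return transcript


def make_stub(name: str, log: List[Tuple[str, str]]) -> Backend:
    """Fake backend that records its prompt; stands in for a real API call."""
    def call(prompt: str) -> str:
        log.append((name, prompt))
        return f"[{name}] handled: {prompt.splitlines()[0]}"
    return call


prompt_log: List[Tuple[str, str]] = []
backends = {n: make_stub(n, prompt_log) for n in ("GPT-5", "Claude Sonnet 4")}

# One model drafts, a different model checks the draft.
steps = [
    Step("GPT-5", "Draft an answer."),
    Step("Claude Sonnet 4", "Check the draft for errors."),
]
transcript = run_workflow(steps, backends, "Why is the sky blue?")
```

The second step is deliberately assigned to a different model than the first, which is the cross-checking claim in miniature. Note that the orchestration code contains no reasoning of its own.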

The irony Jones apparently does not feel the need to address: the anti-transformer lab's flagship product runs on transformers. Not incidentally. Fundamentally. Pull GPT-5 and Claude Sonnet 4 from the stack and Fugu Ultra is a workflow tool with nothing to route to.

Jones is right that transformer monoculture is a legitimate research concern. He is also the man who named the paper "Attention Is All You Need" and now seems surprised that attention is all anyone uses. Sakana's actual non-transformer research, the Transformer² self-adaptation work and the recursive self-improvement lab, is genuinely interesting. It is just not what they put in the press release.

The benchmarks match Anthropic. The model is Anthropic. That distinction matters, and Sakana is betting most readers won't make it.

Resources