An AI-generated paper went through peer review at an ICLR 2025 workshop. It scored an average of 6.33. That is higher than 55% of the human-authored submissions to the same venue. Each paper cost about $15 to produce.
Sakana AI's AI Scientist runs the full research cycle without human direction. It generates a hypothesis, reads the relevant literature, designs the experiment, writes the code, runs the tests, produces the figures, writes the paper in LaTeX, and submits it for review. Sakana also built an automated reviewer to evaluate the output, and reports that it judges papers about as accurately as human reviewers do. The system does not assist a researcher. It replaces the process.
The $15 figure is not a rounding error. It is the cost per complete paper across multiple machine learning subfields including diffusion models, transformers, and language modeling. At that price, scientific output becomes a throughput problem.
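To make the throughput point concrete, here is a minimal back-of-the-envelope sketch. It assumes a flat $15 per paper, the figure quoted above, and nothing else: the budget values and the function name `papers_for_budget` are hypothetical and chosen only for illustration, and real costs would vary with the model and the experiment.

```python
# Assumption: a flat $15 per complete paper, the figure quoted in the text.
COST_PER_PAPER_USD = 15


def papers_for_budget(budget_usd: int, cost_per_paper_usd: int = COST_PER_PAPER_USD) -> int:
    """Return how many complete papers a budget covers at a flat per-paper cost."""
    if cost_per_paper_usd <= 0:
        raise ValueError("cost_per_paper_usd must be positive")
    # Integer division: a partially funded paper does not count.
    return budget_usd // cost_per_paper_usd


if __name__ == "__main__":
    # Hypothetical budgets, for illustration only.
    for budget in (1_500, 15_000, 150_000):
        print(f"${budget:,} buys {papers_for_budget(budget):,} papers")
```

At a fixed unit cost the output scales linearly with the budget, which is why the limit stops being money and becomes how fast the pipeline can run and how fast reviewers can read.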
Nature then published a paper about the AI Scientist. The venue that defines what rigorous science looks like decided this work was worth the platform. The AI Scientist is now both the subject of a Nature paper and a system capable of producing output that clears the same bar the venue is famous for applying.
The team knows what this means. Their stated recommendation is that papers substantially generated by AI must be marked as such. That recommendation is an admission: the peer review system as currently designed cannot detect the difference, and without disclosure it won't.
The scaling problem is the part that does not have a ceiling. Sakana states it plainly: as the underlying foundation models improve, the quality of the generated papers increases correspondingly. Current limitations like naive ideation and occasional hallucinations are failure modes of 2025 models. They are not structural limits of the approach.
Peer review was designed to filter bad science. It was never designed to ask who ran the experiment, because until recently that question did not need asking. It needs asking now. The answer at ICLR 2025 was that 55% of the human-authored submissions did not come out ahead.
Watermarking is the proposed solution. Watermarking requires the submitter to disclose. The submitter is the one with the incentive not to. Scientific publishing now has a second integrity problem it did not have three years ago, and the first one is still unsolved.