OpenAI Codex SSD Bug: 640 TB Per Year and the Slopware Problem

37 terabytes. That is what one developer watched their SSD absorb in 21 days of running Codex. Not from rendering or database migrations. From a background logging system that nobody told them was running.

Extrapolated: around 640 TB written to disk per year. A typical consumer SSD is rated for 300 to 600 TBW (terabytes written) of endurance. At that rate, Codex can burn through the rating in under a year. While you are just using an AI coding tool.
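The arithmetic behind that extrapolation is easy to check. Here is a minimal sketch that uses only the figures quoted above (37 TB in 21 days, ratings of 300 and 600 TBW) and assumes the write rate stays constant, which a real workload may not do. The function names are illustrative.

```python
# Figures quoted in the report: 37 TB written over 21 days.
TB_WRITTEN = 37.0
DAYS_OBSERVED = 21.0


def annual_writes_tb(tb_written: float, days: float) -> float:
    """Linear extrapolation of observed writes to a 365-day year."""
    return tb_written / days * 365


def days_to_reach(tbw_rating: float, tb_written: float, days: float) -> float:
    """Days until the rated endurance is consumed at the observed rate."""
    return tbw_rating / (tb_written / days)


annual = annual_writes_tb(TB_WRITTEN, DAYS_OBSERVED)
print(f"Projected writes: {annual:.0f} TB/year")

for rating in (300, 600):
    remaining = days_to_reach(rating, TB_WRITTEN, DAYS_OBSERVED)
    print(f"{rating} TBW rating consumed in about {remaining:.0f} days")
```

Under these assumptions, a drive at the low end of the range uses up its rating in under six months, and one at the high end in a little under a year.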

What the Database Is Actually Doing

The bug lives in Codex's SQLite-based feedback logging system. A GitHub issue filed against the repository lays out the forensics in uncomfortable detail.

5.5 billion row IDs allocated. 506,000 rows retained. That is a gap of roughly 10,000x. In a 15 second measurement window, 36,211 rows were inserted while the retained count stayed flat. Rows are being inserted and pruned in a continuous loop. Against your drive.
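A small simulation shows how an insert-and-prune loop produces this signature: the highest allocated row ID keeps climbing while the row count stays flat. This is a sketch only. The table name logs, its schema, and the pruning rule are assumptions made for illustration and are not taken from the Codex source.

```python
import sqlite3


def simulate_churn(batches: int, batch_size: int, keep: int) -> tuple[int, int]:
    """Insert rows in batches, pruning to the newest `keep` rows each time.

    Returns (highest row ID ever allocated, rows currently retained).
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; AUTOINCREMENT makes SQLite track the highest ID used.
    conn.execute(
        "CREATE TABLE logs (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
    )
    for _ in range(batches):
        conn.executemany(
            "INSERT INTO logs (body) VALUES (?)", [("payload",)] * batch_size
        )
        # Prune everything except the newest `keep` rows.
        conn.execute(
            "DELETE FROM logs WHERE id <= (SELECT MAX(id) FROM logs) - ?", (keep,)
        )
    allocated = conn.execute(
        "SELECT seq FROM sqlite_sequence WHERE name = 'logs'"
    ).fetchone()[0]
    retained = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    conn.close()
    return allocated, retained


allocated, retained = simulate_churn(batches=100, batch_size=1000, keep=500)
print(f"allocated={allocated} retained={retained} ratio={allocated / retained:.0f}x")

# The same ratio for the figures quoted in the issue.
print(f"reported ratio: {5_500_000_000 / 506_000:.0f}x")
```

Every one of those allocated IDs corresponds to a row that was written and then deleted, which is where the disk traffic comes from.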

What is being logged: 70.7% is TRACE-level traffic, the noisiest tier with the lowest signal. That includes tokio-tungstenite events, raw inotify filesystem notifications, and dependency internals nobody asked for. Another 25.3% is OpenTelemetry data that is already being sent somewhere else. The rest is raw WebSocket and SSE payload bodies.

All of this at the default log level. You didn't configure this. Every installation ships this way.

The fix isn't architecturally hard. Raise the default log level for the SQLite sink, filter dependency noise, store summaries instead of raw bodies, deduplicate the telemetry, add a size cap. The issue is open. It is unassigned.
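To make the first three of those measures concrete, here is a rough sketch written against Python's standard logging module. It is not Codex's code, which lives in a different codebase and language. The class name SinkFilter, the logger-name prefixes, and the 200-character limit are all assumptions chosen for illustration. Python has no built-in TRACE level, so the sketch simply drops everything below INFO.

```python
import logging

# Hypothetical logger-name prefixes standing in for noisy dependencies.
NOISY_PREFIXES = ("tokio_tungstenite", "inotify")
# Hypothetical cap on how much of a message body is stored.
MAX_BODY_CHARS = 200


class SinkFilter(logging.Filter):
    """Filter for a persistent log sink: raise the floor, drop noise, summarize."""

    def __init__(self, min_level: int = logging.INFO) -> None:
        super().__init__()
        self.min_level = min_level

    def filter(self, record: logging.LogRecord) -> bool:
        # 1. Raise the default level for this sink only.
        if record.levelno < self.min_level:
            return False
        # 2. Drop records emitted by noisy dependency loggers.
        if record.name.startswith(NOISY_PREFIXES):
            return False
        # 3. Store a summary instead of the raw body.
        message = record.getMessage()
        if len(message) > MAX_BODY_CHARS:
            record.msg = f"{message[:MAX_BODY_CHARS]}... [{len(message)} chars total]"
            record.args = None
        return True
```

A filter like this would be attached to the handler that writes to disk, for example with handler.addFilter(SinkFilter()), so console output could stay verbose while the persistent sink stays quiet. Telemetry deduplication and a size cap are left out of the sketch.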

"Slopware"

The Hacker News thread named it directly. Commenters used the word slopware: software whose quality degraded because AI-generated code was shipped without sufficient human review. That framing hits harder than usual here. The product being critiqued is itself an AI coding assistant.

The logging bug wasn't the only complaint in the thread. GPU spikes from idle spinner animations. Memory leaks consuming 60 GB during normal sessions. Performance that users say deteriorates over time. Operations that can destroy files without adequate warning.

One commenter described a production incident where AI-generated code, with no understanding of transactional guarantees, caused data corruption. It passed code review. It passed tests. It failed in production.

The standard defense is that human-written software has always had bugs. That argument doesn't hold here. The bar isn't average software. These are tools being marketed as a way to raise the bar. The pitch breaks down when the tool ships this.

The Problem Is Not the Bug

A TRACE-level default on a SQLite write path is not a subtle mistake. The gap between 5.5 billion allocated IDs and 506,000 retained rows is not something you miss in any monitoring dashboard. A code reviewer catches this. Static analysis catches this. A pre-merge checklist catches this.

It shipped anyway. It sat there wearing drives for weeks.

That is a signal about the development process that produced it. Not a fluke, a process problem.

Developers are being asked to trust these tools with production codebases, credentials, and irreversible file operations. That trust requires the tool itself to be held to a standard. Right now it is not clear it is.

Check Your Drive Now

If you are running Codex on Linux, run sudo smartctl -a /dev/nvme0 and look at the Data Units Written field. On macOS, run diskutil info disk0 and check Disk Write Bytes.
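The Data Units Written counter is not in bytes. For NVMe drives, one data unit is 1,000 blocks of 512 bytes, or 512,000 bytes. Here is a minimal sketch of the conversion, assuming the usual smartctl line layout. The sample text is made up for illustration and is not output from a real drive.

```python
import re

# NVMe reports data units in thousands of 512-byte blocks.
BYTES_PER_DATA_UNIT = 512 * 1000


def data_units_written_bytes(smartctl_output: str) -> int:
    """Extract the Data Units Written counter and convert it to bytes."""
    match = re.search(r"Data Units Written:\s+([\d,]+)", smartctl_output)
    if match is None:
        raise ValueError("no 'Data Units Written' field found")
    units = int(match.group(1).replace(",", ""))
    return units * BYTES_PER_DATA_UNIT


# Made-up sample in the shape smartctl prints for NVMe devices.
SAMPLE_OUTPUT = """\
Data Units Read:                    10,000,000 [5.12 TB]
Data Units Written:                 72,265,625 [37.0 TB]
"""

written_tb = data_units_written_bytes(SAMPLE_OUTPUT) / 1e12
print(f"{written_tb:.1f} TB written")
```

Taking two readings a day apart and subtracting them gives a daily write rate, which can then be compared with the drive's rated TBW.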

A SQLite trigger can block new log entries as a temporary workaround. It is not a fix. Watch the GitHub issue for a real patch.
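A trigger of that kind can be sketched as follows. The table name logs and its schema are hypothetical, and the demo runs against an in-memory database rather than Codex's actual file, since neither the real table name nor the path is given here. RAISE(IGNORE) is a standard SQLite trigger function that silently abandons the statement that fired the trigger.

```python
import sqlite3

# Hypothetical table name; adjust to the real schema before using anywhere.
BLOCK_TRIGGER_SQL = """
CREATE TRIGGER IF NOT EXISTS block_log_inserts
BEFORE INSERT ON logs
BEGIN
    SELECT RAISE(IGNORE);
END;
"""


def demo_block_inserts() -> tuple[int, int]:
    """Return the row counts before and after an insert blocked by the trigger."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logs (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
    )
    conn.execute("INSERT INTO logs (body) VALUES ('written before the trigger')")
    before = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]

    conn.executescript(BLOCK_TRIGGER_SQL)
    # This insert is silently dropped by the trigger; no error is raised.
    conn.execute("INSERT INTO logs (body) VALUES ('written after the trigger')")
    after = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    conn.close()
    return before, after


before, after = demo_block_inserts()
print(f"rows before: {before}, rows after blocked insert: {after}")
```

Because the insert is dropped without an error, the application keeps running as if logging succeeded. It also means any feedback data that depends on those rows is lost, which is another reason to treat this as a stopgap.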

More broadly, monitor your AI tooling the way you monitor any persistent background process. These are not passive text editors. They run databases, maintain local state, and make system calls continuously. Treat them accordingly.

The Harder Problem

The logging bug will get fixed. It is a tractable engineering problem with a clear solution and a motivated reporter.

The harder problem is the culture that produced it. The speed pressures of AI-era development keep generating tools that skip the unglamorous work. Careful defaults. Write amplification analysis. Testing on a 3-year-old laptop with a budget SSD.

The pitch for AI coding assistants is faster shipping with fewer defects. Every time a tool ships a defect at this scale, that pitch takes a hit.

I wouldn't hold my breath waiting for this to self-correct. The incentive is to ship fast, not to check the write counter on a developer's drive.

Sources: GitHub issue #28224, openai/codex · Hacker News discussion