20 Jan 2026

Drupal.org aggregator

Dries Buytaert: Software as clay on the wheel

Image: two people shaping a clay pot on a spinning pottery wheel, their hands covered in wet clay.

A few weeks ago, Simon Willison started a coding agent, went to decorate a Christmas tree with his family, watched a movie, and came back to a working HTML5 parser.

That sounds like a party trick. It isn't.

It worked because the result was easy to check. The parser tests either pass or they don't. The type checker either accepts the code or it doesn't. In that kind of environment, the work can keep moving without much supervision.

Geoffrey Huntley's Ralph Wiggum loop is probably the cleanest expression of this idea I've seen, and it's quickly gaining popularity. In his demonstration video, he describes creating specifications through conversation with an AI agent, then letting the loop run. Each iteration starts fresh: the agent reads the specification, picks the most important remaining task, implements it, and runs the tests. If they pass, it commits and exits. The next iteration begins with empty context, reads the current state from disk, and picks up where the previous run left off.
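To make the shape of that loop concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in rather than Huntley's actual tooling: the `agent` command, the `SPEC.md` checkbox convention, and the use of pytest as the verifier. The point is the structure: fresh context every iteration, state kept only in files and git, and a test run as the only verdict that counts.

```python
import subprocess
from pathlib import Path

SPEC = Path("SPEC.md")  # hypothetical convention: tasks live here as "- [ ]" checkboxes

PROMPT = (
    "Read SPEC.md, pick the single most important unchecked task, "
    "implement it, run the tests, and tick the checkbox when you are done."
)


def unfinished_tasks() -> bool:
    """True while the spec still contains unchecked tasks."""
    return "- [ ]" in SPEC.read_text()


def tests_pass() -> bool:
    """The verifier: the test suite's exit code is the only feedback we trust."""
    return subprocess.run(["pytest", "-q"]).returncode == 0


def run_agent_once() -> None:
    """One iteration: a fresh agent process with no memory of earlier runs."""
    # "agent" is a placeholder for whatever coding-agent CLI you use.
    subprocess.run(["agent", "--prompt", PROMPT])


while unfinished_tasks():
    run_agent_once()                      # context starts empty; state lives on disk
    if tests_pass():                      # verification pulls the loop back to reality
        subprocess.run(["git", "add", "-A"])
        subprocess.run(["git", "commit", "-m", "ralph: one more task from SPEC.md"])
    else:
        subprocess.run(["git", "checkout", "--", "."])  # discard the failed attempt
```

Whether a failed iteration gets reverted or simply retried is a design choice; what matters is that nothing unverified becomes part of the state the next iteration reads.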

If you think about it, that's what human prompting already looks like: prompt, wait, review, prompt again. You're shaping the code or text the way a potter shapes clay: push a little, spin the wheel, look, push again. The Ralph loop just automates the spinning, which makes much more ambitious tasks practical.

The difference is how state is handled. When you work this way by hand, the whole conversation comes along for the ride. In the Ralph loop, it doesn't. Each iteration starts clean.

Why? Because carrying everything with you all the time is a great way to stop getting anywhere. If you're going to work on a problem for hundreds of iterations, things start to pile up. As tokens accumulate, the signal can get lost in noise. By flushing context between iterations and storing state in files, each run can start clean.

Simon Willison's port of an HTML5 parsing library from Python to JavaScript showed the principle at larger scale. Using GPT-5.2 through Codex CLI with the --yolo flag for uninterrupted execution, he gave a handful of directional prompts: API design, milestones, CI setup. Then he let it run while he decorated a Christmas tree with his family and watched a movie.

Four and a half hours later, the agent had produced a working HTML5 parser. It passed over 9,200 tests from the html5lib-tests suite. HTML5 parsing is notoriously complex. The specification precisely defines how even malformed markup should be handled, with thousands of edge cases accumulated over years. But the agent had constant grounding: each test run pulled it back to reality before errors could compound.

As Willison put it: "If you can reduce a problem to a robust test suite you can set a coding agent loop loose on it with a high degree of confidence that it will eventually succeed". Ralph loops and Willison's approach differ in structure, but both depend on tests as the source of truth.

Cursor's research on scaling agents confirms this is starting to work at enterprise scale. Their team explored what happens when hundreds of agents work concurrently on a single codebase for weeks. In one experiment, they built a web browser from scratch. Over a million lines of code across a thousand files, generated in a week. And the browser worked.

That doesn't mean it's secure, fast, or something you'd ship. It means it met the criteria they gave it. If you add security or performance to those criteria, the agents will work toward that as well. But the pattern is the same: clear tests, constant verification, agents that know when they're done.

From solo loops to hundreds of agents running in parallel, the same pattern keeps emerging. It feels like something fundamental is crystallizing: autonomous AI is starting to work well when you can accurately define success upfront.

Willison's success criteria were "simple": all 9,200 tests pass. That is a lot of tests, but the agent got there. Clear criteria made autonomy possible.

As I argued in "AI flattens interfaces and deepens foundations", this changes where humans add value:

Humans are moving to where they set direction at the start and refine results at the end. AI handles everything in between.

The title of this post comes from Geoffrey Huntley. He describes software as clay on the pottery wheel, and once you've worked this way, it's hard to think about it any other way. As Huntley wrote: "If something isn't right, you throw it back on the wheel and keep going". That's exactly how it feels. Throw it back, refine it, spin again until it's right.

Of course, the Ralph Wiggum loop has limits. It works well when verification is unambiguous. A unit test returns pass or fail. But not all problems come with clear tests. And writing tests can be a lot of work.

For example, I've been thinking about how such loops could work for Drupal, where non-technical users build pages. "Make this page more on-brand" isn't a test you can run.

Or maybe it is? An AI agent could evaluate a page against brand guidelines and return pass or fail. It could check reading level and even do some basic accessibility tests. The verifier doesn't have to be a traditional test suite. It just has to provide clear feedback.
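As a rough sketch of what that could look like: a verifier is just a function that takes an artifact and returns a pass-or-fail verdict plus feedback the loop can act on. Everything below is hypothetical, including the `ask_model` helper, which stands in for whichever LLM client you use.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    passed: bool
    feedback: str  # concrete problems to fix, fed back into the next iteration


def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for a call to whatever LLM API you use."""
    raise NotImplementedError


def verify_brand(page_html: str, brand_guidelines: str) -> Verdict:
    """A non-traditional verifier: grade a page against written brand guidelines."""
    answer = ask_model(
        "Brand guidelines:\n" + brand_guidelines
        + "\n\nPage to review:\n" + page_html
        + "\n\nAnswer PASS or FAIL on the first line, "
          "then list the specific problems to fix."
    )
    first_line, _, rest = answer.partition("\n")
    return Verdict(passed=first_line.strip().upper().startswith("PASS"),
                   feedback=rest.strip())
```

The same shape works for reading-level or basic accessibility checks; the only requirement is that the verdict is unambiguous enough for a loop to act on it.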

All of this just exposes something we already intuitively understand: defining success is hard. Really hard. When people build pages manually, they often iterate until it "feels right". They know what they want when they see it, but can't always articulate it upfront. Or they hire experts who carry that judgment from years of experience. This is the part of the work that's hardest to automate. The craft is moving upstream, from implementation to specification and validation.

The question for any task is becoming: can you tell, reliably, whether the result is getting better or worse? Where you can, the loop takes over. Where you can't, your judgment still matters.

The boundary keeps moving fast. A year ago, I was wrestling with local LLMs to generate good alt-text for images. Today, AI agents build working HTML5 parsers while you watch a movie. It's hard not to find that a little absurd. And hard not to be excited.

20 Jan 2026 7:39pm GMT

Droptica: AGENTS.md Tool: How AI Actually Speeds Up Drupal Work

Friday, 2:00 PM. New developer, production bug. Something's broken with a custom queue worker. In the past, this meant tracking down the previous developer, consultations, and wasted time, all on a Friday. Now? The developer asks the AI, and it responds with useful answers because it knows the project. How? Just one file: AGENTS.md.

20 Jan 2026 12:14pm GMT

Specbee: Understanding Entity Reference Revisions in Drupal

Losing track of revisions after repeated edits in Drupal? Learn how Entity Reference Revisions preserves content integrity by ensuring safe edits, accurate history, and reliable workflows.

20 Jan 2026 5:44am GMT

19 Jan 2026

Symfony Blog

SymfonyLive Paris 2026: "Édition simultanée : Facile avec Symfony UX"

SymfonyLive Paris 2026, a French-language-only conference, will take place from March 26 to 27! The schedule is being revealed progressively. More details are available here. 🎤 New talk announced at SymfonyLive Paris 2026! With…

19 Jan 2026 11:00am GMT

18 Jan 2026

Symfony Blog

A Week of Symfony #994 (January 12–18, 2026)

This week, Symfony development activity focused on improving the HTTP Cache attribute and making some changes to controller event attributes. Meanwhile, we published more information about the upcoming SymfonyLive Paris 2026 conference. Lastly, we introduced…

18 Jan 2026 8:18am GMT

14 Jan 2026

Symfony Blog

Introducing the Symfony 8 Certification

Symfony 8 was released at the end of November 2025, alongside Symfony 7.4. Both versions share the exact same features, but Symfony 8.0 removes all deprecated features and requires PHP 8.4 or higher. Today, we're introducing the new certification exam for…

14 Jan 2026 9:27am GMT