09 Apr 2026
Planet Python
Rodrigo Girão Serrão: Who wants to be a millionaire: iterables edition
Play this short quiz to test your Python knowledge!
At PyCon Lithuania 2026 I did a lightning talk where I presented a "Who wants to be a millionaire?" Python quiz, themed around iterables. There's a whole performance during the lightning talk which was recorded and will be eventually linked to from here. This article includes only the four questions, the options presented, and a basic system that allows you to check whether you got it right or not.
Question 1
This is an easy one to get you started. It makes more sense if you watch the performance of the lightning talk.
What is the output of the following Python program?
print("Hello, world!")
- Hello, world!
- Hello world!
- Hello world
- Hello world!!
Question 2
What is the output of the following Python program?
squares = (x ** 2 for x in range(3))
print(type(squares))
- <class 'generator'>
- <class 'gen_expr'>
- <class 'list'>
- <class 'tuple'>
Question 3
This was a reference to the talk I'd given earlier today, where I talked about tee, the only object in itertools that is not an iterable.
Out of the 20, how many objects in itertools are iterables?
- 19
- 20
- 1
- 0
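One rough way to poke at this yourself, as a sketch rather than the method used in the talk: list the public names in itertools and separate the classes (whose instances are iterators) from the plain functions. Reading "is an iterable" as "is a class" is an assumption made for this illustration, and the result is only expected to hold on CPython.

```python
import itertools

# Public names exported by the itertools module.
public_names = [name for name in dir(itertools) if not name.startswith("_")]

# Classes produce iterator instances when called; plain functions are not classes.
classes = [n for n in public_names if isinstance(getattr(itertools, n), type)]
functions = [n for n in public_names if not isinstance(getattr(itertools, n), type)]

print(len(public_names), "public names in this Python version")
print("not classes:", functions)
```

The total depends on your Python version, since batched and pairwise were added in later releases.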
Question 4
What is the output of the following Python program?
from itertools import *
print(sum(chain.from_iterable(chain(*next(
islice(permutations(islice(batched(pairwise(
count()),5),3,9)),15,None)))))
- 1800
- 0
- 🇱🇹❤️🐍
- SyntaxError
09 Apr 2026 9:17pm GMT
Rodrigo Girão Serrão: uv skills for coding agents
This article shares two skills you can add to your coding agents so they use uv workflows.
I have fully adopted uv into my workflows and most of the time I want my coding agents to use uv workflows as well, like when running any Python code or managing and running scripts that may or may not have dependencies.
To make this more convenient for me, I created two SKILL.md files for two of the most common workflows that the coding agents get wrong on the first few tries:
- python-via-uv: this skill tells the agent that it should use uv whenever it wants to run any piece of Python code, be it one-liners or scripts. This is relevant because I don't even have the command python/python3 in the shell path, so whenever the LLM tries running something with python ..., it fails.
- uv-script-workflow: this skill is specifically for when the agent wants to create and run a script. It instructs the LLM to initialise the script with uv init --script ... and then tells it about the relevant commands to manage the script dependencies.
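For context, here is a minimal sketch of the kind of file the script workflow revolves around: a single Python file carrying inline script metadata (PEP 723) in a comment block at the top. The file name `example.py` and the `greeting` function are hypothetical, and the exact template that `uv init --script` writes may differ from this.

```python
# /// script
# requires-python = ">=3.12"
# dependencies = []
# ///
# The comment block above is inline script metadata (PEP 723); uv reads it
# to pick an interpreter and install dependencies before running the script.


def greeting(name: str = "world") -> str:
    """Return the message the script prints."""
    return f"Hello, {name}!"


def main() -> None:
    print(greeting())


if __name__ == "__main__":
    main()
```

With a file like this, `uv run example.py` runs it and `uv add --script example.py <package>` records a new dependency in the metadata block.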
The two skills also add a note about sandboxing, since uv's default cache directory will be outside your sandbox. When that's the case, the agent is already instructed to use a valid temporary location for the uv cache.
Installing a skill usually just means dropping a Markdown file in the correct folder, but you should check the documentation for the tools you use.
Here are the two skills for you to download:
I also included the skills verbatim here, for your convenience:
Skill for python-via-uv
---
name: python-via-uv
description: Enforce Python execution through `uv` instead of direct interpreter calls. Use when Codex needs to run Python scripts, modules, one-liners, tools, test runners, or package commands in a workspace and should avoid invoking `python` or `python3` directly.
---
# Python Via Uv
Use `uv` for every Python command.
Do not run `python`.
Do not run `python3`.
Do not suggest `python` or `python3` in instructions unless the user explicitly requires them and the constraint must be called out as a conflict.
## Execution Rules
When sandboxed, set `UV_CACHE_DIR` to a temporary directory the agent can write to before running `uv` commands.
Prefer these patterns:
- Run a script: `UV_CACHE_DIR=/tmp/uv-cache uv run path/to/script.py`
- Run a module: `UV_CACHE_DIR=/tmp/uv-cache uv run -m package.module`
- Run a one-liner: `UV_CACHE_DIR=/tmp/uv-cache uv run python -c "print('hello')"`
- Run a tool exposed by dependencies: `UV_CACHE_DIR=/tmp/uv-cache uv run tool-name`
- Add a dependency for an ad hoc command: `UV_CACHE_DIR=/tmp/uv-cache uv run --with <package> python -c "..."`
## Notes
Using `python` inside `uv run ...` is acceptable because `uv` is still the entrypoint controlling interpreter selection and environment setup.
If the workspace already defines a project-specific temporary cache directory, prefer that over `/tmp/uv-cache`.
If a command example or existing documentation uses `python` or `python3` directly, translate it to the closest `uv` form before executing it....
09 Apr 2026 12:19pm GMT
Real Python: Quiz: Reading Input and Writing Output in Python
In this quiz, you'll test your understanding of Reading Input and Writing Output in Python.
By working through this quiz, you'll revisit taking keyboard input with input(), showing results with print(), formatting output, and handling basic input types.
This quiz helps you practice building simple interactive scripts and reinforces best practices for clear console input and output.
09 Apr 2026 12:00pm GMT
08 Apr 2026
Django community aggregator: Community blog posts
Switching all of my Python packages to PyPI trusted publishing
As I have teased on Mastodon, I'm switching all of my packages to PyPI trusted publishing. I have been using it to release the django-debug-toolbar a few times but never set it up myself. The process seemed tedious.
The malicious releases uploaded to PyPI two weeks ago and the blog post about digital attestations in pylock.toml finally pushed me to make the switch. All of my PyPI tokens have been revoked so there is no quick shortcut.
Note
I'm also looking at other code hosting platforms. I have been using git before GitHub existed and I'll probably still use git when GitHub has completed its enshittification. For now the cost/benefit ratio of staying on GitHub is still positive for me. Trusted publishing isn't available everywhere, so for now it is GitHub anyway.
In the end, switching an existing project was easier than expected. I have completed the process for django-prose-editor and feincms3-cookiecontrol.
For my future benefit, here are the step by step instructions I have to follow:
- Have a package which is buildable using e.g. uvx build
- On PyPI add a trusted publisher in the project's publishing settings:
  - Owner: matthiask, feincms, feinheit, whatever the user or organization's name is.
  - Repository: django-prose-editor
  - Workflow name: publish.yml
  - Environment: release
- In the GitHub repository, create a release environment in Settings / Environments. Add myself and potentially also other releasers as a required reviewer. I allow self-review and disallow administrators to bypass the protection rules.
- Run git tag x.y.z and git push, no more uvx twine or hatch publish.
- Approve the release in the actions tab on the repository.
- Either enjoy or swear and repeat the steps.
I'm happy with testing the release process in production. The older I get the less I care if people think I'm stupid. That's also why feincms3-cookiecontrol 1.7.0 doesn't exist, only 1.7.1 - the process failed and I had to bump the patch version and try again. Copy the publish.yml from a known good place, for example from the django-prose-editor repository. I have added the if: github.repository == 'feincms/django-prose-editor' statement which ensures that the workflow only runs in the main repository, but that's optional if you don't care about failing workflows.
08 Apr 2026 5:00pm GMT
New Package: Django Dependency Map
I have recently been reading Swizec Teller's new book Scaling Fast. In it he mentions architectural complexity, which reminded me of my desire for a tool that combines database dependencies and import dependencies between Django apps. To date, I have used other tools such as graph_models from Django extensions, pyreverse from Pylint and, most recently, import-linter. They all do bits of the job, but require manual stitching together to get a cohesive graph of everything overlaid in the right way. So over the last couple of days I've built a new package which combines all of this into a live view which updates as you build your app, a management command and a panel for Debug Toolbar.
Why the Django app level, you ask? Primarily, I do find model-level graphs good, but they can get a little too complicated, with a few too many lines. The same goes for imports: at the module level within an app, or with everything separated out, there is too much noise relative to signal to really understand the logical relationships between different components in the system. I like to think that Django apps naturally represent the logical parts of a project or a system. A project is obviously too large a unit unless you're dealing with multiple projects, but within a single Django project, having an app deal with one thing is a good representation. I know you can structure Django projects & apps in many ways, so it'd be interesting to see this tool used on other people's project structures that aren't one app per logical component.
So without further ado, here is Django Dependency Map, which combines output from Django extensions' graph_models and grimp (the library used by import-linter) to dynamically map the dependencies between your different apps and third-party apps. Initially, it was a management command which outputs an HTML file, and that still exists. I then added a live view, and there's an integration into Django Debug Toolbar.
The live map page has the following features:
- Hide nodes to see how the dependencies change
- Force graph & hierarchical graph representations
- Detailed information on a single app and its relationships
- Import cycle detection
- Import violations from import-linter
- Debug toolbar panel
- Export of the graph to mermaid & dot formats
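To make the app-level idea and the cycle detection feature concrete, here is a small standard-library sketch. It is not how Django Dependency Map is implemented (the package builds on graph_models and grimp); the module names are hypothetical and the rule "first dotted component is the app" is an assumption made for the illustration.

```python
# Hypothetical module-level imports: (importing module, imported module).
MODULE_IMPORTS = [
    ("orders.views", "orders.models"),
    ("orders.models", "customers.models"),
    ("customers.signals", "billing.services"),
    ("billing.services", "orders.models"),
]


def app_of(module: str) -> str:
    """Treat the first dotted component as the Django app name."""
    return module.split(".", 1)[0]


def app_graph(module_imports):
    """Collapse module-level edges into app-level edges, dropping self-imports."""
    graph = {}
    for importer, imported in module_imports:
        src, dst = app_of(importer), app_of(imported)
        deps = graph.setdefault(src, set())
        if src != dst:
            deps.add(dst)
            graph.setdefault(dst, set())  # imported apps are nodes too
    return {app: sorted(deps) for app, deps in graph.items()}


def find_cycle(graph):
    """Return one import cycle as a list of apps, or None if there is none."""
    visiting, done = [], set()

    def visit(app):
        if app in done:
            return None
        if app in visiting:
            # We walked back into an app on the current path: that is a cycle.
            return visiting[visiting.index(app):] + [app]
        visiting.append(app)
        for dep in graph.get(app, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(app)
        return None

    for app in graph:
        cycle = visit(app)
        if cycle:
            return cycle
    return None


graph = app_graph(MODULE_IMPORTS)
print(graph)
print(find_cycle(graph))
```

Four module-level edges collapse into three app-level edges here, which is the noise reduction the app-level view is after.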
My hope is twofold. First, it might reveal things about your projects that you didn't know about, in terms of how interlinked things are. Second, I hope it may change the way you build your Django apps. I'm hoping to keep it open in another tab and watch it as I'm building things (or as an agent is building things), using it as a sense check that the work matches what I expect in terms of overall architecture rather than at the code level.
The pypi package is coming very soon, but you can visit the repo here: https://github.com/softwarecrafts/django-dependency-map
08 Apr 2026 5:00am GMT
How I configured OpenClaw's multi-model setup (so you don't have to)
A heads up before we start: over 95% of this blog post was written by my OpenClaw bot running GLM-5. I reviewed, edited, and approved everything, but credit where it's due: Tepui ⛰️ (yes, I named my AI) did most of the heavy lifting.
I need to vent, but in a good way this time.
Last week I vented about Anthropic pushing away paying customers. After that third-party ban hit, I had to rip out Claude Opus 4.6 from my OpenClaw setup and find alternatives. So I rebuilt the whole thing from scratch.
This time, I did it right.
What I wanted
I use OpenClaw as my personal AI assistant. It connects to my Telegram, manages my calendar, runs cron jobs, helps with research, and generally makes my life easier. Before the ban, it was running Claude Opus 4.6. After the ban, I needed alternatives.
My requirements were simple:
- Free or cheap (Lazer's LiteLLM proxy gives us free access to certain models)
- A text model for daily use (fast, capable reasoning)
- A vision model for images and PDFs (I send screenshots, receipts, documents)
- Image generation (sometimes I need to create images)
- A fallback if something breaks
What I got was so much more.
The journey
The whole process took about two hours. I started with a simple question: "What's the best model to use with OpenClaw?"
First thing I did was pull the model catalog from models.dev. If you're not familiar, it's a JSON file maintained by the OpenAI-compatible API community that lists every model from every provider with their specs: context window, token limits, pricing, capabilities, everything. I pulled it to /tmp/models.dev.json and started digging.
curl -s https://models.dev/api.json > /tmp/models.dev.json
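If you would rather dig through the downloaded file from Python, here is a minimal sketch. It deliberately assumes nothing about the layout of the JSON: it just walks the whole structure and reports every key containing a search term. The helper name `find_keys` is made up for this example.

```python
import json
from pathlib import Path


def find_keys(node, needle, path=()):
    """Yield (path, value) for every dict key containing `needle` (case-insensitive)."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = path + (key,)
            if needle.lower() in str(key).lower():
                yield here, value
            yield from find_keys(value, needle, here)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_keys(value, needle, path + (index,))


if __name__ == "__main__":
    catalog_path = Path("/tmp/models.dev.json")  # file written by the curl command
    if catalog_path.exists():
        catalog = json.loads(catalog_path.read_text())
        for path, _ in find_keys(catalog, "glm-5"):
            print(" / ".join(map(str, path)))
```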
Then I checked the Lazer proxy to see what models were actually available. Lazer Technologies (where I work) gives employees free access to a curated set of models through their LiteLLM proxy. The API is OpenAI-compatible, so you just query /v1/models:
curl -s https://llm.lazertechnologies.com/v1/models \
-H "Authorization: Bearer $LAZER_API_KEY"
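The same query can be sketched in Python with only the standard library. This assumes the proxy answers in the usual OpenAI list shape, an object with a `data` array whose entries carry an `id`; the function names are hypothetical.

```python
import json
import os
import urllib.request


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style `/v1/models` response body."""
    return sorted(entry["id"] for entry in payload.get("data", []))


def list_models(base_url: str, api_key: str) -> list[str]:
    """GET {base_url}/v1/models with a bearer token and return the model ids."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_model_ids(json.load(response))


if __name__ == "__main__":
    key = os.environ.get("LAZER_API_KEY")
    if key:  # only hit the network when a key is configured
        for model_id in list_models("https://llm.lazertechnologies.com", key):
            print(model_id)
```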
The big ones available through Lazer:
- GLM-5 : Open source, 200K context, reasoning-enabled, competitive with frontier models
- GLM-4.6V : Vision model (text + images), also reasoning-enabled
- GPT-OSS-120b-Turbo : Fast, cheap, reasoning model
- Kimi-K2.5-Turbo : Multimodal (text + image + video)
These are all free for Lazer employees. If you don't have that luxury, the same models are available through DeepInfra or OpenRouter at reasonable prices.
The problem with my existing setup
My OpenClaw config was bare. I had:
- A primary model: MiniMax M2.7 through OpenRouter
- No fallback model configured
- No image model configured
- No image generation model
- No PDF model
And the MiniMax model was timing out on my cron jobs. The Montevideo Events Report job was failing because MiniMax M2.7 was too slow for complex reasoning tasks. I needed something faster, and free through Lazer.
The realization about model slots
This is where I learned something new. OpenClaw doesn't just have one "model" config. It has six:
- agents.defaults.model: Primary text model (plus fallbacks)
- agents.defaults.imageModel: For image input (when primary can't accept images)
- agents.defaults.pdfModel: For PDF parsing (falls back to imageModel)
- agents.defaults.imageGenerationModel: For creating images (not just viewing them)
- agents.defaults.musicGenerationModel: For music generation
- agents.defaults.videoGenerationModel: For video generation
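As a mental model of how the first three slots relate, here is a small sketch of the selection rules as described in the list above. This is not OpenClaw's code: the function, the task names and the placeholder model ids are all hypothetical, and what happens when no slot applies is an assumption.

```python
from typing import Optional

# Hypothetical slot table mirroring the names listed above; values are placeholders.
SLOTS = {
    "model": "text-model",
    "imageModel": "vision-model",
    "pdfModel": None,  # unset on purpose to show the fallback
}


def pick_model(slots: dict, task: str, primary_accepts_images: bool = False) -> Optional[str]:
    """Pick a model id for a task: 'text', 'image' or 'pdf'."""
    if task == "text":
        return slots.get("model")
    if task == "image":
        # The image slot is only needed when the primary model is text-only.
        if primary_accepts_images:
            return slots.get("model")
        return slots.get("imageModel")
    if task == "pdf":
        # PDF parsing falls back to the image model when no PDF model is set.
        return slots.get("pdfModel") or slots.get("imageModel")
    raise ValueError(f"unknown task: {task}")


print(pick_model(SLOTS, "pdf"))
```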
I was only using slot #1. No wonder images weren't working right.
What I configured
After some back-and-forth with Tepui (yes, I named my AI), here's what we landed on:
| Role | Model | Provider | Cost (per 1M tokens) |
|---|---|---|---|
| Primary text | GLM-5 | Lazer | $0.80 / $2.56 |
| Fallback | GLM-5 | OpenRouter | $0.80 / $2.56 |
| Image/PDF input | GLM-4.6V | Lazer | $0.30 / $0.90 |
| Image generation | Gemini 3.1 Flash Image | OpenRouter | $0.50 / $3.00 |
| Quick tasks (reserve) | GPT-OSS-120b-Turbo | Lazer | $0.15 / $0.60 |
| Video-capable (reserve) | Kimi-K2.5-Turbo | Lazer | $0.60 / $3.00 |
I'm putting actual prices in because Lazer's proxy is free for me, but I want to track costs as if I were paying. That way I know the real value of what I'm using.
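Tracking costs "as if I were paying" boils down to simple arithmetic. The sketch below reads the two prices in each row as input / output price per 1M tokens, which is an assumption about the table, and the token counts are made up.

```python
def cost_usd(input_tokens: int, output_tokens: int, input_price: float, output_price: float) -> float:
    """Cost in USD, with prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# GLM-5 row from the table: $0.80 input / $2.56 output per 1M tokens.
# 120,000 input tokens and 8,000 output tokens are illustrative numbers.
print(f"${cost_usd(120_000, 8_000, 0.80, 2.56):.4f}")
```

For those numbers that is 0.12 × 0.80 + 0.008 × 2.56 = 0.096 + 0.02048, or roughly 12 cents.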
Why these choices?
GLM-5 for text. It's the best open-source reasoning model available. 200K context window, MIT licensed, competitive with GPT-4 on agentic tasks. I tested it with quick prompts and it's snappy.
GLM-5 via OpenRouter as fallback. Same model, different provider. If the Lazer proxy goes down, OpenClaw keeps working through OpenRouter with the exact same model. No quality drop, just a different route. I also kept MiniMax M2.7 in the allowlist so I can switch to it manually if I ever need to.
GLM-4.6V for images and PDFs. This was the key insight. GLM-5 is text-only. For images and PDFs, I needed a vision model. GLM-4.6V handles both, and it's on the same Lazer proxy. This means my cron jobs can parse images (like parking receipts) without hitting paid APIs.
Fun fact: I actually added the GLM-4.6V model to my config from my car while waiting for my girlfriend to finish her driving classes. I was using my OpenCode server running at home, connected through WireGuard on my phone. Pulled the model specs from models.dev, updated the config, tested it with a screenshot. All from the car. That's the beauty of having your tools always running and always accessible.
Gemini 3.1 Flash Image for generation. I didn't have any image generation set up. Tepui suggested Flux.2 Pro (free on OpenRouter) but I wanted something more capable. Gemini 3.1 Flash Image generates high-quality images for about $3 per million output tokens. Worth it for occasional use.
The config changes
Here's what I actually changed in ~/.openclaw/openclaw.json:
// Primary model with fallback
agents.defaults.model: {
primary: "lazer/deepinfra/zai-org/GLM-5",
fallbacks: ["openrouter/zai-org/GLM-5"]
}
// Vision for images and PDFs
agents.defaults.imageModel: {
primary: "lazer/deepinfra/zai-org/GLM-4.6V"
}
agents.defaults.pdfModel: {
primary: "lazer/deepinfra/zai-org/GLM-4.6V"
}
// Image generation
agents.defaults.imageGenerationModel: {
primary: "openrouter/google/gemini-3.1-flash-image-preview"
}
I also added all the models to the allowlist with aliases so I can switch easily:
agents.defaults.models: {
"openrouter/minimax/minimax-m2.7": { alias: "MiniMax" },
"openrouter/zai-org/GLM-5": { alias: "GLM-5-OR" },
"lazer/deepinfra/zai-org/GLM-5": { alias: "GLM-5" },
"lazer/deepinfra/zai-org/GLM-4.6V": { alias: "GLM-4.6V" },
"lazer/deepinfra/openai/gpt-oss-120b-Turbo": { alias: "GPT-OSS" },
"lazer/deepinfra/moonshotai/Kimi-K2.5-Turbo": { alias: "Kimi" },
"openrouter/google/gemini-3.1-flash-image-preview": { alias: "Gemini-Image" }
}
The aliases make it easy to switch with /model commands in chat.
Testing it
I tested everything:
- Sent a picture of my car's steering wheel. Correctly identified the Mitsubishi logo.
- Sent a parking receipt from the airport. Correctly parsed it (and Tepui correctly identified it was my grandma's flight, not my girlfriend's, by checking my calendar).
- Sent a screenshot of my Spotify. Correctly identified the band (High Fade playing "Gossip").
The vision model works. The text model works. The fallback is there if something breaks.
What I learned
GLM-5 doesn't support images. The model name sounds like it should be the successor to GLM-4.6V, but it's text-only. For vision, you need GLM-4.6V specifically.
Model config fields are strict. OpenClaw's schema only accepts certain fields: id, name, input, contextWindow, maxTokens, reasoning, cost, api. Things like tool_call and temperature get rejected.
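A quick way to catch this before editing the config is to filter a model entry against the accepted field names. This is only a sketch: the allowed set is copied from the paragraph above, while the example entry and its values are hypothetical.

```python
ALLOWED_FIELDS = {"id", "name", "input", "contextWindow", "maxTokens", "reasoning", "cost", "api"}


def unknown_fields(entry: dict) -> list[str]:
    """Names in `entry` that the schema would reject."""
    return sorted(set(entry) - ALLOWED_FIELDS)


def strip_unknown(entry: dict) -> dict:
    """Copy of `entry` with only the accepted fields."""
    return {key: value for key, value in entry.items() if key in ALLOWED_FIELDS}


# Hypothetical entry copied from a catalog, including two rejected fields.
entry = {
    "id": "example-model",
    "name": "Example Model",
    "contextWindow": 200_000,
    "reasoning": True,
    "tool_call": True,
    "temperature": True,
}

print(unknown_fields(entry))
print(strip_unknown(entry))
```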
models.dev is the source of truth. Don't rely on memory or provider docs. Pull the JSON and check the specs yourself.
OpenClaw model slots matter. If you're only configuring one model, you're missing out on image parsing, PDF reading, and image generation. Set up all six slots.
Pricing matters even when free. I have free access through Lazer, but I still track prices. It helps me understand the cost of what I'm doing and compare alternatives.
The result
My OpenClaw setup is now:
- Free through the Lazer proxy for everyday use
- Fast with GLM-5 for reasoning tasks
- Vision-capable with GLM-4.6V for images and PDFs
- Image-generating with Gemini for when I need to create visuals
- Resilient with GLM-5 fallback through OpenRouter (same model, different provider)
And the cron jobs that were timing out? They're running fine now. The Montevideo Events Report takes 23 seconds instead of timing out at 75 seconds.
Not bad for two hours of work.
What's next
I kept GPT-OSS-120b-Turbo and Kimi-K2.5-Turbo in reserve. GPT-OSS is cheaper than GLM-5 for quick tasks, so I might use it as a second fallback. Kimi has video support, which could be useful if I ever need to analyze video frames.
But for now, this setup covers everything I need. Text, images, PDFs, generation, fallbacks. All configured properly with the right models in the right slots.
If you're running OpenClaw (or any AI assistant), do yourself a favor: check your model config. Make sure you're using the right slots. Pull the specs from models.dev. Track your actual costs. And test everything with real inputs.
It's worth the two hours.
See you in the next one!
08 Apr 2026 5:00am GMT
04 Apr 2026
Planet Twisted
Donovan Preston: Using osascript with terminal agents on macOS
Here is a useful trick that is unreasonably effective for simple computer use goals using modern terminal agents. On macOS, there has been a terminal osascript command since the original release of Mac OS X. All you have to do is suggest your agent use it and it can perform any application control action available in any AppleScript dictionary for any Mac app. No MCP setup or tools required at all. Agents are much more adept at using raw terminal commands, especially ones that haven't changed in 30 years. Having a computer control interface that hasn't changed in 30 years and has extensive examples in the Internet corpus makes modern models understand how to use these tools basically effortlessly.
macOS locks down these permissions pretty heavily nowadays though, so you will have to grant the application control permission to Terminal. But once you have done that, the range of possibilities for commanding applications using natural language is quite extensive. Also, for both Safari and Chrome on Mac, you are going to want to turn on the JavaScript over AppleScript permission. This basically allows Claude or another agent to debug your web applications live for you as you are using them. In Chrome, go to the View menu, Developer submenu, and choose "Allow JavaScript from Apple events". In Safari, it's under the Safari menu, Settings, Developer, "Allow JavaScript from Apple events".
Then you can do something like "Hey Claude, would you please use osascript to navigate the front Chrome tab to Hacker News". Once you suggest using osascript in a session it will figure out pretty quickly what it can do with it. Of course you can ask it to do casual things like open your mail app or whatever. Then you can figure out what other things will work, like "please click around my web app" or "check the JavaScript console for errors".
Another very important tip for using modern agents is to practice using speech to text. I think speaking might be something like five times faster than typing. It takes a lot of time to get used to, especially after a lifetime of programming by typing, but it's a very interesting and different experience, and once you have a lot of practice it starts to feel effortless.
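To give a feel for what the agent ends up running, here is a sketch of the Hacker News example driven from Python. The helper names are made up for illustration, the AppleScript is one plausible phrasing rather than the only one, and it only does anything on macOS with the permissions described above.

```python
import subprocess
import sys


def applescript_string(text: str) -> str:
    """Quote text as an AppleScript string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def chrome_navigate_script(url: str) -> str:
    """AppleScript that points the active tab of the front Chrome window at `url`."""
    return (
        'tell application "Google Chrome" to set URL of active tab of front window to '
        + applescript_string(url)
    )


def osascript_argv(script: str) -> list[str]:
    """Command line that runs a one-line AppleScript through osascript."""
    return ["osascript", "-e", script]


# Usage on macOS: pass the URL as the first command-line argument.
if __name__ == "__main__" and sys.platform == "darwin" and len(sys.argv) > 1:
    subprocess.run(osascript_argv(chrome_navigate_script(sys.argv[1])), check=True)
```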
04 Apr 2026 1:31pm GMT
16 Mar 2026
Planet Twisted
Donovan Preston: "Start Drag" and "Drop" to select text with macOS Voice Control
I have been using macOS Voice Control for about three years. At first it was a way to reduce pain from excessive computer use. It has been a real struggle. Decades of computer use habits with typing and the mouse are hard to overcome! Text selection manipulation commands work quite well in native macOS apps, such as apps written in Swift, or in Safari with an accessibly tagged webpage. However, many webpages and Electron apps (such as Visual Studio Code) have serious problems manipulating the selection: "select foo", where foo is a word in the text box to select, may not work at all, and there are off-by-one errors when manipulating the cursor position or extending the selection. I only recently expanded my repertoire with the "start drag" and "drop" commands, previously having used "Click and hold mouse", "move cursor to x", and "release mouse". Well, now I have discovered that using "start drag x" and "drop x" makes a fantastic text selection method! This is really going to improve my speed. In the long run, I believe computer voice control in general is going to end up being faster than WIMP, but for now the awkwardly rigid command phrasing and the number of times it misses or misunderstands commands still really hold it back. I've been learning the macOS Voice Control specific command set for years now and I still reach for the keyboard and mouse way too often.
16 Mar 2026 11:04am GMT
04 Mar 2026
Planet Twisted
Glyph Lefkowitz: What Is Code Review For?
Humans Are Bad At Perceiving
Humans are not particularly good at catching bugs. For one thing, we get tired easily. There is some science on this, indicating that humans can't even maintain enough concentration to review more than about 400 lines of code at a time.
We have existing terms of art, in various fields, for the ways in which the human perceptual system fails to register stimuli. Perception fails when humans are distracted, tired, overloaded, or merely improperly engaged.
Each of these has implications for the fundamental limitations of code review as an engineering practice:
-
Inattentional Blindness: you won't be able to reliably find bugs that you're not looking for.
-
Repetition Blindness: you won't be able to reliably find bugs that you are looking for, if they keep occurring.
-
Vigilance Fatigue: you won't be able to reliably find either kind of bugs, if you have to keep being alert to the presence of bugs all the time.
-
and, of course, the distinct but related Alert Fatigue: you won't even be able to reliably evaluate reports of possible bugs, if there are too many false positives.
Never Send A Human To Do A Machine's Job
When you need to catch a category of error in your code reliably, you will need a deterministic tool to evaluate - and, thanks to our old friend "alert fatigue" above - ideally, to also remedy that type of error. These tools will relieve the need for a human to make the same repetitive checks over and over. None of them are perfect, but:
- to catch logical errors, use automated tests.
- to catch formatting errors, use autoformatters.
- to catch common mistakes, use linters.
- to catch common security problems, use a security scanner.
Don't blame reviewers for missing these things.
Code review should not be how you catch bugs.
What Is Code Review For, Then?
Code review is for three things.
First, code review is for catching process failures. If a reviewer has noticed a few bugs of the same type in code review, that's a sign that that type of bug is probably getting through review more often than it's getting caught. Which means it's time to figure out a way to deploy a tool or a test into CI that will reliably prevent that class of error, without requiring reviewers to be vigilant to it any more.
Second - and this is actually its more important purpose - code review is a tool for acculturation. Even if you already have good tools, good processes, and good documentation, new members of the team won't necessarily know about those things. Code review is an opportunity for older members of the team to introduce newer ones to existing tools, patterns, or areas of responsibility. If you're building an observer pattern, you might not realize that the codebase you're working in already has an existing idiom for doing that, so you wouldn't even think to search for it, but someone else who has worked more with the code might know about it and help you avoid repetition.
You will notice that I carefully avoided saying "junior" or "senior" in that paragraph. Sometimes the newer team member is actually more senior. But also, the acculturation goes both ways. This is the third thing that code review is for: disrupting your team's culture and avoiding stagnation. If you have new talent, a fresh perspective can also be an extremely valuable tool for building a healthy culture. If you're new to a team and trying to build something with an observer pattern, and this codebase has no tools for that, but your last job did, and it used one from an open source library, that is a good thing to point out in a review as well. It's an opportunity to spot areas for improvement to culture, as much as it is to spot areas for improvement to process.
Thus, code review should be as hierarchically flat as possible. If the goal of code review were to spot bugs, it would make sense to reserve the ability to review code to only the most senior, detail-oriented, rigorous engineers in the organization. But most teams already know that that's a recipe for brittleness, stagnation and bottlenecks. Thus, even though we know that not everyone on the team will be equally good at spotting bugs, it is very common in most teams to allow anyone past some fairly low minimum seniority bar to do reviews, often as low as "everyone on the team who has finished onboarding".
Oops, Surprise, This Post Is Actually About LLMs Again
Sigh. I'm as disappointed as you are, but there are no two ways about it: LLM code generators are everywhere now, and we need to talk about how to deal with them. Thus, an important corollary of this understanding that code review is a social activity, is that LLMs are not social actors, thus you cannot rely on code review to inspect their output.
My own personal preference would be to eschew their use entirely, but in the spirit of harm reduction, if you're going to use LLMs to generate code, you need to remember the ways in which LLMs are not like human beings.
When you relate to a human colleague, you will expect that:
- you can make decisions about what to focus on based on their level of experience and areas of expertise; from a late-career colleague you might be looking for bad habits held over from legacy programming languages; from an earlier-career colleague you might be focused more on logical test-coverage gaps,
- and, they will learn from repeated interactions so that you can gradually focus less on a specific type of problem once you have seen that they've learned how to address it.
With an LLM, by contrast, while errors can certainly be biased a bit by the prompt from the engineer and pre-prompts that might exist in the repository, the types of errors that the LLM will make are somewhat more uniformly distributed across the experience range.
You will still find supposedly extremely sophisticated LLMs making extremely common mistakes, specifically because they are common, and thus appear frequently in the training data.
The LLM also can't really learn. An intuitive response to this problem is to simply continue adding more and more instructions to its pre-prompt, treating that text file as its "memory", but that just doesn't work, and probably never will. The problem - "context rot" - is somewhat fundamental to the nature of the technology.
Thus, code-generators must be treated more adversarially than you would a human code review partner. When you notice it making errors, you always have to add tests to a mechanical, deterministic harness that will evaluate the code, because the LLM cannot meaningfully learn from its mistakes outside a very small context window in the way that a human would, so giving it feedback is unhelpful. Asking it to just generate the code again still requires you to review it all again, and as we have previously learned, you, a human, cannot review more than 400 lines at once.
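As one illustration of what such a mechanical, deterministic check might look like, here is a sketch that flags a well-known common mistake, mutable default arguments, using only the standard library. The choice of mistake and the function name are assumptions made for the example; in practice an existing linter rule would usually do this job.

```python
import ast

MUTABLE_LITERALS = (ast.List, ast.Dict, ast.Set)


def mutable_default_args(source: str) -> list[tuple[str, int]]:
    """Return (function name, line number) for each mutable default argument."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Positional defaults plus keyword-only defaults (None means "no default").
            defaults = node.args.defaults + [d for d in node.args.kw_defaults if d is not None]
            for default in defaults:
                if isinstance(default, MUTABLE_LITERALS):
                    findings.append((node.name, default.lineno))
    return sorted(findings, key=lambda finding: finding[1])


SAMPLE = '''
def add_item(item, bucket=[]):
    bucket.append(item)
    return bucket

def safe(item, bucket=None):
    return [item] if bucket is None else bucket + [item]
'''

print(mutable_default_args(SAMPLE))
```

Wired into CI so that any finding fails the build, a check like this stops depending on a reviewer noticing the pattern.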
To Sum Up
Code review is a social process, and you should treat it as such. When you're reviewing code from humans, share knowledge and encouragement as much as you share bugs or unmet technical requirements.
If you must review code from an LLM, strengthen your automated code-quality verification tooling and make sure that its agentic loop will fail on its own, immediately, when those quality checks fail next time. Do not fall into the trap of appealing to its feelings, knowledge, or experience, because it doesn't have any of those things.
But for both humans and LLMs, do not fall into the trap of thinking that your code review process is catching your bugs. That's not its job.
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor!
04 Mar 2026 5:24am GMT
22 Jan 2026
Planet Plone - Where Developers And Integrators Write
Maurits van Rees: Mikel Larreategi: How we deploy cookieplone based projects.

We saw that cookieplone was coming up, and Docker, and, as a game changer, uv, which makes the installation of Python packages much faster.
With cookieplone you get a monorepo, with folders for backend, frontend, and devops. devops contains scripts to set up the server and deploy to it. Our sysadmins already had some other scripts. So we needed to integrate that.
First idea: let's fork it. Create our own copy of cookieplone. I explained this in my World Plone Day talk earlier this year. But cookieplone was changing a lot, so it was hard to keep our copy updated.
Maik Derstappen showed me copier, yet another templating language. Our idea: create a cookieplone project, and then use copier to modify it.
What about the deployment? We are on GitLab. We host our runners. We use the docker-in-docker service. We develop on a branch and create a merge request (pull request in GitHub terms). This activates a pipeline to check-test-and-build. When it is merged, we bump the version, using release-it.
Then we create deploy keys and tokens. We give these access to private GitLab repositories. We need some changes to SSH key management in pipelines, according to our sysadmins.
For deployment on the server: we do not yet have automatic deployments. We did not want to go too fast. We are testing the current pipelines and process, see if they work properly. In the future we can think about automating deployment. We just ssh to the server, and perform some commands there with docker.
Future improvements:
- Start the docker containers and curl/wget the /ok endpoint.
- Lock files for the backend, with pip/uv.
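The first improvement could be sketched roughly as follows. This is an illustration only, not the speaker's script: the function names, the retry counts and the `opener` parameter (there so the check can be exercised without a running container) are all assumptions.

```python
import time
import urllib.error
import urllib.request


def is_ok(base_url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET <base_url>/ok answers with HTTP 200."""
    url = base_url.rstrip("/") + "/ok"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ok(base_url, attempts=10, delay=3.0,
                  opener=urllib.request.urlopen):
    """Poll /ok a few times, since containers need time to start."""
    for _ in range(attempts):
        if is_ok(base_url, opener=opener):
            return True
        time.sleep(delay)
    return False
```

After starting the containers, a deploy script could call `wait_until_ok` with the address where the backend is published and abort the deployment if it returns False.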
22 Jan 2026 9:43am GMT
Maurits van Rees: Jakob Kahl and Erico Andrei: Flying from one Plone version to another

This is a talk about migrating from Plone 4 to 6 with the newest toolset.
There are several challenges when doing Plone migrations:
- Highly customized source instances: custom workflow, add-ons, not all of them with versions that worked on Plone 6.
- Complex data structures. For example a Folder with a Link as default page, which pointed to some other content which meanwhile had been moved.
- Migrating Classic UI to Volto
- Also, you might be migrating from a completely different CMS to Plone.
How do we do migrations in Plone in general?
- In place migrations. Run migration steps on the source instance itself. Use the standard upgrade steps from Plone. Suitable for smaller sites with not so much complexity. Especially suitable if you do only a small Plone version update.
- Export - import migrations. You extract data from the source, transform it, and load the structure in the new site. You transform the data outside of the source instance. Suitable for all kinds of migrations. Very safe approach: only once you are sure everything is fine, do you switch over to the newly migrated site. Can be more time consuming.
Let's look at export/import, which has three parts:
- Extraction: you had collective.jsonify, transmogrifier, and now collective.exportimport and plone.exportimport.
- Transformation: transmogrifier, collective.exportimport, and new: collective.transmute.
- Load: Transmogrifier, collective.exportimport, plone.exportimport.
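To make the three parts concrete, here is a toy sketch in plain Python. It does not use the API of any of the tools listed above. The field names (`path`, `type`, `target`) and the `moved` mapping are hypothetical, chosen to mirror the earlier example of a Link pointing to content that has been moved.

```python
import json


def extract(raw_json):
    """Extraction: parse the exported items (here a JSON list of dicts)."""
    return json.loads(raw_json)


def transform(items, moved):
    """Transformation: repoint links whose target has been moved.

    `moved` maps old paths to new paths (hypothetical input).
    """
    result = []
    for item in items:
        item = dict(item)  # copy, so the extracted data stays untouched
        if item.get("target") in moved:
            item["target"] = moved[item["target"]]
        result.append(item)
    return result


def load(items, site):
    """Load: store each item in the target site, keyed by its path."""
    for item in items:
        site[item["path"]] = item
    return site
```

The point of the split is that `transform` runs outside both the source and the target instance, so it can be tested and rerun on the exported data as often as needed before anything is loaded.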
Transmogrifier is old, we won't talk about it now. collective.exportimport: written by Philip Bauer mostly. There is an @@export_all view, and then @@import_all to import it.
collective.transmute is a new tool. This is made to transform data from collective.exportimport to the plone.exportimport format. Potentially it can be used for other migrations as well. Highly customizable and extensible. Tested by pytest. It is standalone software with a nice CLI. No dependency on Plone packages.
Another tool: collective.html2blocks. This is a lightweight Python replacement for the JavaScript Blocks conversion tool. This is extensible and tested.
Lastly plone.exportimport. This is a stripped down version of collective.exportimport. This focuses on extract and load. No transforms. So this is best suited for importing to a Plone site with the same version.
collective.transmute is in alpha, probably a 1.0.0 release in the next weeks. Still missing quite some documentation. Test coverage needs some improvements. You can contribute with PRs, issues, docs.
22 Jan 2026 9:43am GMT
Maurits van Rees: Fred van Dijk: Behind the screens: the state and direction of Plone community IT

This is a talk I did not want to give.
I am team lead of the Plone Admin team, and work at kitconcept.
The current state: see the keynotes, lots happening on the frontend. Good.
The current state of our IT: very troubling and daunting.
This is not a 'blame game'. But focusing on resources and people at this conference should be a first priority. We are a real volunteer organisation, nobody is pushing anybody around. That is a strength, but also a weakness. We also see that in the Admin team.
The Admin team is 4 senior Plonistas as allround admin, 2 release managers, 2 CI/CD experts. 3 former board members, everyone overburdened with work. We had all kinds of plans for this year, but we have mostly been putting out fires.
We are a volunteer organisation, and don't have a big company behind us that can throw money at the problems. Strength and weakness. In all society it is a problem that volunteers are decreasing.
Root causes:
- We failed to scale down in time in our IT landscape and usage.
- We have no clean role descriptions, team descriptions, we can't ask a minimum effort per week or month.
- The trend is more communication channels, platforms to join and promote yourself, apps to use.
Overview of what we have to keep running as admin team:
- Support main development process: github, CI/CD, Jenkins main and runners, dist.plone.org.
- Main communication, documentation: plone.org, docs.plone.org, training.plone.org, conf and country sites, Matomo.
- Community office automation: Google Docs, Workspace, Quaive, Signal, Slack
- Broader: Discourse and Discord
The first two are really needed, the second we already have some problems with.
Some services are self hosted, but also a lot of SAAS services/platforms. In all, it is quite a bit.
The Admin team does not officially support all of these, but it does provide fallback support. It is too much for the current team.
There are plans for what we can improve in the short term. Thank you to a lot of people that I have already talked to about this. 3 areas: GitHub setup and config, Google Workspace, user management.
On GitHub we have a sponsored OSS plan. So we have extra features for free, but it is not enough by far. User management: hard to get people out. You can't contact your members directly. E-mail has been removed, for privacy. Features get added on GitHub, and no complete changelog.
Challenge on GitHub: we have public repositories, but we also have our deployments in there. Only really secure would be private repositories, otherwise the danger is that credentials or secrets could get stolen. Every developer with access becomes an attack vector. Auditing is available for only 6 months. A simple question like: who has been active for the last 2 years? No, can't do.
Some actionable items on GitHub:
- We will separate the contributor agreement check from the organisation membership. We create a hidden team for those who signed, and use that in the check.
- Clean up users, use Contributors team, Developers
- Active members: check who has contributed the last years.
- There have been security incidents. Someone accidentally removed a few repositories. Someone's account got hacked, luckily discovered within a few hours, and some actions had already been taken.
- More fine grained teams to control repository access.
- Use of GitHub Discussions for some central communication of changes.
- Use project management better.
- The elephant in the room that we have had practice with this year, and ongoing: the Collective organisation. This was free for all, very nice, but the development world is not a nice and safe place anymore. So we already needed to lock down some things there.
- Keep deployments and the secrets all out of GitHub, so no secrets can be stolen.
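For the "active members" item in the list above, the check itself is simple once you have a last-contribution date per member. A rough sketch, assuming those dates have been collected somewhere other than the audit log (for example from commit history); the function name and the input mapping are hypothetical:

```python
from datetime import date, timedelta


def inactive_members(last_contribution, today, years=2):
    """Return the members with no contribution in the last `years` years.

    `last_contribution` maps a member name to the date of their latest
    contribution, or None if no contribution is known.
    """
    cutoff = today - timedelta(days=365 * years)
    return sorted(
        name
        for name, last_seen in last_contribution.items()
        if last_seen is None or last_seen < cutoff
    )
```

The hard part is gathering the dates, not the comparison, which is why the limited audit history matters.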
Google Workspace:
- We are dependent on this.
- No user management. Admins have had access because they were on the board, but they kept access after leaving the board. So remove most inactive users.
- Spam and moderation issues
- We could move to Google docs for all kinds of things. Use Google workspace drives for all things. But the Drive UI is a mess, so docs can be in your personal account without you realizing it.
User management:
- We need separate standalone user management, but implementation is not clear.
- We cannot contact our members one on one.
Oh yes, Plone websites:
- upgrade plone.org
- self preservation: I know what needs to be done, and can do it, but have no time, focusing on the previous points instead.
22 Jan 2026 9:43am GMT