23 Jun 2026
Planet Twisted
Glyph Lefkowitz: Adversarial Communication
As I have discussed in previous posts, "AIs" can make mistakes. In fact, they do make mistakes, and their mistake-making patterns are such that where and how they will make mistakes is both uncertain and constantly changing.
Thus, in any scenario where you want to attempt to make "productive" use of "AI", you must have a system in place for checking every result. Not checking some results; checking every result. If each result might have a consequence for you (and if it didn't have a consequence, why bother automating it?) and you cannot predict in advance which kinds of results will need verification, then verification is always required.
The verification often ends up being just as expensive as doing the work in the first place, which means that if you want your usage of "AI" to be personally profitable, you have to find someone else to externalize the cost of verification onto. This person becomes your adversary, and, if you are successful, your "AI's" victim.
The Ladder-Climber And Their Reverse-Centaur Rungs
One way that this constellation of facts can straightforwardly assemble themselves into a dystopian nightmare is the phenomenon, described by Cory Doctorow, of the reverse centaur. This is when your employer non-consensually turns you into the verification system. The "AI" does the fun part of initially performing the work, and then you do the boring part where you check if the robot is right and clean up its messes, even if everyone already knows that it would, in aggregate, be cheaper for you to do the work in the first place.
Reverse centaurs can be made from any automation, not only "AI" automation. I think that there is a reason that this term happens to have emerged in the "age of AI", though, and not with earlier automation technologies (even those which were considerably more viscerally horrific). That reason is: the wrongness of "AI" output is not merely a technical feature that must be compensated for, it is a generalized externality.
As I mentioned above, if you are responsible for the entirety of the work, both extruding the "AI" output and checking it, it's usually cheaper to have humans do the entirety of the work to begin with. When humans do the writing directly, we can check as we go, and thus verification doesn't need to be as comprehensive.
When "AI" coding advocates say "code review is the bottleneck", what they are observing is that the LLM is still rolling the dice for each PR, and a human is still necessary to verify that each of those rolls is a winner. But calling this process "code review" is a bit of a misnomer; it's not really "code review" in the traditional sense, it's human understanding.
Before the advent of "AI", the human understanding was implicit in the process of writing the code in the first place1, and the code review was a way of diffusing and extending that understanding. Now that the code can be authored with no initial understanding taking place, that cost has not gone away, it has moved.
Human understanding was always the bottleneck.
However, this is taking a collaborative view of a software project, where satisfying the needs and solving the problems of your customers are the goals. We can see that "AI" is a bad tool to satisfy those goals, because all it's doing is converting the first half of the work, that of understanding the code as you write it, to understanding the agent's output as you read it.
What if, instead, we were to take the view that every software company is a Hobbesian nightmare, red in tooth and claw? In this view, the only goal of a software project is for the individual developers to make their promo cycles and get their bonuses. Given that there is only a certain amount of money to go around, this is a zero-sum game where each programmer wants to look more productive than their colleagues.
Pretty much every organization finds it easy to reward "productivity" as expressed by lines of code emitted, but the benefits of doing thorough and thoughtful design, analysis, and code review very difficult to reward. In this world, an LLM is an invaluable tool for the sociopathic ladder-climber, particularly if your legacy organization is still structuring their workflows as if the person prompting the bot is "writing" the code, and then they get to foist off the act of "reviewing" the code onto someone else.
Here, the prompter effectively externalizes the cost of the LLM's failures but internalizes any benefits. The prompter will vibe-code a big feature, so large that the assigned reviewer can't possibly comprehend it all effectively. When this happens, the reviewer will, eventually, be pressured to approve it, even if they can try to spot a few problems along the way. The reviewer has their own work to get back to, after all, the obligation to review the prompter's (read: the bot's) code is a drain on their time that they are not going to get rewarded for.
If this feature is a big success, the prompter gets a promotion. If it causes a big issue, well, the reviewer must not have been careful enough.
This is why LLMs are "good for coding", and also why their biggest promoters keep having outages.
The Generative Gish Galloper
Coding is the biggest "success story" of this type of adversarial communication, but it is by far not the only instance of such a thing. LLMs create a new form of leverage that can turn Brandolini's law from a linear advantage into an exponential one. If you are engaged in a political debate where you want to overwhelm the other side in nonsense, an LLM can generate bullshit faster than it is physically possible for a human being to type, let alone respond thoughtfully. There is an asymmetry to the utility of this weapon as well: only one side of the political spectrum wants to flood the zone and destroy trust in institutions and the concept of truth. There's a good reason that the fascists love it.
Straightforward Spam and Fraud
This is kind of obvious, but LLMs can generate lightly-customized, plausible-looking text much more quickly than any human being. This facilitates their use in fraud, spam, and scams. In a spamming or fraudulent interaction, once again, the costs are externalized onto the victim: the recipient of a spam message has to do all the work of "checking" the LLM's output. Spammers already expect very low hit rates from boilerplate, and if the LLM can increase those percentages from 1% to 5% the technology will pay for itself; they don't need anything like reliable accuracy.
Customer "Support"
If you have any kind of commercial relationship with a company, I probably don't even need to mention this: customer "support" bots are a misery. Everybody knows it at this point. But customer support is usually conceptualized by businesses as an adversarial interaction, because it is a cost center. They maintain internal metrics on time-to-resolution and try to optimize them. Implicitly, this creates a dynamic where the goal of the customer service agent's job is not to solve your problem, but to emit noise that will cause you to think your problem is resolved, or to give up, as fast as possible. Unsurprisingly, LLMs can emit this noise faster than humans can, getting those customers off the phone. But those customers will remember those interactions, and the story outside the TTR metrics is horrible.
Similarly to the situation in software development, LLMs can look very good on paper for customer support, but mostly what they are doing is illuminating the problems with the industry's existing metrics, by turning "winning the metrics battle against the customer" into a more obvious and immediate defeat for the company's long term reputation.
"Education"
In 2026 it is sadly a fact of life that students cheat all the time using "AI", and that this cheating is very successful, in that the teachers find it very hard to detect.
LLMs are great for cheating on schoolwork because the student is externalizing the work of the checking onto the teachers, who are often starting at a disadvantage to begin with, at least in the US.
My view is that this is happening because of a divergence in the way that students vs. teachers (or, more accurately, "the broader educational system") view grading.
When a student is asked to write an essay, the teachers see the effort as both intrinsically worthwhile for the student, as well as useful as a pedagogical tool to evaluate and react to the student's progress. The student, by contrast, sees a stumbling block designed to knock them off the path to success and into a permanent underclass. It is no wonder that the student sees "AI" as useful to their own goals and has no compunction about deploying it.
There is a bitter irony that the ability to understand the inherent value of actually writing the essay on their own is the sort of thing that students can really only learn by writing a bunch of essays. There's no way that I can think of which makes the benefit legible as long as a shortcut is available.
The net effect here is a downward spiral, where the already-wobbling educational system is sustaining an attack that it doesn't have the resources to recover from. The individual students' attacks against their teachers and their schools' grading systems might appear to momentarily succeed, but they will win the battle and lose the war.
Spamming "For Good"?
Usually when we talk about someone unilaterally choosing to enter into an adversarial relationship, that's an "attack" and for good reasons we have a negative impression of the attacker. However, I would be remiss if I did not point out that there are some cases where the relationship was already adversarial; just because you're the attacker doesn't mean that you are evil.
For example we might imagine use-cases like automatically filing appeals for prior authorizations against health insurance. It's relatively well-known at this point that the main way for-profit insurers maintain their margins is by denying claims right up to the line of the policies themselves being fraud, so using a spamming tool to fight them might be entirely justifiable2 in that case.
Similarly, using an LLM could be justified in a fight against a company refusing to honor a warranty. One could imagine using an LLM to immediately generate replies and escalations.
However, even in imagined cases like these, the underlying problem is that the insurers and the vendors already have a tremendous amount of structural power, so it is more likely that they will have the advantage in deploying a communications weapon like an LLM, as well as enacting policies to simply ignore any LLM-based communication that you might submit. Worse, if these strategies were to become widespread, they might provide an excuse to reject any communications by feeding them into an unreliable "LLM detector" and issuing an automated "computer says no" even to hand-written correspondence.
It is also worth stressing that these cases are imagined, as compared to the very real coworker-abuse, spam, scam, fraud, and disinformation campaigns being waged in real life today.
Therefore, while legitimate uses might exist, it's hard to imagine that there's anywhere they would be genuinely valuable and sustainable. In the best case "AI" will provide a temporary advantage for underdogs that will provoke an arms race which the resource-advantaged adversaries will win in the long run, in the worst case the arms race itself will cement permanent structural change that will make things worse.
"Search" By Stealing
Most of the adversarial utility of "AI" is on the "write" side, since write-amplification is more obviously aggressive than reading. But the "read" side of LLMs - summarization and question-answering - can be a form of attack as well.
To begin with, the act of reading itself is currently enormously destructive, but that's arguably not a fundamental aspect of this technology. They could set reasonable rate-limits and respect things like robots.txt, as search engines have for decades now. They could also refrain from committing criminal levels of copyright infringement. But, today, using "AI" tools does suborn this sort of out-of-control crawling.
More insidiously, consider the scenario described in this YouTube video. The LTT Bros decided to try Linux again, and in the course of so doing, they had problems. When trying to solve these problems, they were faced with a choice: they could consult Reddit, or they could ask an LLM. Asking an LLM would "gaslight the heck out of" them, but they still found it preferable, because they would at least get an answer without getting yelled at.
Initially this sounds great. But it also means that you want to extract knowledge from a community, while mechanically eliding any values or norms that the community may want to impart as part of offering that knowledge. As someone who spent many years in a community tech support role, this is worrying. Many requests for support are people asking how to do things that will momentarily solve a superficial problem but create a long-term reliability problem or even an immediate security risk, that the question-asker doesn't want to hear about. Consider the question "I'm tired of entering my password so much, how do I make it so my laptop unlocks automatically". An obsequious chatbot will helpfully tell you how to do this without pushback.
But, this is also a sort of ethically murky area. The Linux community is somewhat famously, for many years now, a toxic cesspool of general hostility, misogyny, etc. It is certainly a good thing that people can get access to this knowledge without subjecting themselves to abuse. But it also means that the people with the power and the privilege to change the community for the better can just quietly withdraw, rather than fixing the problems. It also means that the positive elements of culture cannot be transmitted, and people will have no opportunity to learn about unknown unknowns.
In this case, the "adversarial" communication is with society. The thing that using an LLM for search lets you do is withdraw from society and avoid forming any personal connections. There are some personal connections which are painful and annoying, and so that can feel like a momentary balm. But the need to make connections in general is, like, the concept of society itself.
Who Am I Hurting?
LLMs are good at adversarial communication. They are so good at it, relative to their other benefits, that they will tend to make communications adversarial if you are not remaining vigilant about the possibility that it might do so. My request to you, dear reader, if you are going to use such tools, is to always ask yourself, "who might I be hurting, if I use an LLM for this?"
If you're using an "AI", who is its adversary? If you haven't given it one yet, who might the "AI" turn into an adversary? Who might you overwhelm with an asymmetric amount of output, or, if you're receiving information and not sending it, who are you taking that information from without consulting?
Figure out the answers to these questions and conduct yourself accordingly; the answer might be "yourself".
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor!
-
One of the reasons that software developers tend to prefer greenfield development is that when you are given a blank page, you can project your own specific understanding onto it. You can structure the codebase in a way that works for your brain, down to the variable naming conventions and the module layouts. LLM-assisted development makes everything into instant brownfield work, which makes developers instantly miserable; even those who are excited about the technology will frequently complain about how it feels like their agency has been stolen and their joy in the work has been diminished. But I digress. ↩
-
Modulo the massive amount of other externalities involved in using LLMs, of course, but I don't have the time or energy to get into those here. ↩
23 Jun 2026 8:06pm GMT
Planet Python
Glyph Lefkowitz: Adversarial Communication
As I have discussed in previous posts, "AIs" can make mistakes. In fact, they do make mistakes, and their mistake-making patterns are such that where and how they will make mistakes is both uncertain and constantly changing.
Thus, in any scenario where you want to attempt to make "productive" use of "AI", you must have a system in place for checking every result. Not checking some results; checking every result. If each result might have a consequence for you (and if it didn't have a consequence, why bother automating it?) and you cannot predict in advance which kinds of results will need verification, then verification is always required.
The verification often ends up being just as expensive as doing the work in the first place, which means that if you want your usage of "AI" to be personally profitable, you have to find someone else to externalize the cost of verification onto. This person becomes your adversary, and, if you are successful, your "AI's" victim.
The Ladder-Climber And Their Reverse-Centaur Rungs
One way that this constellation of facts can straightforwardly assemble themselves into a dystopian nightmare is the phenomenon, described by Cory Doctorow, of the reverse centaur. This is when your employer non-consensually turns you into the verification system. The "AI" does the fun part of initially performing the work, and then you do the boring part where you check if the robot is right and clean up its messes, even if everyone already knows that it would, in aggregate, be cheaper for you to do the work in the first place.
Reverse centaurs can be made from any automation, not only "AI" automation. I think that there is a reason that this term happens to have emerged in the "age of AI", though, and not with earlier automation technologies (even those which were considerably more viscerally horrific). That reason is: the wrongness of "AI" output is not merely a technical feature that must be compensated for, it is a generalized externality.
As I mentioned above, if you are responsible for the entirety of the work, both extruding the "AI" output and checking it, it's usually cheaper to have humans do the entirety of the work to begin with. When humans do the writing directly, we can check as we go, and thus verification doesn't need to be as comprehensive.
When "AI" coding advocates say "code review is the bottleneck", what they are observing is that the LLM is still rolling the dice for each PR, and a human is still necessary to verify that each of those rolls is a winner. But calling this process "code review" is a bit of a misnomer; it's not really "code review" in the traditional sense, it's human understanding.
Before the advent of "AI", the human understanding was implicit in the process of writing the code in the first place1, and the code review was a way of diffusing and extending that understanding. Now that the code can be authored with no initial understanding taking place, that cost has not gone away, it has moved.
Human understanding was always the bottleneck.
However, this is taking a collaborative view of a software project, where satisfying the needs and solving the problems of your customers are the goals. We can see that "AI" is a bad tool to satisfy those goals, because all it's doing is converting the first half of the work, that of understanding the code as you write it, to understanding the agent's output as you read it.
What if, instead, we were to take the view that every software company is a Hobbesian nightmare, red in tooth and claw? In this view, the only goal of a software project is for the individual developers to make their promo cycles and get their bonuses. Given that there is only a certain amount of money to go around, this is a zero-sum game where each programmer wants to look more productive than their colleagues.
Pretty much every organization finds it easy to reward "productivity" as expressed by lines of code emitted easy to reward, but the benefits of doing thorough and thoughtful design, analysis, and code review very difficult to reward. In this world, an LLM is an invaluable tool for the sociopathic ladder-climber, particularly if your legacy organization is still structuring their workflows as if the person prompting the bot is "writing" the code, and then they get to foist off the act of "reviewing" the code onto someone else.
Here, the prompter effectively externalizes the cost of the LLM's failures but internalizes any benefits. The prompter will vibe-code a big feature, so large that the assigned reviewer can't possibly comprehend it all effectively. When this happens, the reviewer will, eventually, be pressured to approve it, even if they can try to spot a few problems along the way. The reviewer has their own work to get back to, after all, the obligation to review the prompter's (read: the bot's) code is a drain on their time that they are not going to get rewarded for.
If this feature is a big success, the prompter gets a promotion. If it causes a big issue, well, the reviewer must not have been careful enough.
This is why LLMs are "good for coding", and also why their biggest promoters keep having outages.
The Generative Gish Galloper
Coding is the biggest "success story" of this type of adversarial communication, but it is by far not the only instance of such a thing. LLMs create a new form of leverage that can turn Brandonlini's law from a linear advantage into an exponential one. If you are engaged in a political debate where you want to overwhelm the other side in nonsense, an LLM can generate bullshit faster than it is physically possible for a human being to type, let alone respond thoughtfully. There is an asymmetry to the utility of this weapon as well: only one side of the political spectrum wants to flood the zone and destroy trust in institutions and the concept of truth. There's a good reason that the fascists love it.
Straightforward Spam and Fraud
This is kind of obvious, but LLMs can generate lightly-customized, plausible-looking text much more quickly than any human being. This facilitates their use in fraud, spam, and scams. In a spamming or fraudulent interaction, once again, the costs are externalized onto the victim: the recipient of a spam message has to do all the work of "checking" the LLM's output. Spammers already expect very low hit rates from boilerplate, and if the LLM can increase those percentages from 1% to 5% the technology will pay for itself; they don't need anything like reliable accuracy.
Customer "Support"
If you have any kind of commercial relationship with a company, I probably don't even need to mention this: customer "support" bots are a misery. Everybody knows it at this point. But customer support is usually conceptualized by businesses as an adversarial interaction, because it is a cost center. They maintain internal metrics on time-to-resolution and try to optimize them. Implicitly, this creates a dynamic where the goal of the customer service agent's job is not to solve your problem, but to emit noise that will cause you to think your problem is resolved, or to give up, as fast as possible. Unsurprisingly, LLMs can emit this noise faster than humans can, getting those customers off the phone. But those customers will remember those interactions, and the story outside the TTR metrics is horrible.
Similarly to the situation in software development, LLMs can look very good on paper for customer support, but mostly what they are doing is illuminating the problems with the industry's existing metrics, by turning "winning the metrics battle against the customer" into a more obvious and immediate defeat for the company's long term reputation.
"Education"
In 2026 it is sadly a fact of life that students cheat all the time using "AI", and that this cheating is very successful, in that the teachers find it very hard to detect.
LLMs are great for cheating on schoolwork because the student is externalizing the work of the checking onto the teachers, who are often starting at a disadvantage to begin with, at least in the US.
My view is that this is happening because of a divergence in the way that students vs. teachers (or, more accurately, "the broader educational system") view grading.
When a student is asked to write an essay, the teachers see the effort as both intrinsically worthwhile for the student, as well as useful as a pedagogical tool to evaluate and react to the student's progress. The student, by contrast, sees a stumbling block designed to knock them off the path to success and into a permanent underclass. It is no wonder that the student sees "AI" as useful to their own goals and has no compunction about deploying it.
There is a bitter irony that the ability to understand the inherent value of actually writing the essay on their own is the sort of thing that students can really only learn by writing a bunch of essays. There's no way that I can think of which makes the benefit legible as long as a shortcut is available.
The net effect here is a downward spiral, where the already-wobbling educational system is sustaining an attack that it doesn't have the resources to recover from. The individual students' attacks against their teachers and their schools' grading systems might appear to momentarily succeed, but they will win the battle and lose the war.
Spamming "For Good"?
Usually when we talk about someone unilaterally choosing to enter into an adversarial relationship, that's an "attack" and for good reasons we have a negative impression of the attacker. However, I would be remiss if I did not point out that there are some cases where the relationship was already adversarial; just because you're the attacker doesn't mean that you are evil.
For example we might imagine use-cases like automatically filing appeals for prior authorizations against health insurance. It's relatively well-known at this point that the main way for-profit insurers maintain their margins is by denying claims right up to the line of the policies themselves being fraud, so using a spamming tool to fight them might be entirely justifiable2 in that case.
Similarly, using an LLM could be justified in a fight against a company refusing to honor a warranty. One could imagine using an LLM to immediately generate replies and escalations.
However, even in imagined cases like these, the underlying problem is that the insurers and the vendors already have a tremendous amount of structural power, so it is more likely that they will have the advantage in deploying a communications weapon like an LLM, as well as enacting policies to simply ignore any LLM-based communication that you might submit. Worse, if these strategies were to become widespread, they might provide an excuse to reject any communications by feeding them into an unreliable "LLM detector" and issuing an automated "computer says no" even to hand-written correspondence.
It is also worth stressing that these cases are imagined, as compared to the very real coworker-abuse, spam, scam, fraud, and disinformation campaigns being waged in real life today.
Therefore, while legitimate uses might exist, it's hard to imagine that there's anywhere they would be genuinely valuable and sustainable. In the best case "AI" will provide a temporary advantage for underdogs that will provoke an arms race which the resource-advantaged adversaries will win in the long run, in the worst case the arms race itself will cement permanent structural change that will make things worse.
"Search" By Stealing
Most of the adversarial utility of "AI" is on the "write" side, since write-amplification is more obviously aggressive than reading. But the "read" side of LLMs - summarization and question-answering - can be a form of attack as well.
To begin with, the act of reading itself is currently enormously destructive, but that's arguably not a fundamental aspect of this technology. They could set reasonable rate-limits and respect things like robots.txt, as search engines have for decades now. They could also refrain from committing criminal levels of copyright infringement. But, today, using "AI" tools does suborn this sort of out-of-control crawling.
More insidiously, consider the scenario described in this YouTube video. The LTT Bros decided to try Linux again, and in the course of so doing, they had problems. When trying to solve these problems, they were faced with a choice: they could consult Reddit, or they could ask an LLM. Asking an LLM would "gaslight the heck out of" them, but they still found it preferable, because they would at least get an answer without getting yelled at.
Initially this sounds great. But it also means that you want to extract knowledge from a community, while mechanically eliding any values or norms that the community may want to impart as part of offering that knowledge. As someone who spent many years in a community tech support role, this is worrying. Many requests for support are people asking how to do things that will momentarily solve a superficial problem but create a long-term reliability problem or even an immediate security risk, that the question-asker doesn't want to hear about. Consider the question "I'm tired of entering my password so much, how do I make it so my laptop unlocks automatically". An obsequious chatbot will helpfully tell you how to do this without pushback.
But, this is also a sort of ethically murky area. The Linux community is somewhat famously, for many years now, a toxic cesspool of general hostility, misogyny, etc. It is certainly a good thing that people can get access to this knowledge without subjecting themselves to abuse. But it also means that the people with the power and the privilege to change the community for the better can just quietly withdraw, rather than fixing the problems. It also means that the positive elements of culture cannot be transmitted, and the things that people will never learn about unknown unknowns.
In this case, the "adversarial" communication is with society. The thing that using an LLM for search lets you do is withdraw from society and avoid forming any personal connections. There are some personal connections which are painful and annoying, and so that can feel like a momentary balm. But the need to make connections in general is, like, the concept of society itself.
Who Am I Hurting?
LLMs are good at adversarial communication. They are so good at it, relative to their other benefits, that they will tend to make communications adversarial if you are not remaining vigilant about the possibility that it might do so. My request to you, dear reader, if you are going to use such tools, is to always ask yourself, "who might I be hurting, if I use an LLM for this?"
If you're using an "AI", who is its adversary? If you haven't given it one yet, who might the "AI" turn into an adversary? Who might you overwhelm with an asymmetric amount of output, or, if you're receiving information and not sending it, who are you taking that information from without consulting?
Figure out the answers to these questions and conduct yourself accordingly; the answer might be "yourself".
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor!
-
One of the reasons that software developers tend to prefer greenfield development is that when you are given a blank page, you can project your own specific understanding onto it. You can structure the codebase in a way that works for your brain, down to the variable naming conventions and the module layouts. LLM-assisted development makes everything into instant brownfield work, which makes developers instantly miserable; even those who are excited about the technology will frequently complain about how it feels like their agency has been stolen and their joy in the work has been diminished. But I digress. ↩
-
Modulo the massive amount of other externalities involved in using LLMs, of course, but I don't have the time or energy to get into those here. ↩
23 Jun 2026 10:38am GMT
22 Jun 2026
Planet Python
Rodrigo Girão Serrão: Write a coding agent from first principles
![]()
Learn how to write a coding agent in Python in this tutorial that teaches how to interact with an LLM through an API, how to manage the context, and how to do tool calling.
Introduction
This tutorial will show you how to create your own coding agent from first principles. By doing so, you'll understand how coding agents work under the hood.
Prerequisites
To be able to follow this tutorial, you'll need
- prior Python experience: this tutorial is not suitable for people who don't have programming experience
- a valid Claude API key: you can get a Claude API key in the Claude Console dashboard[^1]
- uv: to manage the project you'll be working on
The concepts explained in this tutorial are independent from your LLM provider but the code snippets will make use of the Claude API and its Python SDK. This means that you can follow along with a different model provider as long as you adapt the code snippets to match the format expected by the API of your provider.
What's a coding agent?
A coding agent is an agent that's specialised for coding. In turn, an agent is just an LLM that has been extended with extra functionality that allows it to interact with its environment. This extra functionality is provided through tools, one of the core ideas covered in this tutorial.
This short definition still hides a lot of details, but instead of giving you a theoretical definition you can learn what a coding agent is by creating one. That starts now.
Project set up
To set your project up, start by using uv to create a packageable app project[^2]:
% uv init --app --package agent
Initialized project `agent` at `/Users/rodrigogs/Documents/mathspp/agent`
Then, cd into the project and add the two dependencies you'll need:
% cd agent
% uv add python-dotenv anthropic
You'll use python-dotenv to help you with authentication to access the Claude API and you'll use the dependency anthropic to make it easier to interact with the Claude API.
To set up authentication, create a .env file and paste your Claude API key there in front of the variable ANTHROPIC_API_KEY. When you're done, your .env file should look like this:
ANTHROPIC_API_KEY="sk-ant-api03-qI_3mJ..."
To make sure you never upload your API key to GitHub by accident, add the file .env to your .gitignore:
# .gitignore
# ... other entries generated by uv
.env
Now that you've set up your project, you can make your first request to the Claude API.
Interacting with an LLM
A coding agent needs an LLM at its core. Your LLM can come from any provider you want but you're going to use Claude because its SDK (the dependency anthropic you added in the previous section) is easy to use and because Claude is a popular model provider.
Using the anthropic SDK, here's how you can send a message to the LLM:
# src/agent/__init__.py
from anthropic import Anthropic
import dotenv
dotenv.load_dotenv() # Load .env
MODEL = "claude-haiku-4-5"...22 Jun 2026 12:32pm GMT
21 Jun 2026
Planet Python
The Python Coding Stack: 2. Anatomy of an Agent
Read Stephen's Preface to Agents Unpacked if you're new here.
You have used a large language model. You know the deal: a careful prompt gets a careful answer. A vague prompt gets a vague one. And the model itself does not keep anything from one conversation to the next, unless something external is holding that context for it.
Agents work differently. They have parts that do things a plain LLM does not. These parts are what make an agent an agent. It is not just the model underneath. It is the structure built around it that gives the system its abilities to persist, act, and keep going.
Understanding this structure is the second major shift in this series. The first shift is seeing that a chatbot can give you a good answer without finishing the job, because it stops after responding. The second shift is seeing that an agent is not a smarter model. It is a model placed inside a structure that gives it something to act with and somewhere to keep what it has done.
The Agent Formula
Most agents share the same basic parts:
-
A model (the LLM): the reasoning engine that understands language and decides what to do
-
Instructions: what tells the agent who it is, what it is for, and what 'good' looks like
-
Memory: a workspace or store that holds what has happened so far
-
Tools: capabilities the agent can call on to do things beyond generating text
-
An execution loop: the cycle of observing, deciding, acting, and checking
Different platforms package these differently. Some call memory "context," some call tools "plugins" or "capabilities," and some merge instructions and tools into a single configuration layer. But the parts are the same. An agent is not a single thing. It is a system, and each part matters.
Stephen: Don't LLMs also have memory since they remember what happened earlier in the conversation? How's this different?
Here is one distinction worth getting clear early: the context window and memory are not the same thing. The context window is the working space an LLM uses during a single session. It holds the conversation so far and gets loaded fresh every time the model gets a chance to speak. Memory, by contrast, is information stored outside the model, maintained by the system, and available across sessions and steps. We will come back to this.
An agent needs all its components:
Agent = Model + Instructions + Memory + Tools + Execution Loop
Leave any one of these out and the system changes behaviour in ways that matter. We will look at each piece in turn.
What the Model Does and What It Doesn't Do
The model is the reasoning core. It reads your request, figures out what to do, and decides what to say back. It gets the most attention because it is the part that generates language.
But a model on its own is like a brilliant mind with no hands and no memory of its own. It can think. It cannot act. It cannot remember what happened five minutes ago unless something explicit carries that information forward.
Stephen: Wait a second. You say the model doesn't remember what happened five minutes earlier. But when I use an LLM, it does seem to remember what happened earlier in the conversation.
Here is what is actually happening. When an LLM appears to remember earlier in a conversation, it is not the model itself that is remembering. The context window is carrying all the earlier messages along with your new message, every time you send something. The model sees the full conversation again and generates a response that fits what came before. That is not memory in the model. That is the system feeding the model a transcript.
This trips up almost everyone when they start using agents. The model generates text. The rest of the system decides what to do with that text and whether to act on it.
A better model helps. It reasons more clearly, follows instructions more faithfully, and handles edge cases better. But dropping a smarter model into an agent that is missing a working execution loop will not make it an agent. You need the other parts too.
Instructions: The Agent's Direction
Instructions tell the agent what it is supposed to do and how to behave. Some systems call these system prompts. Others call them agent definitions or behavioural instructions. The name does not matter. What matters is that they are the layer that tells the model why it exists, who it is helping, and what 'good' looks like for the task at hand.
Good instructions do not make an agent smarter. They make it more focused. They give it a frame for every decision: what to prioritise, what to avoid, when to ask for help, how to present its output.
Stephen: Are these what are often called 'skills', or are skills something else altogether?
Skills and instructions are related, but they are not the same thing. Instructions are the core behavioural direction: who the agent is, what it is for, how it should approach its work. A skill, in platforms like OpenClaw and Hermes, is a specific file that tells the agent how to carry out a particular task, often by combining one or more tools. So instructions tell the agent how to behave generally. A skill tells it how to do something specific. We will see this distinction more clearly when we look at how different platforms implement these parts.
The instructions shape what the agent notices, what it proposes, what it tries, and what it says no to. Two agents built on the same model with different instructions will behave differently in the same situation. They will notice different things, prioritise differently, and produce different outcomes.
Poorly written instructions can quietly break an agent. If the instructions are vague, the agent has to improvise every step. If they contradict each other, the agent has to choose, and it might not choose the way you intended.
Stephen: Can you provide a few examples of what these instructions may look like in different scenarios?
Here is what instructions might look like in practice. A poorly-written instruction can quietly break an agent. Consider an instruction that says "be helpful and concise" without defining either term. When a user asks for a full technical breakdown, the agent has to arbitrate between two vague goals. It might give a two-sentence answer that technically satisfies "concise" but ignores "helpful," or it might give an exhaustive response that satisfies "helpful" but ignores "concise." Either way, the agent is improvising because the instructions gave it no real frame for the conflict.
A research assistant agent might have instructions that say something like: "You are a research associate working for [user name]. Your role is to find, summarise, and organise information on topics the user assigns. Always cite your sources. Flag uncertainty rather than guessing. Present findings in a clear brief, not a wall of text."
A code review agent might have very different instructions: "You are a principled code reviewer. Focus on correctness, clarity, and performance. Do not praise code unnecessarily. When you find an issue, explain why it matters and suggest a concrete fix. Keep responses short."
The difference between those two sets explains a lot about why two agents can feel like entirely different systems, even if they use the same model underneath.
Memory: The Workspace and Context
Memory in an agent is not like human memory. It is a structured store of information kept and updated as the agent works. It is what lets the agent hold a thread across multiple steps without starting from scratch each time.
Most agents use some combination of three types:
-
Working context - what is active right now: the current goal, what has been tried so far, what the user last said
-
Stored information - what the agent has been told about the user, their preferences, their past requests
-
Files and state - what exists in the workspace right now, what has been written or read recently
This is not a personality feature. It is not the agent "remembering" in the way a person remembers their childhood. It is operational continuity. The system maintaining a thread of relevant information across time and steps.
Different platforms handle these differently. LangChain agents build up a rolling context window: the current request gets appended to everything that happened before, and the whole thing is passed to the model. If the conversation gets long, older turns get dropped or summarised to make room. AutoGen agents can maintain shared memory across a team, so that when one agent finishes a task, what it learned is available to the next agent that picks up the thread.
OpenClaw takes yet another approach. Its memory layer is a structured store that agents write to and read from across sessions. When an agent starts a new session, it can query that memory store for relevant context rather than relying solely on what was in the most recent conversation. An agent can know that the user prefers short emails, even if that was established three weeks ago.
Stephen: If memory can be stored in files, does it mean that agents can have nearly unlimited memory (within the limits of the computer or server's overall memory capacity)?
There are practical limits even when storage is effectively unbounded. The more relevant limit is not how much the agent can store, but how well it can find and use what it has stored. A full inbox is not the same as a well-organised one. Retrieval becomes harder as memory grows, and irrelevant information can dilute the signal if the system does not manage it carefully.
Think of it this way. A context window that holds 128,000 tokens can technically hold a lot of information. But it can only hold what was placed there. An agent with a large memory store full of useful context still needs a way to surface the right information at the right time. If it cannot find what it needs, or if what it finds is buried under noise, the effective memory is constrained.
The quality of retrieval matters as much as the quality of storage. An agent that retrieves relevant context poorly is effectively working with a much smaller memory than one that retrieves well, even if both store the same amount.
Stephen: So, tell me if I understood this. The agent has an index telling it where to find information specific to certain topics or tasks. When the LLM part of the agent decides it needs to deal with a certain topic, it uses the index to read and load the information from the memory file into its context. Is that right?
That is broadly right. The memory store, the index, and the retrieval into context are the key parts. One small correction worth noting: the decision to retrieve from memory is typically made by the agent or coordinator layer, not by the LLM directly. The LLM receives the retrieved content as part of its context, but it is the agent system that decides what to look up and when. This distinction matters because it is the agent layer, not the model, that is doing the memory management.
Stephen: But isn't the agent's brain the LLM? Clarify the distinction in your answer above. Which part of the agent's infrastructure deals with this?
It is a fair challenge. The LLM is genuinely where the reasoning happens. It reads context, generates text, and makes decisions about what to say or do next. But it is also just a text processor. It receives input, produces output, and has no awareness of anything beyond the tokens it has been given.
The coordinator layer is the infrastructure that sits around the LLM and manages the process. It reads the LLM's output, decides whether to act on it, calls tools, retrieves memory, and feeds results back into the next LLM call. It is the difference between the LLM thinking and the agent doing. A bare LLM generates text. The coordinator turns that text into action.
To use a rough analogy: the LLM is like a pilot who can read instruments and make decisions. The coordinator is like air traffic control - it decides which runway to use, when to land, and when to divert. The pilot's brain does the reasoning. But without the infrastructure around it, the pilot just sits in the cockpit thinking.
So when we say the agent retrieves memory, we mean the coordinator retrieves it and places it where the LLM can see it. The LLM does not reach into a file and pull something out. The coordinator does that work and presents the result to the LLM as part of the next context.
Stephen: And are the bits of these files then loaded into the LLM's context? Therefore, the more stuff is loaded from the memory files, the more the context fills up, affecting the rest of the conversation and cost, right?
Yes, exactly right. Memory retrieval feeds into the context window, which is the LLM's working space for the current session. Every token that goes into the context window is a token the LLM processes and a token that costs something. Loading a lot of context from memory means less room for the conversation itself, and it means higher token usage on every call.
This is one of the practical engineering tensions in agent design. Loading more memory gives the agent more to work with, but it also makes each LLM call more expensive and slower. A well-designed agent retrieves only what is relevant to the current task, not everything it knows.
Tools: What the Agent Can Actually Do
Tools are the capabilities that let an agent act beyond generating text. The model decides to use a tool. The tool performs an action and returns the result to the model.
This was covered in Chapter 1 under "Tools Are the Hands." Here it is worth noting that tools are also where agents differ most between platforms. Some agents come with a large built-in toolkit. Others can call external tools through open protocols. Some let you build custom tools. Others are more locked down.
What tools might an agent actually have? A research agent might be able to search the web and read files on your machine. A coding agent might run shell commands and read or write files. A calendar agent might check your schedule and send messages. The tool is the bridge between the model's decisions and the world the agent is working in.
What matters is not how many tools an agent has, but whether the tools it has are the right ones for the tasks you want it to perform.
Different platforms implement tools differently. LangChain provides a standardised tool interface that lets you connect to search APIs, databases, file systems, and custom functions. OpenCode agents run inside a development environment, where the tools available are the commands and interfaces of that environment. OpenClaw uses an open tool protocol that lets agents call external capabilities regardless of who built them. Hermes takes a more composed approach: a skill file specifies not just what the agent should do, but which tools to use and in what combination to carry out a specific task.
Here is the thing worth unpacking. A tool on its own is just a capability. What makes it useful is the bridge between what the agent is trying to accomplish and the tool that can help. A calendar tool is useless if the agent does not know it should check the schedule. An agent running a meeting-preparation skill that says "check availability, send invites, prepare a briefing document" has that bridge built in.
The Execution Loop: The Part That Makes It an Agent
The execution loop is the cycle that takes an agent from a single-shot response to a sustained process. Observe, think, act, check, repeat.
This was the core of Chapter 1. But it is worth restating here, in the context of anatomy, because the loop is what ties all the other parts together. Without it, you have a model that receives instructions and context and produces text. With it, you have a system that can pursue a goal across time, recover from partial failures, and stop when the work is genuinely done.
The loop is the difference between an agent and a very well-instructed chatbot.
Here is why the repeat step matters so much. A model has no native sense of when it is done. When you call a function in code, the function returns and you are finished. When a model generates text, it produces tokens until it hits a stop condition built into the model itself, most commonly a token limit or a designated stop sequence. These conditions tell the model when to stop generating, but they do not tell the agent whether the result is actually what the user wanted. There is no built-in check that says "is this the right answer?"
The execution loop provides that check. The check phase asks: is the result good? Does it meet the original goal? If not, the loop continues. Sometimes that means a dozen or more cycles before a task is genuinely complete.
The loop also determines how goals decompose. In LangChain's ReAct-style agents, the loop runs inside a single agent: observe, decide on the next action, execute it, check the result, repeat. In AutoGen, the loop is distributed across multiple agents that hand off to each other. A planner agent might coordinate specialist agents, each running their own loop on their own piece of the problem. OpenClaw uses a coordinator agent to manage the loop, assigning work to sub-agents and handling the check phase across the full task rather than within a single agent cycle.
The architecture of the loop is one of the most significant differences between agent platforms. But the function is the same everywhere: turning a sequence of isolated model calls into a coherent, goal-directed process.
Multiple Platforms: Comparing the Formula in Practice
It helps to see the same five-part formula playing out in different platforms. Here is how a few of them map onto it.
LangChain is one of the most widely-used agent frameworks. A LangChain agent has an LLM at its core, a set of tools, a prompt defining the agent's role, memory that accumulates conversation history, and an agent executor that runs the loop. The loop in LangChain is explicit: the agent executor repeatedly calls the model, parses the model's tool-call output, runs the tool, and feeds the result back until the model says it is done.
AutoGen takes a different approach. Rather than a single agent, AutoGen sets up a team of agents that communicate with each other. Each agent has a model, instructions defining its role, and its own set of tools. The loop is distributed: there is no single execution cycle. Agents exchange messages, delegate tasks to each other, and the overall process continues until the team has finished the assigned goal. Memory in AutoGen can be shared across agents so that one agent's work is available to the next.
OpenClaw uses a coordinator agent that manages the overall execution loop. Sub-agents each have their own identity, tools, and memory. The coordinator decides which sub-agent handles which part of a task, passes context between them, and handles the check phase across the full goal. Skills in OpenClaw are files that tell a specific agent how to carry out a particular task, combining instructions about what to do with definitions of which tools to use.
Hermes also uses a skill-based architecture where skill files define both the instructions and the tool configuration for specific tasks. Rather than a single general-purpose agent, Hermes composes agents from skills that know how to use particular tools in particular contexts.
OpenCode works differently again. It runs agents inside a development environment, typically a cloud workspace. The tools available to the agent are the commands and interfaces of that environment. The loop is typically managed at the task level: the agent receives a task, works through it using the tools at its disposal, and reports back. There is less of a formalised multi-step loop and more of a task-completion focus.
None of these platforms invents new parts of the agent formula. They all use a model, instructions, memory, tools, and an execution loop. What differs is how those parts are implemented, how they are divided up, and how they communicate. Understanding the formula means you can look at any of these platforms and see what you are actually looking at.
What This Chapter Covered
This chapter pulled apart the five components of the agent formula.
We saw how the model is the reasoning core but cannot act or remember on its own. How instructions shape the agent's focus and behaviour, and why the same model with different instructions can feel like a different system entirely. How memory provides operational continuity across steps and sessions, and why retrieval quality matters more than storage capacity. How tools extend what the agent can do beyond generating text, and why a tool is only as useful as the bridge between the model's decisions and the action the tool can take. And how the execution loop is the architecture that turns isolated model calls into a coherent, goal-directed process.
We also saw how different platforms implement the same five components differently: LangChain's explicit agent executor, AutoGen's team-based coordination, OpenClaw's coordinator and skill-based sub-agents, Hermes's composed skill architecture, and OpenCode's environment-integrated approach.
The goal was not to become an expert on any one platform. It was to show that agents are not mysterious black boxes. They are systems built from a small number of recognisable parts, and once you know what to look for, you can see the anatomy underneath any agent platform you encounter.
Next up in Agents Unpacked: we dig into tools and skills: what it actually means for an agent to do something rather than just say it, and why a well-tooled agent operating autonomously in a loop is a fundamentally different thing from a model answering questions.
<< Previous Post: From Answer to Outcome
>> Next Post: Coming Soon
21 Jun 2026 9:48pm GMT
19 Jun 2026
Django community aggregator: Community blog posts
Issue 342: DSF Executive Director Search
## News
Announcing the Search for a DSF Executive Director
The Django Software Foundation is hiring its first Executive Director, and we have the Django community to thank for making it possible.
Six Django web development agencies have jointly pledged $47,500 to help fund the Executive Director's first year: Caktus Group, Lincoln Loop, Six Feet Up, Cuttlesoft, OddBird, and Two Rock. This is the financial foundation we needed to move from "we should hire an ED someday" to "we are hiring an ED now."
I'm delighted to rejoin the Sovereign Tech Fellowship
Hugo van Kemenade returns to the Sovereign Tech Fellowship after being one of six participants in the 2025 pilot, calling out how dedicated time helped ship Python 3.14 and 3.15 releases, mentor triagers, and improve release automation and accessibility. The post also tracks a wide set of community and governance work, and looks ahead to a larger 2026 cohort spanning maintainers, community managers, and technical writers.
Python Software Foundation
Python Software Foundation News: PSF Board Election Dates for 2026
PSF Board elections for 2026 open for nominations on July 28 (2:00 pm UTC) and voting runs September 1 to September 15, with voter affirmation due August 25. The Packaging Council election will run in parallel under PEP 772, and PSF member voting eligibility is handled via psfmember.org.
Updates to Django
Last week we had 24! pull requests merged into Django by 11 different contributors.
This week's Django highlights 🦄:
-
Added --using option to sendtestemail management command. (#37141)
-
As a performance optimization, add an option to cull the DBCache only on every n queries. (#32785)
-
Reduced false positives in strings during collectstatic. (#36969, #35371)
Sponsored Link
Reach 4,300+ Engaged Django Developers
Sponsor this newsletter to reach an active community of Python and Django developers.
Articles
In search of a new contribution model
Carlton Gibson on why open source's contribution model is broken--burnout, extractive contributions, harassment, and now AI--and his plans to experiment with something less open-by-default on newer projects.
The University In The AI Era
From Carson Gross, creator of HTMX and full-time college professor, a detailed and practical look at what AI means for universities in general and computer science programs in particular.
How I Work From Anywhere Without Losing My Place
Jeff has been running a new remote dev setup that allows for seamless switching between home office, an iPad, or even a phone when out on the go.
LLM-Inspired Development
How a bad idea from an LLM led to a good idea on a website.
Tech doesn't matter? Why to use Django for agentic coding
Ronny Vedrilla argues that in the age of agentic coding, Django's opinionated structure, secure-by-default posture, and heavy representation in training data make it an ideal "harness" that keeps AI agents on the rails-not a competitive edge, but a hedge against shipping a quiet disaster.
Videos
The Modern Python Web Stack: Django, FastAPI, uv, Pydantic, and AI
A 5-minute conversation from PyCon US with Jeff Triplett on how Python web development is changing fast. (Yes, this video features Jeff and Will, the two authors of this newsletter, but we still think it warrants mention! 🤝)
Podcasts
Teaching Python #158: Will Vincent on Django, AI Coding, and Why Fundamentals Still Matter
A chat on why Django continues to matter, the reality behind vibe coding, local AI models, and more.
Django Forum
Call for mentors - GSoC 2026 with Django!
Google Summer of Code is around the corner and there is still a need for mentors on some projects.
Ticket 34753, Document how to properly escape to in email messages
An active discussion around this particular issue. Checking the forum is a great way to get a pulse on what's happening with core Django development.
Django Fellow Reports
Jacob Walls
In this four-day week (I headed out Friday for a college reunion), everything got a little bit better. First, check out @blighj's estimate showing that collectstatic's import statement detection reliability (needed to rewrite URLs) improves in Django 6.1 from 88% to 99%. Meanwhile @felixxm is stress-testing database defaults and landing fixes needed for using Django 6.1's UUID4()/UUID7() functions. Finally, we made the test client more friendly for third-party permission packages like django-guardian and django-rules. @sage also spotted a breakage in DRF in the upcoming Django 6.1 beta, since Wagtail tests against Django's main branch. I expect the fix to land before the beta is even out. Be like wagtail and test main!
Natalia Bidart
This week had a bit of a reset feel to it 🧹. After the previous stretch of PyCon US, security prep, and the security release itself 🏁, I spent time going through pending and snoozed items ⏰, trying to close loops and get things back to a more manageable state.
We also reviewed and triaged a batch of security reports 🎁 that were shared by a major AI company, following conversations I had at PyCon US 🐍 🏖️ about the growing volume of LLM-generated security submissions and the challenges they create for OSS projects (Django in particular). The reports were generated using an advanced security-focused model 🤖 against the Django codebase. We evaluated each finding, confirming and addressing valid issues where appropriate and mapping others to existing tickets and prior reports. Overall, Django is in good shape 💪, as the results largely overlapped with known reports, validated our current triage approach, and reinforced confidence in our security stance 👏.
Events
Django Girls Krakow on 18th July 2026
This event is taking place during EuroPython at the sprints venue.
Django Day Copenhagen 2026
Djangonauts from in and around Denmark are meeting up for the second edition of Django Day Copenhagen 2026, October 2.
International Travel to DjangoCon US 2026
Are you attending DjangoCon US 2026 in Chicago, Illinois, but you are not from US and need some travel information? Here are some things to consider when planning your trip.
Join DEFNA! There's a seat on the DEFNA board open
Django Events Foundation North America (DEFNA) is looking for another board member. We have eight board members currently and are looking for another person passionate about growing the DjangoCon US community to join.
Django Job Board
Senior Python/Django Developer at Gryps 🆕
Founding ML/Data Scientist (Remote, UK) at MyDataValue
Projects
ranahaani/GNews
A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a JSON response.
jazzband/django-newsletter
An email newsletter application for the Django web application framework, including an extended admin interface, web (un)subscription, dynamic e-mail templates, an archive and HTML email support.
19 Jun 2026 3:00pm GMT
17 Jun 2026
Django community aggregator: Community blog posts
The 2026 way of using importmaps in Django
The 2026 way of using importmaps in Django
I last wrote about Django, JavaScript modules and importmaps in May 2025, slightly over a year ago.
The main topic of this post is the django-js-asset 4.0 release. The library is used in many places, some of the more well-known packages using it are django-mptt and django-ckeditor. I have since done a lot of work evolving the ways of integrating importmaps but the efforts to standardize upon an approach have stalled a bit. The main reason for this, apart from time and energy, was that I wasn't really all that happy with the global importmap. When I had only a few modules using the importmap facility, I didn't care all that much. Now that the recently released django-content-editor 9.0 also uses importmaps for shipping a refactored, much more modular JavaScript implementation while still keeping all the benefits of cache busting using ManifestStaticFilesStorage1, having a global importmap got annoying. The content editor JavaScript is only used within the Django administration interface, but when using a single global importmap object, the importmap entries were always there on each page that used an importmap at all.
A better solution was needed. I'm a big fan of using forms.Media for collecting CSS and JavaScript from widgets, forms and utilities. It helps me avoid inline JavaScript since at least 2017. I'm not using it for site-wide CSS and JavaScript, I'm still transpiling, PostCSS-ing and bundling the assets using rspack as for example written about here and here.
Why importmaps?
A quick refresher on why this matters at all. Django's ManifestStaticFilesStorage hashes the contents of each file into its name for cache busting, but out of the box it doesn't rewrite the import statements inside JavaScript modules. Importmaps bridge the gap: your code imports a stable name:
import { initializeEditors } from "django-prose-editor/editor"
and the importmap tells the browser where that name actually lives:
<script type="importmap">
{"imports": {
"django-prose-editor/editor": "/static/django_prose_editor/editor.6e8dd4c12e2e.js"
}}
</script>
So the import stays clean and constant while the file behind it can get a new hash on every deploy.
django-js-asset 4.0
The updated django-js-asset 4.0 doesn't ship the old, global importmap at all. This means the upgrade might require some work. Instead of one importmap shared across the whole site, you now get a specific importmap assembled for the context at hand - either by Django itself when it collects the media of your forms, widgets and the admin, or explicitly by you in a view or context processor. The building block in both cases is the ImportMap object; when it travels through js_asset.Media (a subclass of django.forms.Media) the maps are automatically merged into a single <script type="importmap">, by customizing and extending what Django does already when merging media instances.
The release notes go into more detail.
In practice
If you're using a package such as django-prose-editor in the Django admin you don't have to do anything, things should just work.
If you're using such a package outside the admin, you have to remove "js_asset.context_processors.importmap" from your list of context processors. On one particular website the prose editor is the only package with importmap entries outside the admin, so I have to add the importmap to the template context myself:
from django_prose_editor.widgets import importmap
def view(request, ...):
return render(request, "template.html", {
# ...
"importmap": importmap,
})
The template then just renders it in the <head>:
... {{ importmap }}</head>
On a different site, I have a slightly more involved scenario where I previously used importmap.update(...) to add my own entries to the importmap. There, I'm using a custom context processor to always add these entries to the importmap too:
from django_prose_editor.widgets import importmap as dpe_importmap
from js_asset import ImportMap, static_lazy
_site_importmap = ImportMap({
"imports": {
"my-module": static_lazy("my-module.js"),
}
})
_importmap = dpe_importmap | _site_importmap
def importmap(request):
return {"importmap": _importmap}
This importmap is merged once at server startup and then served repeatedly to the client. Because we use the lazy version of the static function we can do this during startup and not worry about files not yet collected by collectstatic - we'll get the correct paths later.
On the same site as the previous example, I also have an admin inline which requires some JavaScript and also an importmap:
from django.contrib import admin
from django.forms import Script
from js_asset import Media, ImportMap
# Initializing this once. Not necessary but I like it better that way.
_importmap = ImportMap({
"imports": {
# ...
}
})
class ModelInline(admin.StackedInline):
@property
def media(self):
return Media(
js=[
_importmap,
Script("module.js", type="module"),
]
)
As of 4.0, JS and CSS produce Django's own Script and Stylesheet objects, so you can import and use Script directly from django.forms as shown above (on Django 4.2-5.1, import it from js_asset instead, which backports it). The familiar JS("module.js", {"type": "module"}) wrapper still works unchanged if you prefer it - it just takes a positional dict instead of keyword arguments.
Here, it's really important to use the js_asset.Media and not django.forms.Media. js_asset.Media knows how to handle importmaps - all importmaps are collected from all media lists, merged and added to the output before all other CSS and especially JavaScript. The reason for that is that browsers only honour a single importmap per page, and it really has to appear before all JavaScript modules referencing any entries in the importmap.
The nice thing about js_asset.Media is that it doesn't have to appear first in the list of media classes which are merged - it can also appear in the middle or last, and still can do its magic after all Media objects have been merged into a single one.
The rest is handled by Django itself, since it already supports collecting media assets. The missing piece was just the importmap object and the js_asset.Media class which knows how to special case them, and which - through the power of overriding __add__ and __radd__ takes over all the other media instances.
What's next
I haven't yet used CSP nonces using {% csp_nonce_attr media %} in production myself, but it should just work, even with importmaps and everything else. Given that I have a passing test suite I have no reason to believe it doesn't already work, but I'd like to have a confirmation.
I'm hoping to standardize some more. If we could get something like this in Django core that would be really nice. Maybe I'll be able to work on that at Django on the Med 🏖️. Since no browser supports multiple importmaps as of today having multiple implementations of importmaps in the Django ecosystem will lead to trouble down the road. I think there is a clear case to be made for importmap support in Django and I would obviously love it if the approach implemented today in django-js-asset would be the basis for the official solution.
-
Without having to do any overrides to enable ESM support. ↩
17 Jun 2026 5:00pm GMT
16 Jun 2026
Django community aggregator: Community blog posts
Cheating as a programming discipline
Great programmers cheat. A hard problem gets quietly swapped for an easier one; a transaction-grade database is replaced by a flat file nobody misses; machinery everyone else considers mandatory simply never gets built. They know a lot - and that's exactly why they get away with it.

16 Jun 2026 11:00am GMT
09 Jun 2026
Planet Twisted
Hynek Schlawack: How to Ditch Codecov for Python Projects
Codecov's unreliability breaking CI on my open source projects has been a constant source of frustration for me for years. I have found a way to enforce coverage over a whole GitHub Actions build matrix that doesn't rely on third-party services.
09 Jun 2026 12:00am GMT
22 May 2026
Planet Twisted
Glyph Lefkowitz: Opaque Types in Python
Let's say you're writing a Python library.
In this library, you have some collection of state that represents "options" or "configuration" for a bunch of operations. Such a set of options is a bundle of potentially ever-increasing complexity. Thus, you will want it to have an extremely minimal compatibility surface, with a very carefully chosen public interface, that is either small, or perhaps nothing at all. Such an object conveys state and might have some private behavior, but all you want consumers to be able to do is build it in very constrained, specific ways, and then pass it along as a parameter to your own APIs.
By way of example, imagine that you're wrapping a library that handles shipping physical packages.
There are a zillion ways to do it ship a package. There are different carriers who can ship it for you. There's air freight, and ground freight, and sea freight. There's overnight shipping. There's the option to require a signature. There's package tracking and certified mail. Suffice it to say, lots of stuff.
If you are starting out to implement such a library, you might need an object called something like ShippingOptions that encapsulates some of this. At the core of your library you might have a function like this:
1 2 3 4 5 |
|
If you are starting out implementing such a library, you know that you're going to get the initial implementation of ShippingOptions wrong; or, at the very least, if not "wrong", then "incomplete". You should not want to commit to an expansive public API with a ton of different attributes until you really understand the problem domain pretty well.
Yet, ShippingOptions is absolutely vital to the rest of your library. You'll need to construct it and pass it to various methods like estimateShippingCost and shipPackage. So you're not going to want a ton of complexity and churn as you evolve it to be more complex.
Worse yet, this object has to hold a ton of state. It's got attributes, maybe even quite complex internal attributes that relate to different shipping services.
Right now, today, you need to add something so you can have "no rush", "standard" and "expedited" options. You can't just put off implementing that indefinitely until you can come up with the perfect shape. What to do?
The tool you want here is the opaque data type design pattern. C is lousy with such things (FILE, pthread_*_t, fd_set, etc). A typedef in a header file can easily achieve this.
But in Python, if you expose a dataclass - or any class, really - even if you keep all your fields private, the constructor is still, inherently, public. You can make it raise an exception or something, but your type checker still won't help your users; it'll still look like it's a normal class.
Luckily, Python typing provides a tool for this: typing.NewType.
Let's review our requirements:
- We need a type that our client code can use in its type annotations; it needs to be public.
- They need to be able to consruct it somehow, even if they shouldn't be able to see its attributes or its internal constructor arguments.
- To express high-level things (like "ship fast") that should stay supported as we add more nuanced and complex configurations in the future (like "ship with the fastest possible option provided by the lowest-cost carrier that supports signature verification").
In order to solve these problems respectively, we will use:
- a public
NewType, which gives us our public name... - which wraps a private class with entirely private attributes, to give us an actual data structure, while not exposing the constructor,
- a set of public constructor functions, which returns our
NewType.
When we put that all together, it looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
As a snapshot in time, this is not all that interesting; we could have just exposed _RealShipOpts as a public class and saved ourselves some time. The fact that this exposes a constructor that takes a string is not a big deal for the present moment. For an initial quick and dirty implementation, we can just do checks like if options._speed == "fast" in our shipping and estimation code.
However, the main thing we are doing here is preserving our flexibility to evolve the related APIs into the future, so let's see how we might do that. For example, let's allow the shipping options to contain a concrete and specific carrier and freight method:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
|
As a NewType, our public ShippingOptions type doesn't have a constructor. Since _RealShipOpts is private, and all its attributes are private, we can completely remove the old versions.
Anything within our shipping library can still access the private variables on ShippingOptions; as a NewType, it's the same type as its base at runtime, so it presents minimal1 overhead.
Clients outside our shipping library can still call all of our public constructors: shipFast, shipNormal, and shipSlow all still work with the same (as far as calling code knows) signature and behavior.
If you need to build and convey some state within your public API, while avoiding breakages associated with compatibility churn, hopefully this technique can help you do that!
Acknowledgments
Thanks for reading, and thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor.
-
The overhead is minimal, but it is not completely zero. The suggested idiom for converting to a
NewTypeis to call it like a function, as I've done in these examples, but if you are wanting to use this pattern inside of a hot loop, you can use# type: ignore[return-value]comments to avoid that small cost. ↩
22 May 2026 12:33am GMT