30 May 2026

feedPlanet Python

death and gravity: DynamoDB crash course: part 3 – design patterns

This is the last part of a series covering core DynamoDB concepts. The goal is to help you understand idiomatic usage and trade-offs in under an hour.

In the first part, I summarized DynamoDB's main proposition to its users like so:

data modeling complexity is always preferable to complexity coming from infrastructure maintenance, availability, and scalability

Today, we're looking at the design patterns to help manage this complexity, make the most of DynamoDB's data model and features, and work around its limits.

Composite keys

Composite (aka synthetic) keys underpin most other patterns.

The idea is simple: keys don't have to be natural attributes of your data, they can be composed of other attributes that enable specific access patterns. This works both with table and index keys.

How do you compose keys? By string concatenation, of course! (Careful with numbers though, they need padding to be useful in sort keys.)
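
To see why the padding matters, here is a minimal sketch in plain Python (no DynamoDB involved). The helpers composite_key and padded are hypothetical, and the five-digit width is an assumption that only works for non-negative numbers below 100000.

```python
def composite_key(*parts: str) -> str:
    """Join key parts with '#', the separator used in this article's examples."""
    return "#".join(parts)


def padded(number: int, width: int = 5) -> str:
    """Zero-pad a non-negative integer so string order matches numeric order."""
    return str(number).zfill(width)


tracks = [2, 10, 1]

# Without padding, "10" sorts before "2": strings compare character by character.
unpadded = sorted(composite_key("track", str(n)) for n in tracks)

# With padding, lexicographic order and numeric order agree.
padded_keys = sorted(composite_key("track", padded(n)) for n in tracks)
```

The width has to be chosen up front, since changing it later means rewriting every existing key.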

Example

To sort lexicographically by more than one attribute, you group them in a sort key, e.g. {Album}#{Song}.

Or, in single table design, you distinguish between item types by prefixing keys with the type, e.g. album#{Album}.

Or, in partition key sharding, you spread the load on a GSI partition by splitting one partition key into multiple ones, e.g. {Genre}#{shard}.

But denormalization has its trade-offs. For sort key {Album}#{Song}, should Album and Song also be separate attributes? If yes, you need to ensure they never change, but you can use them in indexes (e.g. a GSI with Album as partition key). If no, the item cannot become inconsistent, but you always need to parse the key.

This was inconvenient enough that DynamoDB finally added multi-attribute keys support to GSIs in 2025 (although not inconvenient enough to also add it to tables).

Single table design

The AWS guidance is to use as few tables as possible:

As a general rule, you should maintain as few tables as possible in a DynamoDB application. [...] A single table with inverted indexes can usually enable simple queries to create and retrieve the complex hierarchical data structures required by your application.

This culminates in single table design, where you put all entities in the same table, and tell them apart based on the key format, usually using a prefix. With this pattern, one DynamoDB table corresponds to a whole relational database.

The easiest way is to put items related to a top-level entity on the same partition. The main benefit is that joins with the top-level entity become trivial. A second one is that you can sometimes get different entity types in a single query, which can be both faster and cheaper (fewer queries; small items pack into fewer capacity units).

Example

You can group items related to an Artist on the same partition, with sort keys like artist, album#{Album}, and song#{Album}#{Song}.

# table Music (partition key: Artist, sort key: sk)
Solar Fields: !btree
  'album#Leaving Home': { Genre: Electronic }
  'artist': { Variations: [ Solarfields ] }
  'song#Leaving Home#Air Song': { Duration: 741 }
  'song#Leaving Home#Monogram': { Duration: 944 }

Besides getting items of a single type, you can also get artist details and albums in a single query (sk BETWEEN "album#" AND "artist").
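
The range trick can be checked without DynamoDB. The sketch below simulates one partition as a dict and a BETWEEN condition (inclusive on both ends) with bisect; query_between is a hypothetical stand-in for a real Query call, not an SDK function.

```python
from bisect import bisect_left, bisect_right

# One partition of the Music table above (partition key: "Solar Fields").
partition = {
    "album#Leaving Home": {"Genre": "Electronic"},
    "artist": {"Variations": ["Solarfields"]},
    "song#Leaving Home#Air Song": {"Duration": 741},
    "song#Leaving Home#Monogram": {"Duration": 944},
}


def query_between(items, low, high):
    """Return (sort key, item) pairs with low <= sort key <= high, in key order."""
    keys = sorted(items)
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return [(key, items[key]) for key in keys[start:end]]


result = query_between(partition, "album#", "artist")
```

This works because every album# key sorts before artist, and every song# key sorts after it, so the range covers exactly the album items and the artist item.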

But choose wisely - queries can have only one sort key condition, so you can't also get album details and songs in a single query with this schema; sort keys {Album} and {Album}#{Song} would do it, at the expense of the first query.

Sometimes, it can be useful to put some sub-entities on dedicated partitions, accepting that joins will have to be done in code.

Example

In the example above, a popular artist with lots of songs can lead to:

Perhaps it's better to put songs in each album on separate partitions:

# table Music (partition key: pk, sort key: sk)
'artist#Solar Fields': !btree
  'album#Leaving Home': { Genre: Electronic }
  'artist': { Variations: [ Solarfields ] }
'song#Solar Fields#Leaving Home': !btree
  'Air Song': { Duration: 741 }
  'Monogram': { Duration: 944 }

This spreads the load onto multiple partitions, which should fix throttling.

The downside is that list songs for artist is now a two-step operation: first one query for the albums, then one query per album for the songs. The upside is that the per-album queries can be done in parallel, which wasn't possible before.
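
A rough sketch of that two-step read, using an in-memory dict in place of the table. The query helper is hypothetical (it mimics a Query with a sort key prefix), and the thread pool is just one way to run the per-album queries concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-in for the Music table above: pk -> {sk: item}.
TABLE = {
    "artist#Solar Fields": {
        "album#Leaving Home": {"Genre": "Electronic"},
        "artist": {"Variations": ["Solarfields"]},
    },
    "song#Solar Fields#Leaving Home": {
        "Air Song": {"Duration": 741},
        "Monogram": {"Duration": 944},
    },
}


def query(pk, sk_prefix=""):
    """Hypothetical Query: items of one partition whose sort key has the prefix."""
    partition = TABLE.get(pk, {})
    return [(sk, item) for sk, item in sorted(partition.items()) if sk.startswith(sk_prefix)]


def list_songs(artist):
    # Step 1: one query for the artist's albums.
    albums = [sk.removeprefix("album#") for sk, _ in query(f"artist#{artist}", "album#")]
    # Step 2: one query per album, issued in parallel.
    with ThreadPoolExecutor() as pool:
        per_album = pool.map(lambda album: query(f"song#{artist}#{album}"), albums)
        return [(album, title) for album, songs in zip(albums, per_album) for title, _ in songs]
```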

A consequence of this design is that you need a GSI to list items of a specific type (otherwise, you have to do a full table scan). Of note, exceeding the GSI partition throughput limit will cause write throttling on the base table; in the absence of a natural high-cardinality GSI partition key, sharding or some other composite key can help.

A final benefit of using a single table is better utilization with provisioned mode: usage gets averaged across entities and tends to be smoother, and spikes can share the same spare capacity.

GSI overloading

GSI overloading is just single table design for indexes - you put different values in the GSI key attributes, depending on item type. This way you can index more attributes than the 20 GSIs per table quota, and it can be cheaper too, since fewer indexes make better use of spare provisioned capacity.

Example

For a table that contains both artist and album items, a single GSI can be used for entirely different purposes:

# table Music (partition key: Artist, sort key: sk)
2 Bit Pie: !btree
  'album#2 Pie Island': { gsi1pk: 'album#Electronic' }
  'artist': { gsi1pk: 'artist#United Kingdom' }
Ishome: !btree
  'album#Confession': { gsi1pk: 'album#Electronic' }
  'artist': { gsi1pk: 'artist#Russia' }
# GSI GSI1 (partition key: gsi1pk, sort key: Artist)
'artist#United Kingdom': !btree
  2 Bit Pie: { sk: 'artist' }
'artist#Russia': !btree
  Ishome: { sk: 'artist' }
'album#Electronic': !btree
  2 Bit Pie: { sk: 'album#2 Pie Island' }
  Ishome: { sk: 'album#Confession' }

Partition key sharding

Sometimes, a partition key composed of multiple natural attributes is not enough to spread the load evenly across partitions; you can deal with this by putting items with the same natural attributes on multiple partitions.

So, what partition key should you use? One option is to use a random suffix from a known range; this still allows you to list items for a natural attribute value by doing multiple queries, one for each suffix.

Example

For a table of songs, using Album as the partition key won't work, since not all songs are released on an album; Artist always has a value, but some artists have hundreds or even thousands of songs, which can lead to throttling.

Instead, we can use {Artist}#{randrange(10)} as partition key, which allows ten times as many items before we reach throughput limits. To list an artist's songs:

def list_songs(artist):
    # dynamodb.query() stands in for a Query on a single partition key
    for shard in range(10):
        for item in dynamodb.query(f"{artist}#{shard}"):
            yield item

A downside of random suffixes is that you can't get a specific item, because you don't know what its suffix is. A better option is to calculate the suffix from an attribute that you do know, for example using its hash modulo N.

Example

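With partition key {Artist}#{hash(Song) % 10}, we can get a song like this:

from hashlib import sha256

def hash(s):
    # stable across runs, unlike the built-in hash(), which is randomized per process for strings
    return int.from_bytes(sha256(s.encode()).digest(), "big")

shard = hash(song_title) % 10
dynamodb.get_item(f"{artist}#{shard}", song_title)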

A lot of times you need to list items by a low-cardinality attribute, so sharding may be even more important for GSIs.

Example

Assuming dedicated album items, you can list all the albums by putting them in a single GSI partition key called albums, but this will definitely cause throttling.

To avoid it, you can use GSI partition key album#{hash(Album) % 100} if you don't care about the order, or something like album#{Album[:2].lower()} if you do (but likely more sophistication is needed - "th" will be a very common album title prefix, and some album titles don't contain letters at all).

Even if throttling is not an issue (e.g. single infrequent reader), sharding allows you to query multiple partitions in parallel, which can speed up getting the entire result set.
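
As an illustration, the sketch below writes albums to hash-derived shards and lists them by querying every shard in parallel. The dict is an in-memory stand-in for the GSI, the shard count of 4 is arbitrary, and put_album, query, and list_albums are hypothetical helpers.

```python
from concurrent.futures import ThreadPoolExecutor
from hashlib import sha256

NUM_SHARDS = 4  # assumption: picked arbitrarily for the sketch

# In-memory stand-in for the GSI: partition key -> album titles.
index: dict[str, list[str]] = {}


def shard_of(value: str, shards: int = NUM_SHARDS) -> int:
    """Derive a stable shard number from the value itself."""
    digest = sha256(value.encode()).digest()
    return int.from_bytes(digest, "big") % shards


def put_album(album: str) -> None:
    index.setdefault(f"album#{shard_of(album)}", []).append(album)


def query(pk: str) -> list[str]:
    """Hypothetical Query on one GSI partition."""
    return sorted(index.get(pk, []))


def list_albums() -> list[str]:
    # One query per shard, all in flight at the same time.
    keys = [f"album#{shard}" for shard in range(NUM_SHARDS)]
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        return [album for chunk in pool.map(query, keys) for album in chunk]
```

Results come back grouped by shard, so any global ordering has to be restored by the caller.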


So, how many shards should you have? That depends on the number and size of the items and on how often you access them, and is also a trade-off - too many shards means additional queries and latency, too few shards means you still overload the partitions sometimes.

Importantly, increasing the number of shards is non-trivial. For tables, you usually need to rebalance the items in place. For indexes, it's cleaner to move to a new index, or if you just need to list items by type, you can put all new items on new shards.

Regardless, you have to support it in code, do a backfill, and orchestrate the migration, which all become more complex if downtime and inconsistencies are not acceptable (e.g. if you expose a pagination token based on LastEvaluatedKey, you may want to support both versions during the switch).

Sparse indexes

An item with missing index partition/sort key attributes won't appear in the index, and you won't pay for it. This can be used deliberately to query a subset of the items in the table, like those of a specific type or in a specific state.
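
The filtering behaviour can be sketched in a few lines of plain Python. The items and the build_sparse_index helper are made up for illustration; a real GSI is maintained by DynamoDB, not by your code.

```python
# Made-up items: only the second one has the CoverOf attribute.
table = [
    {"pk": "song#1", "Title": "Original Song"},
    {"pk": "song#2", "Title": "Cover Song", "CoverOf": "Original Song"},
    {"pk": "song#3", "Title": "Another Song"},
]


def build_sparse_index(items, key_attr):
    """Mimic a GSI keyed on key_attr: items without the attribute are left out."""
    index = {}
    for item in items:
        if key_attr in item:
            index.setdefault(item[key_attr], []).append(item)
    return index


covers = build_sparse_index(table, "CoverOf")
```

Scanning covers touches one item instead of three, which is the whole point of keeping the index sparse.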

Example

Assuming dedicated album items, an alternative way to list all the albums is to have a GSI with {Album} as partition key, and just scan the entire index (the primary key has to be a dedicated attribute that only albums have, so that only album items appear in the index).

Or, you can use a dedicated GSI with CoverOf as primary key to list cover songs.

Base table indexes

In some cases, GSIs won't cut it - maybe you need a strongly consistent index, or need to model a many-to-one relationship (indexes map one item in the base table to one item in the index).

Instead, you can maintain an index in the base table by having additional index items associated with the main item; to guarantee atomic updates, use transactions. You then go from the main item to the index items via a main item attribute, and from the index items to the main item via their partition key.

Example

Songs have different identifiers in external systems, such as ISRC, ISWC, or MBID. To query songs by multiple external ids, you'd structure your database like this:

(Alternatively, you could have one sparse index per external id type, but then you lose strong consistency, and risk running out of GSIs).

Note that modeling one-to-many relationships isn't this involved, since it fits neatly into the related-items-same-partition variant of single table design.

Optimistic locking

Optimistic locking is a concurrency control method useful when conflicts are rare, so instead of acquiring a lock to do changes, you check if someone else changed the data right before committing, as part of an atomic operation.

In DynamoDB, that operation is a conditional write; items get an integer version attribute, and every time you want to update an item, you:

  1. read the item, including the version
  2. increment the version and modify the item
  3. update the item, using a condition expression to ensure the version matches
    1. if successful, you're done
    2. else, start over from the beginning
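
The loop above can be sketched against an in-memory table. FakeTable is a hypothetical stand-in whose put() checks the version the way a condition expression would; it is not the DynamoDB API, and the retry limit is an assumption.

```python
import copy


class ConditionFailed(Exception):
    """Raised when the stored version is not the expected one."""


class FakeTable:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return copy.deepcopy(self._items[key])

    def put(self, key, item, expected_version=None):
        # expected_version=None means "the item must not exist yet".
        current = self._items.get(key)
        current_version = current["version"] if current else None
        if current_version != expected_version:
            raise ConditionFailed(key)
        self._items[key] = copy.deepcopy(item)


def update_with_retry(table, key, modify, max_attempts=5):
    for _ in range(max_attempts):
        item = table.get(key)                  # 1. read the item, including the version
        expected = item["version"]
        modify(item)                           # 2. modify the item and bump the version
        item["version"] = expected + 1
        try:
            table.put(key, item, expected_version=expected)  # 3. conditional write
            return item                        # 3.1 success
        except ConditionFailed:
            continue                           # 3.2 someone else won, start over
    raise RuntimeError(f"gave up on {key!r} after {max_attempts} attempts")
```

Note that modify may run more than once, so it should have no side effects outside the item.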

You can also do this in transactions to update groups of related items, like in the base table index pattern above, with only the main item needing a version.

The upside of optimistic locking is that it is faster on average, since updates usually succeed on the first try; for fewer conflicts, use strongly consistent reads.

The downside is that it requires explicit support - it must be possible to start over from the beginning, which complicates logic, especially if you need to interact with other systems besides updating the item (e.g. to send a notification).

Anyway, that's it for now.

See also

For more details and examples, check out the official documentation:

30 May 2026 6:00pm GMT

Talk Python to Me: #550: AI Contributions and Maintainer Load in Open Source

You wake up, brew the coffee, open GitHub, and there it is. Another pull request on your open source project. Thirteen thousand lines added. No issue filed first. No discussion. Just "here, please review this for me."

Over the past year, GitHub activity has spiked roughly twelve times in a few short months, and a huge chunk of that signal is landing on the same small group of maintainers who were already stretched thin. The curl bug bounty got buried under AI-generated noise. Jazzband, the home of Django classics like pip-tools and the Django debug toolbar, hit what its maintainer called an "apocalypse" and started sunsetting. Even CPython just shipped fresh guidelines on AI-assisted contributions this week.

So what does all of this actually look like from the receiving end of the pull request?

On this episode, Paolo Melchiorre joins us to tell that story from inside the maintainer's chair. Paolo is a director of the Django Software Foundation, an organizer of PyCon Italy, a Django Girls coach, and he has spent the past year carefully collecting examples of how AI is reshaping open source contributions. The good, the bad, and the extra fingers.

We dig into his PyCon US talk on AI-assisted contributions and maintainer load, why AI is best understood as an amplifier rather than a new kind of contributor, the wildly different policies across 86 open source foundations, and whether projects banning AI today are reacting to last year's models.

Episode sponsors

AgentField AI: https://talkpython.fm/agentfield-page
Talk Python Courses: https://talkpython.fm/training

Links from the show

Guest
Paolo Melchiorre: https://github.com/pauloxnet?featured_on=talkpython

DSF: https://www.djangoproject.com/foundation/?featured_on=talkpython
djangonaut-space: https://djangonaut.space/?featured_on=talkpython
PyCon Italia: https://2026.pycon.it/en?featured_on=talkpython
uDjango: https://github.com/pauloxnet/uDjango?featured_on=talkpython
My PyCon US 2026 post: https://www.paulox.net/2026/05/21/my-pycon-us-2026/?featured_on=talkpython
AI-Assisted Contributions and Maintainer Load: https://www.paulox.net/2026/05/15/pycon-us-2026/?featured_on=talkpython
Senior Engineer Tries Vibe Coding: https://www.youtube.com/watch?v=_2C2CNmK7dQ
Code Rabbit AI PR Reviews: https://www.coderabbit.ai?featured_on=talkpython
GitHub Usage Graphs: https://github.blog/news-insights/company-news/an-update-on-github-availability/?featured_on=talkpython
Update on CPython's AI Policies: https://fosstodon.org/@mariatta/116610508567734365
High-Quality Chaos from Curl: https://daniel.haxx.se/blog/2026/04/22/high-quality-chaos/?featured_on=talkpython
The Generative AI Policy Landscape in Open Source: https://redmonk.com/kholterhoff/2026/02/26/generative-ai-policy-landscape-in-open-source/?featured_on=pythonbytes

Watch this episode on YouTube: https://www.youtube.com/watch?v=1RJ1kkpTdow
Episode #550 deep-dive: https://talkpython.fm/episodes/show/550/ai-contributions-and-maintainer-load-in-open-source#takeaways-anchor
Episode transcripts: https://talkpython.fm/episodes/transcript/550/ai-contributions-and-maintainer-load-in-open-source

Theme Song: Developer Rap
🥁 Served in a Flask 🎸: https://talkpython.fm/flasksong

---== Don't be a stranger ==---
YouTube: https://talkpython.fm/youtube
Bluesky: https://bsky.app/profile/talkpython.fm
Mastodon: https://fosstodon.org/web/@talkpython
X.com: https://x.com/talkpython

Michael on Bluesky: https://bsky.app/profile/mkennedy.codes?featured_on=talkpython
Michael on Mastodon: https://fosstodon.org/web/@mkennedy
Michael on X.com: https://x.com/mkennedy?featured_on=talkpython

30 May 2026 3:43pm GMT

Bob Belderbos: The control layer is the product, not the model

Gary Bernhardt posted something this week that names a phenomenon we're teaching in our agentic AI cohort:

Everyone seems fixated on the models, but I think there's so much low-hanging fruit in the control layer above the model. "Agent" and "harness" sell that layer short. There's so much more that we can do beyond "read input, send to model, run commands it returns."

He's right. The model is a brain in a jar. Useful, fast, occasionally wrong, stateless. Everything that turns it into a product lives in the code that wraps it: the routing, the validation, the state, the audit trail. Gary calls that the control layer. I'm stealing the term.

One of the replies under the tweet nailed the design goal in a single question: do you actually know what the agent is going to do?

That's what a control layer buys you. Not magic, not autonomy, predictability. A workflow where, by the time the model is called, the next move is already constrained to something safe.

Why "agent" and "harness" sell it short

When a developer says "I'm building an agent", they usually mean a while True loop that pings an LLM, parses a tool call, runs it, feeds the result back, and repeats. That pattern works for demos. It rarely survives contact with a real workflow.

The word "harness" makes the wrapping code sound passive, a strap that holds the model in place. It's actually the control layer where the engineering happens. The model is a function call inside it. Once you flip that mental model, you stop asking "which LLM should I use" and start asking "what guarantees does my control layer make?" and "how can I make the inherently unpredictable model fit into a predictable workflow?"

These are the questions production teams have to answer.

Pattern 1: deterministic state machines, not unconstrained agents

An agent without constraints decides what to do next from inside the model. A state machine decides outside the model and gives the model one bounded job at each step. The pipeline runs categorize → validate → confirm → persist, and the LLM only ever gets called inside one of those buckets.

This shifts control flow back to your code, where you can test it, log it, and reason about it. The expense agent we build in our cohort, which I broke down in How an AI expense agent is actually structured, follows exactly this pattern: Protocol-defined LLM boundary, Pydantic-validated outputs, service layer holds the state, human-in-the-loop (HITL) confirms before anything writes. Four layers, no free-roaming agent, constraints at every step.
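
A minimal sketch of that pipeline, assuming a keyword-matching function in place of the model. The step names come from the paragraph above; fake_llm_categorize, ALLOWED, and run are hypothetical and not taken from the cohort's code.

```python
from enum import Enum, auto


class Step(Enum):
    CATEGORIZE = auto()
    VALIDATE = auto()
    CONFIRM = auto()
    PERSIST = auto()
    DONE = auto()


# The transition table lives in code, not in the model's head.
NEXT = {
    Step.CATEGORIZE: Step.VALIDATE,
    Step.VALIDATE: Step.CONFIRM,
    Step.CONFIRM: Step.PERSIST,
    Step.PERSIST: Step.DONE,
}
ALLOWED = {"travel", "meals", "other"}


def fake_llm_categorize(text: str) -> str:
    """Stand-in for the single bounded model call."""
    return "travel" if "taxi" in text.lower() else "other"


def run(text, confirm, store):
    state = {"text": text}
    step, trace = Step.CATEGORIZE, []
    while step is not Step.DONE:
        trace.append(step.name)
        if step is Step.CATEGORIZE:
            state["category"] = fake_llm_categorize(text)
        elif step is Step.VALIDATE:
            if state["category"] not in ALLOWED:
                raise ValueError(f"unknown category: {state['category']}")
        elif step is Step.CONFIRM:
            if not confirm(state):      # human-in-the-loop gate
                return trace, None      # nothing is written on rejection
        elif step is Step.PERSIST:
            store.append(state)
        step = NEXT[step]
    return trace, state
```

The trace makes every run auditable, and the model never gets to pick the next step.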

Pattern 2: the model behind a typed boundary

The model should be one swappable function call inside your control layer, not a dependency threaded through every layer. In our cohort the LLM lives behind a Python Protocol: a small interface the service layer depends on, so nothing downstream knows or cares whether the call goes to OpenAI or Anthropic.

Once the boundary is a Protocol, the decisions people reach for "routing" to solve become wiring instead of rewrites. Picking a cheap fast model for a 12-way classification and saving the expensive one for hard reasoning is a one-line change. Falling back to a second provider when the first is rate-limited is a small factory, not a refactor. Swapping OpenAI for Anthropic, two SDKs that disagree on almost every detail, touches one file because the boundary absorbs the difference.

And it makes the whole pipeline testable. Tests pass a mock that satisfies the Protocol, so you exercise every path without an API call incurring latency or cost.
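
Here is a small sketch of such a boundary. All names (Categorizer, KeywordCategorizer, FallbackCategorizer, ExpenseService) are hypothetical, and the keyword matcher stands in for a real provider client.

```python
from typing import Protocol


class Categorizer(Protocol):
    """The only thing the service layer knows about the model."""

    def categorize(self, description: str) -> str: ...


class KeywordCategorizer:
    """Stand-in for a provider-backed implementation; also usable as a test double."""

    def categorize(self, description: str) -> str:
        return "travel" if "taxi" in description.lower() else "other"


class FallbackCategorizer:
    """Tries the primary implementation, then the secondary one on failure."""

    def __init__(self, primary: Categorizer, secondary: Categorizer) -> None:
        self._primary = primary
        self._secondary = secondary

    def categorize(self, description: str) -> str:
        try:
            return self._primary.categorize(description)
        except RuntimeError:  # assumption: providers signal rate limits this way
            return self._secondary.categorize(description)


class ExpenseService:
    def __init__(self, llm: Categorizer) -> None:
        self._llm = llm

    def record(self, description: str) -> dict:
        return {"description": description, "category": self._llm.categorize(description)}
```

Because Protocol uses structural typing, none of the implementations need to inherit from Categorizer; having the right method is enough.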

Pattern 3: evaluators and guardrails

The model's output is not the user's output. Between the two sits validation: schema checks, business rules, PII filters, sometimes a second model grading the first one's work.

This is the generator-evaluator split, and it's an important pattern (apart from HITL) that I've found for AI code that has to be right. The generator proposes. The evaluator approves or rejects. When the evaluator rejects, control loops back with feedback, not a stack trace.
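
The loop might look roughly like this. The generator and evaluator here are plain functions made up for the sketch (a real generator would call a model), and the limit of three rounds is an assumption.

```python
def generate_with_review(task, generator, evaluator, max_rounds=3):
    """Ask for candidates until the evaluator approves one or we run out of rounds."""
    feedback = None
    for _ in range(max_rounds):
        candidate = generator(task, feedback)
        approved, feedback = evaluator(candidate)
        if approved:
            return candidate
    raise RuntimeError(f"no approved output after {max_rounds} rounds: {feedback}")


def fake_generator(task, feedback):
    # Stand-in for a model call: it only fixes its answer once told what was wrong.
    return "category: travel" if feedback else "probably travel?"


def rule_evaluator(candidate):
    # A business-rule check; a second model could sit behind the same signature.
    if candidate.startswith("category: "):
        return True, None
    return False, "answer must start with 'category: '"
```

Bounding the number of rounds keeps a stubborn generator from looping forever.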

It's also the layer that catches the worst failure mode of multi-step agents. What production AI agents actually require goes deeper on the four questions the control layer answers before any action runs: state, idempotency, audit, rollback.

Pattern 4: structured generation

A raw string from the model is the start of your problems. You can't store it, validate it, or test it well. The fix is to constrain output at the boundary: the model is allowed to speak, but only in shapes your code understands.

Where the typed boundary in Pattern 2 decides where the model sits in your code, structured generation decides what shape it's allowed to emit.

Pydantic plus your model's structured outputs gives you typed data instead of strings, which means the next layer of your control flow becomes ordinary Python.
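
To keep the sketch free of dependencies, the example below swaps Pydantic for a standard-library dataclass with hand-written checks; a Pydantic model would express the same constraints declaratively. Expense, ALLOWED, and parse_expense are hypothetical names.

```python
import json
from dataclasses import dataclass

ALLOWED = {"travel", "meals", "other"}


@dataclass(frozen=True)
class Expense:
    amount_cents: int
    category: str


def parse_expense(raw: str) -> Expense:
    """Turn the model's raw JSON string into typed data, or raise ValueError."""
    data = json.loads(raw)  # malformed JSON raises JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    amount = data.get("amount_cents")
    category = data.get("category")
    # bool is a subclass of int, so it has to be rejected explicitly
    if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
        raise ValueError(f"invalid amount_cents: {amount!r}")
    if category not in ALLOWED:
        raise ValueError(f"invalid category: {category!r}")
    return Expense(amount_cents=amount, category=category)
```

Everything downstream of parse_expense receives an Expense, never a string.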

I covered this in Build the data layer before you touch the LLM, explaining why we teach students to build the schema before they make a single API call.


The frontier models make the headlines. The control layer ships the product. Gary's tweet names a gap that has been there the whole time, between the people optimizing benchmarks and the people building products. The control layer is the product, not the model. If you want to build AI products, that's where you need to spend your time.

If you want a working walkthrough of the patterns above, the 10 small agentic AI exercises Juanjo and I shipped run in the browser and cover the arc from a 3-line model call to a complete loop with HITL. They're the conceptual map.

The cohort is the same map, end to end. Six weeks, no frameworks, the control layer built explicitly, with code review at every step. By the end you can answer that one question: you know what your agent is going to do.

30 May 2026 12:00am GMT

29 May 2026

feedDjango community aggregator: Community blog posts

Issue 339: Early Bird DjangoCon US Tickets Ending Soon

News

DjangoCon US 2026: Early Bird Tickets End May 31st!

Early bird ticket sales for DjangoCon US 2026 end on May 31, 2026, with discounted pricing available. The conference runs five days at Voco Chicago Downtown and includes community-selected talks plus Django contribution sprints.


Wagtail CMS News

Wagtail Space NL - June 12

A full-day conference in Rotterdam, The Netherlands on Wagtail, with talks covering a range of topics, lightning talks, hallway discussions, and more.



Updates to Django

Today, "Updates to Django" is presented by Pradhvan from Djangonaut Space! 🚀

Last week we had 16 pull requests merged into Django by 10 different contributors.

This week's Django highlights: 🦄

If you haven't already, give Django 6.1 alpha 1 a spin and report anything suspicious to the issue tracker! 🎉

That's all for this week in Django development! 🐍🦄


Articles

Upgrade PostgreSQL from 17 to 18 on Ubuntu 26.04

After moving to Ubuntu 26.04, upgrade an existing 17/main cluster to 18 by running pg_upgradecluster 17 main -v 18, then verify the new 18/main cluster is online. Once confirmed, drop the old 17 cluster with pg_dropcluster 17 main and optionally purge postgresql-17 and postgresql-client-17 packages.

My not-so-static new static website

Jake Howard walks through his eighth website rewrite, this time ditching Wagtail for a custom "semi-static" Django setup that renders Markdown content into SQLite at startup and serves it dynamically with Jinja2 templates.

Improving First Byte and Contentful Paint on a Django Website

A look at how to use Django's StreamingHttpResponse to send the <head> and above-the-fold content first, letting the browser fetch static assets and start painting while the rest of the page renders.

PyCon US 2026 Recap - Black Python Devs

A recap ranging from the community booth to open spaces, the hallway track, and Jay Miller receiving the PSF Community Service Award.

django-removals 1.2.0 - Now with Django 6.1 deprecations

How the maintainers of django-removals shipped new warnings for the Django 6.1 deprecation wave.

Mentoring GSoC 2026: Experimental Flags - Software Crafts

Mentor and mentee are starting a GSoC 2026 project around an "Experimental Flags" framework for Django core, using the forum to gather requirements and drive early consensus. The plan balances fast iteration with faster-than-normal Django consensus, including an initial third-party package to test ideas before wider adoption.


Django Forum

GSoC 2026: Implementing a Formal Experimental API Framework for Django Core

A lively discussion around how experimental features can be merged into the main repository but remain explicitly non-stable.

Thoughts on advertising on djangoproject.com

New thoughts and comments on the age-old question.


Django Fellow Reports

Jacob Walls

Not much going on, "just" the 6.1 Feature Freeze/alpha release, a sprint at PyCon US, and a kickoff meeting with Google Summer of Code participants & mentors.

Sarah Boyce

As we had the feature freeze, focused on a few feature PRs I had prioritized for 6.1 release.

Natalia Bidart

This week was mostly about returning from PyCon, which was quite exhausting. I arrived back on Wednesday, fairly drained (and very hungry), so I worked during Thu and Fri catching up on a large backlog of email notifications and syncing with the other Fellows.


Events

Django on the Med - September 23-25 in Pescara, Italy

PyCon Italia this week has had Django members in attendance, so it is a good time to remind readers that Django on the Med will be back in Italy later in the year.


Django Job Board

Founding Engineer at MyDataValue


Projects

feincms/feincms3-cookiecontrol

Cookie banner with support for embedded media.

emfpdlzj/django-deploy-probes

HTTP deployment probes for Django applications.

29 May 2026 2:00pm GMT

27 May 2026

feedDjango community aggregator: Community blog posts

Please add an RSS Feed to Your Site

Why syndication feeds are having a moment in 2026.

27 May 2026 9:57pm GMT

Mentoring GSoC 2026: Experimental Flags

Over the last couple of weeks, Google Summer of Code (GSoC) has started for 2026, and I think that, alongside my mentee, I will blog about it as we progress through the project. So far, there has been a kick-off meeting with all participants and I have started to chat with my mentee (Praful) about the first steps of our project - Experimental Flags. He has posted to the Forum about the project, asking for feedback on what we want from the project.

Before I say any more, please go and pitch in with your opinion and any ideas you may have; the more we have to work with the better! We need you!

What sets this project apart from GSoC projects in recent years is that we have yet to have an agreed solution in place that 'just' needs implementing. So my initial guide will be to focus on consensus gathering and documentation. But this being a GSoC project with limited time availability, I do feel the need to push the process forward at a pace for consensus that is faster than the normal Django pace. That said, the potential for this project is wide and expansive, currently with a lot of open questions both as to why we need them and what should be implemented, and that's before we get to the details of how to implement this.

So for me, on the why of experimental feature flags: most things can be done or can be experimented with as a third-party package. I think the requirement for an experimental feature flag is perhaps for that last 10% of a new API, or where getting higher usage of a feature is required to flesh out all of the use cases with a wider audience; this audience is beyond that of the community. If we think of the adoption curve, we're talking about the early majority, those developers who are more likely to enable a feature inside Django, with its stability guarantees, than a third-party package. Or perhaps this is the project which allows us as a community to get more flexible with what is in the release package(s?) of Django and what code is in the source control repository?

One thing is for sure, I do want to ensure Praful isn't completely stuck, so we will be experimenting with these ideas in a third-party package while we build consensus and then perhaps dogfood the process with this package once consensus has been reached!

Again, go to the Forum and make your opinion known!

27 May 2026 5:00am GMT

22 May 2026

feedPlanet Twisted

Glyph Lefkowitz: Opaque Types in Python

Let's say you're writing a Python library.

In this library, you have some collection of state that represents "options" or "configuration" for a bunch of operations. Such a set of options is a bundle of potentially ever-increasing complexity. Thus, you will want it to have an extremely minimal compatibility surface, with a very carefully chosen public interface, that is either small, or perhaps nothing at all. Such an object conveys state and might have some private behavior, but all you want consumers to be able to do is build it in very constrained, specific ways, and then pass it along as a parameter to your own APIs.

By way of example, imagine that you're wrapping a library that handles shipping physical packages.

There are a zillion ways to ship a package. There are different carriers who can ship it for you. There's air freight, and ground freight, and sea freight. There's overnight shipping. There's the option to require a signature. There's package tracking and certified mail. Suffice it to say, lots of stuff.

If you are starting out to implement such a library, you might need an object called something like ShippingOptions that encapsulates some of this. At the core of your library you might have a function like this:

async def shipPackage(
        how: ShippingOptions,
        where: Address,
    ) -> ShippingStatus:
    ...

If you are starting out implementing such a library, you know that you're going to get the initial implementation of ShippingOptions wrong; or, at the very least, if not "wrong", then "incomplete". You should not want to commit to an expansive public API with a ton of different attributes until you really understand the problem domain pretty well.

Yet, ShippingOptions is absolutely vital to the rest of your library. You'll need to construct it and pass it to various methods like estimateShippingCost and shipPackage. So you're not going to want a ton of complexity and churn as you evolve it to be more complex.

Worse yet, this object has to hold a ton of state. It's got attributes, maybe even quite complex internal attributes that relate to different shipping services.

Right now, today, you need to add something so you can have "no rush", "standard" and "expedited" options. You can't just put off implementing that indefinitely until you can come up with the perfect shape. What to do?

The tool you want here is the opaque data type design pattern. C is lousy with such things (FILE, pthread_*_t, fd_set, etc). A typedef in a header file can easily achieve this.

But in Python, if you expose a dataclass - or any class, really - even if you keep all your fields private, the constructor is still, inherently, public. You can make it raise an exception or something, but your type checker still won't help your users; it'll still look like it's a normal class.
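To make the problem concrete, here is a minimal sketch. `LeakyShippingOptions` is a hypothetical name used only for this illustration; it is not part of the library being described.

```python
from dataclasses import dataclass


@dataclass
class LeakyShippingOptions:
    # The leading underscore marks the field as private by convention only.
    _speed: str


# Nothing stops client code from calling the generated constructor directly,
# and the call matches the signature of the generated __init__, so a type
# checker has no reason to flag it.
opts = LeakyShippingOptions("warp speed")
```

Once callers write code like this, the constructor signature is effectively part of your public API, whether you intended it or not.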

Luckily, Python typing provides a tool for this: typing.NewType.

Let's review our requirements:

  1. We need a type that our client code can use in its type annotations; it needs to be public.
  2. They need to be able to construct it somehow, even if they shouldn't be able to see its attributes or its internal constructor arguments.
  3. We need to express high-level things (like "ship fast") that should stay supported as we add more nuanced and complex configurations in the future (like "ship with the fastest possible option provided by the lowest-cost carrier that supports signature verification").

In order to solve these problems respectively, we will use:

  1. a public NewType, which gives us our public name...
  2. which wraps a private class with entirely private attributes, to give us an actual data structure, while not exposing the constructor,
  3. a set of public constructor functions, which return our NewType.

When we put that all together, it looks like this:

from dataclasses import dataclass
from typing import Literal, NewType

@dataclass
class _RealShipOpts:
    _speed: Literal["fast", "normal", "slow"]

ShippingOptions = NewType("ShippingOptions", _RealShipOpts)

def shipFast() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("fast"))

def shipNormal() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("normal"))

def shipSlow() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("slow"))

As a snapshot in time, this is not all that interesting; we could have just exposed _RealShipOpts as a public class and saved ourselves some time. The fact that this exposes a constructor that takes a string is not a big deal for the present moment. For an initial quick and dirty implementation, we can just do checks like if options._speed == "fast" in our shipping and estimation code.
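A quick and dirty check of that kind might look like the sketch below. The body of `estimateShippingCost` and the prices in it are invented for illustration; only the function name and the `_speed` check come from the text above.

```python
from dataclasses import dataclass
from typing import Literal, NewType


@dataclass
class _RealShipOpts:
    _speed: Literal["fast", "normal", "slow"]


ShippingOptions = NewType("ShippingOptions", _RealShipOpts)


def shipFast() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("fast"))


def shipSlow() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("slow"))


def estimateShippingCost(how: ShippingOptions) -> int:
    # Library-internal code may read the private attribute directly.
    # The prices (in cents) are placeholders made up for this example.
    if how._speed == "fast":
        return 2500
    if how._speed == "normal":
        return 1000
    return 500
```

Client code only ever writes `estimateShippingCost(shipFast())`, so it never depends on the `_speed` attribute or its string values.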

However, the main thing we are doing here is preserving our flexibility to evolve the related APIs into the future, so let's see how we might do that. For example, let's allow the shipping options to contain a concrete and specific carrier and freight method:

from dataclasses import dataclass
from enum import Enum, auto
from typing import NewType

class Carrier(Enum):
    FedEx = auto()
    USPS = auto()
    DHL = auto()
    UPS = auto()

class Conveyance(Enum):
    air = auto()
    truck = auto()
    train = auto()

@dataclass
class _RealShipOpts:
    _carrier: Carrier
    _freight: Conveyance

ShippingOptions = NewType("ShippingOptions", _RealShipOpts)

def shipFast() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.FedEx, Conveyance.air))

def shipNormal() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.UPS, Conveyance.truck))

def shipSlow() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.USPS, Conveyance.train))

def shippingDetailed(
    carrier: Carrier, conveyance: Conveyance
) -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(carrier, conveyance))

As a NewType, our public ShippingOptions type doesn't have a constructor. Since _RealShipOpts is private, and all its attributes are private, we can completely remove the old versions.

Anything within our shipping library can still access the private variables on ShippingOptions; as a NewType, it's the same type as its base at runtime, so it presents minimal[1] overhead.
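The "same type at runtime" claim is easy to check. This is a small self-contained sketch, using a simplified `_RealShipOpts` with a plain `str` field rather than the full class above:

```python
from dataclasses import dataclass
from typing import NewType


@dataclass
class _RealShipOpts:
    _speed: str


ShippingOptions = NewType("ShippingOptions", _RealShipOpts)

real = _RealShipOpts("fast")
wrapped = ShippingOptions(real)

# Calling a NewType returns its argument unchanged; no wrapper object exists.
same_object = wrapped is real
# So the runtime class is still the private base class.
runtime_class_name = type(wrapped).__name__
```

The distinction between `ShippingOptions` and `_RealShipOpts` exists only for the type checker.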

Clients outside our shipping library can still call all of our public constructors: shipFast, shipNormal, and shipSlow all still work with the same (as far as calling code knows) signature and behavior.

If you need to build and convey some state within your public API, while avoiding breakages associated with compatibility churn, hopefully this technique can help you do that!


Acknowledgments

Thanks for reading, and thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor.


  1. The overhead is minimal, but it is not completely zero. The suggested idiom for converting to a NewType is to call it like a function, as I've done in these examples, but if you are wanting to use this pattern inside of a hot loop, you can use # type: ignore[return-value] comments to avoid that small cost.
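As a sketch of that hot-loop variant, assuming mypy-style error codes, the constructor function can return the private class directly and silence the checker:

```python
from dataclasses import dataclass
from typing import Literal, NewType


@dataclass
class _RealShipOpts:
    _speed: Literal["fast", "normal", "slow"]


ShippingOptions = NewType("ShippingOptions", _RealShipOpts)


def shipFast() -> ShippingOptions:
    # Skip the runtime ShippingOptions(...) call. The comment silences the
    # type checker's complaint about returning the base type instead.
    return _RealShipOpts("fast")  # type: ignore[return-value]
```

The public signature is unchanged, so callers cannot tell the two versions apart.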

22 May 2026 12:33am GMT

04 Apr 2026

feedPlanet Twisted

Donovan Preston: Using osascript with terminal agents on macOS

Here is a useful trick that is unreasonably effective for simple computer use goals using modern terminal agents. On macOS, there has been a terminal osascript command since the original release of Mac OS X. All you have to do is suggest your agent use it and it can perform any application control action available in any AppleScript dictionary for any Mac app. No MCP setup or tools required at all. Agents are much more adept at using raw terminal commands, especially ones that haven't changed in 30 years. Having a computer control interface that hasn't changed in 30 years and has extensive examples in the Internet corpus makes modern models understand how to use these tools basically effortlessly.

macOS locks down these permissions pretty heavily nowadays though, so you will have to grant the application control permission to Terminal. But once you have done that, the range of possibilities for commanding applications using natural language is quite extensive.

Also, for both Safari and Chrome on Mac, you are going to want to turn on the JavaScript over AppleScript permission. This basically allows Claude or another agent to debug your web applications live for you as you are using them. In Chrome, go to the View menu, Developer submenu, and choose "Allow JavaScript from Apple events". In Safari, it's under the Safari menu, Settings, Developer, "Allow JavaScript from Apple events".

Then you can do something like "Hey Claude, would you please use osascript to navigate the front Chrome tab to Hacker News". Once you suggest using osascript in a session it will figure out pretty quickly what it can do with it. Of course you can ask it to do casual things like open your mail app or whatever. Then you can figure out what other things will work, like "please click around my web app" or "check the JavaScript console for errors".

Another very important tip for using modern agents is to practice using speech to text. I think speaking might be something like five times faster than typing. It takes a lot of time to get used to, especially after a lifetime of programming by typing, but it's a very interesting and different experience, and once you have a lot of practice it starts to feel effortless.
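For reference, the kind of command an agent might end up running for the Hacker News request could look like the sketch below. The helper name `chrome_navigate_command` is made up for this example, and the AppleScript assumes Google Chrome is installed and its scripting dictionary is available; the code only builds the command and does not run it.

```python
def chrome_navigate_command(url: str) -> list[str]:
    # AppleScript that points the active tab of the front Chrome window at url.
    # Assumes url contains no double quotes, which would break the script.
    script = (
        'tell application "Google Chrome" to '
        f'set URL of active tab of front window to "{url}"'
    )
    # osascript -e runs the given string as a script.
    return ["osascript", "-e", script]


command = chrome_navigate_command("https://news.ycombinator.com")
```

On a Mac with the permissions described above granted, passing `command` to `subprocess.run` should perform the navigation.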

04 Apr 2026 1:31pm GMT

16 Mar 2026

feedPlanet Twisted

Donovan Preston: "Start Drag" and "Drop" to select text with macOS Voice Control

I have been using macOS Voice Control for about three years. At first it was a way to reduce pain from excessive computer use. It has been a real struggle. Decades of computer use habits with typing and the mouse are hard to overcome!

Text selection manipulation commands work quite well in native macOS apps, like apps written in Swift, or in Safari with an accessibly tagged webpage. However, many webpages and Electron apps (Visual Studio Code) have serious problems manipulating the selection: "select foo", where foo is a word in the text box to select, may not work at all, and there are off-by-one errors when manipulating the cursor position or extending the selection.

I only recently expanded my repertoire with the "start drag" and "drop" commands, previously having used "Click and hold mouse", "move cursor to x", and "release mouse". Well, now I have discovered that using "start drag x" and "drop x" makes a fantastic text selection method! This is really going to improve my speed.

In the long run, I believe computer voice control in general is going to end up being faster than WIMP, but for now the awkwardly rigid command phrasing and the number of times it misses or misunderstands commands still really hold it back. I've been learning the macOS Voice Control specific command set for years now and I still reach for the keyboard and mouse way too often.

16 Mar 2026 11:04am GMT