27 Jun 2026

feedPlanet Python

Go Deh: Sparse Ranges

Implementing sparse_range: From a Python Discuss Idea to a Sieve Stress Test

I found an interesting thread on the Python Discuss forum titled "Possibility to exclude ranges from range". The initial discussion revolved around a (possibly common) developer need: how to cleanly skip specific blocks of ints in a range of ints without writing clunky, nested if/continue logic, and without doing something memory-heavy like casting everything into a massive set.

The consensus was that Python's native range is beautiful because it's a lightweight, memory-efficient arithmetic engine. It doesn't store numbers in RAM; it just calculates them on the fly. But the moment you need to punch holes in it (say, "give me all numbers from 1 to 1000, except for multiples of 5, and except for the block from 200 to 300"), you lose that structural elegance.
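To make that example concrete, here is a minimal sketch of the hand-rolled approach using only built-in range membership tests (which are computed arithmetically for ints, so nothing is materialised). It assumes that "1 to 1000" and "200 to 300" are both inclusive; the names candidates, multiples_of_5, block and holes_punched are illustrative only and do not come from the thread.

```python
# Hand-rolled filtering with plain ranges; all bounds are assumed inclusive.
candidates = range(1, 1001)          # 1..1000
multiples_of_5 = range(5, 1001, 5)   # 5, 10, ..., 1000
block = range(200, 301)              # 200..300


def holes_punched():
    """Yield each candidate that falls in neither exclusion range."""
    for n in candidates:
        # `n in some_range` is an arithmetic check for ints, not a scan.
        if n in multiples_of_5 or n in block:
            continue
        yield n


result = list(holes_punched())
print(len(result))  # 720
```

It works, but every new hole means another hard-coded condition, which is the clunkiness the thread set out to avoid.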

This inspired me to build a clean, production-grade abstraction to solve this kind of thing: sparse_range.

The Architectural Concept

The core idea began with a simple definition: a sparse_range instance is created from a set of (Python) ranges S_in that hold possible output values and a set of ranges S_ex that exclude possible output values. The instance can then be called with start, stop, and step arguments, which generate trial candidates that are checked against S_in and S_ex. A candidate is eligible for the final output if its value is present in at least one of the permitted ranges in S_in, unless that value is also present in one of the exclusion ranges in S_ex. On top of that structure, the call behaves like a native Python range: it accepts its own outer start, stop, and step, matching the window requested by the user.
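The definition can be written down directly as a brute-force reference, which is handy for checking any faster implementation. This is only a sketch of the rule as stated above; is_allowed and naive_sparse are hypothetical names, not part of the module shown later.

```python
from typing import Iterator, List


def is_allowed(value: int, s_in: List[range], s_ex: List[range]) -> bool:
    """True if value is in at least one s_in range and in no s_ex range."""
    return any(value in r for r in s_in) and not any(value in r for r in s_ex)


def naive_sparse(start: int, stop: int, step: int,
                 s_in: List[range], s_ex: List[range]) -> Iterator[int]:
    """Reference generator: test every candidate of the outer window."""
    for candidate in range(start, stop, step):
        if is_allowed(candidate, s_in, s_ex):
            yield candidate


evens = [range(0, 100, 2)]
teens = [range(10, 20)]
print(list(naive_sparse(0, 30, 3, evens, teens)))    # [0, 6, 24]
print(list(naive_sparse(30, 0, -3, evens, teens)))   # [30, 24, 6]
```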

To keep the evaluation stateless and (hopefully) fast, the engine must evaluate values on the fly. However, native Python range objects are immutable; they carry no iteration state of their own, nor do they know how to coordinate with one another.

To bridge this gap, the engine internally upgrades every raw Python range passed to collections S_in and S_ex into custom RangePointers.

+----------------------------------------+
|     Outer Loop: start, stop, step      |
+----------------------------------------+
                    |
    [ Evaluates current candidate value ]
                    |
                    v
+--------------------------------------------------+
|    Is value in any S_in? AND NOT in any S_ex?    |
+--------------------------------------------------+
          |                            |
(Using forward-aligned        (Using backward-aligned
 RangePointers for step > 0)   RangePointers for step < 0)


Why We Need RangePointer and Direction Alignment

When the outer loop steps through values, the internal component ranges must follow along. If the outer loop is counting upwards (step > 0), all internal evaluation cursors must track upwards. If the outer loop reverses direction and counts downwards (step < 0), every single internal range must generate the same values, but in the reverse order, and without using up extra memory for storing values.

If they don't match directions, the leapfrogging logic breaks. A forward-running pointer cannot efficiently tell a backward-running outer loop where the next valid alignment boundary is without wasting CPU cycles scanning dead space.

Therefore, when a sparse_range instance is invoked, the engine inspects the outer step direction and enforces a unified traversal direction across every single internal S_in and S_ex sequence. If an internal range's native direction opposes the outer loop, it must be completely mathematically inverted.

The Range Reversal Formula

Reversing an arithmetic progression isn't as simple as swapping the start and stop boundaries. Because Python ranges stop before reaching the termination boundary, reversing a range requires recalculating its new alignment based on its exact step constraints.

To reverse a range mathematically, the engine computes:

  1. New Step: The step sign is flipped (-step).

  2. New Start: The final valid item generated by the original range becomes the new starting point. This is computed from the length of the sequence:

    new_start = start + (length - 1) × step
  3. New Stop: The original start boundary shifts out by one step inverted to act as the non-inclusive terminal wall:

    new_stop = start - step
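As a quick check of the three formulas, the sketch below builds the reversed range and compares it with Python's built-in reversed(), which already iterates a range backwards lazily. The helper name reverse_range is an assumption for illustration; it is not part of the module.

```python
def reverse_range(r: range) -> range:
    """Return a range yielding the same values as r in the opposite order."""
    length = len(r)
    new_start = r.start + (length - 1) * r.step   # last value r produces
    new_stop = r.start - r.step                   # one step beyond r.start
    return range(new_start, new_stop, -r.step)


# Forward, backward and empty ranges all agree with reversed().
for r in (range(0, 10, 3), range(10, 0, -4), range(5, 5)):
    assert list(reverse_range(r)) == list(reversed(r))

print(reverse_range(range(0, 10, 3)))  # range(9, -3, -3)
```

Note that an empty range needs no special case: with length 0 the new start and new stop coincide, so the result is empty too.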

By normalizing all RangePointer instances to track in the same direction as the outer loop, the engine can align each range to the current candidate with O(1) arithmetic, so checking one candidate costs O(K) time (where K is the number of active ranges). No candidate values are ever stored, so the extra memory is just the K pointers themselves.
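The O(1) alignment step is a ceiling division. The sketch below shows it for the simplest case only, a positive-step range being scanned upwards; first_at_or_after is a hypothetical helper, and the module's own align_to_or_past method handles both directions.

```python
from typing import Optional


def first_at_or_after(r: range, target: int) -> Optional[int]:
    """Smallest member of r that is >= target, or None if there is none.

    Simplified sketch: only ranges with a positive step are supported.
    """
    if r.step <= 0:
        raise ValueError("this sketch only handles positive steps")
    if len(r) == 0:
        return None
    if target <= r.start:
        return r.start
    # Ceiling division: number of whole steps needed to reach the target.
    k = -(-(target - r.start) // r.step)
    candidate = r.start + k * r.step
    return candidate if candidate <= r[-1] else None


r = range(10, 50, 7)                 # 10, 17, 24, 31, 38, 45
print(first_at_or_after(r, 20))      # 24
print(first_at_or_after(r, 46))      # None
```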

The code

sparse_range.py

#!/usr/bin/env python3
# Author: Donald "Paddy" McCarthy. paddy3118@gmail.com
# 27/06/2026

#%%
"""
sparse_range - Reusable Sparse Integer Generators
===================================================

This module provides a configured factory class `sparse_range`. You define your
global permitted (`s_in`) and forbidden (`s_ex`) integer lists once, and the
resulting object can be called repeatedly with customized window configurations
exactly like Python's built-in `range()` function.

Usage
-----
>>> from sparse_range import sparse_range
>>> # Configure the template rules
>>> my_filter = sparse_range(s_in=[range(0, 100, 2)], s_ex=[range(10, 20, 3)])
>>>
>>> # Call it like the built-in range function
>>> sequence = my_filter(0, 30, 3)
>>> list(sequence)
[0, 6, 12, 18, 24]

Command Line Interface
----------------------
Run this script directly to run differential limits tests or inspect help files:
python sparse_range.py
"""

import sys
from typing import List, Optional, Iterator, Set, Dict

# Explicitly define module exports
__all__ = ['sparse_range']


class RangePointer:
    """
    Manages a single sub-range's mathematical cursor.
    Dynamically aligns itself relative to the iteration direction of the caller.
    """
    def __init__(self, r: range) -> None:
        """Initializes structural constraints of a single tracking range cursor."""
        self.start: int = r.start
        self.stop: int = r.stop
        self.step: int = r.step
        self.length: int = len(r)
        # Calculate the actual last element generated by this range
        if self.length > 0:
            self.last: Optional[int] = r.start + (self.length - 1) * r.step
        else:
            self.last = None

    def align_to_or_past(self, target: int, direction: int) -> Optional[int]:
        """
        Leapfrogs the cursor instantly to the closest valid element
        aligned with the runtime's traversal direction.
        """
        if self.length == 0 or self.last is None:
            return None

        # 1. Check if the target is entirely out of bounds for this sub-range
        low, high = min(self.start, self.last), max(self.start, self.last)
        if target < low:
            # The nearest element ahead of an upward scan is the lowest one
            return low if direction > 0 else None
        if target > high:
            # The nearest element ahead of a downward scan is the highest one
            return None if direction > 0 else high

        # 2. Project target onto the arithmetic sequence of this range
        distance = target - self.start
        if self.step > 0:
            k = (distance + self.step - 1) // self.step if distance > 0 else distance // self.step
        else:
            k = distance // self.step if distance > 0 else (distance + self.step + 1) // self.step
        candidate = self.start + k * self.step

        # 3. Sweep forward/backward based on global runtime direction.
        # Use the step's magnitude: adding a negative step while scanning
        # upwards would move away from the target and never terminate.
        stride = abs(self.step)
        if direction > 0:
            while candidate < target:
                candidate += stride
            if low <= candidate <= high:
                return candidate
        else:
            while candidate > target:
                candidate -= stride
            if low <= candidate <= high:
                return candidate
        return None


class SparseRangeIterable:
    """The execution runner returned when a configured sparse_range is called."""
    def __init__(self, start: int, stop: int, step: int, s_in: List[range], s_ex: List[range]) -> None:
        """Prepares runtime matrices for specific window iterations."""
        if step == 0:
            raise ValueError("sparse_range loop step argument must not be zero")
        self.start: int = start
        self.stop: int = stop
        self.step: int = step
        self.direction: int = 1 if step > 0 else -1
        self.in_ranges: List[RangePointer] = [RangePointer(r) for r in s_in if len(r) > 0]
        self.ex_ranges: List[RangePointer] = [RangePointer(r) for r in s_ex if len(r) > 0]

    def __iter__(self) -> Iterator[int]:
        """Iterates through matching sparse elements using double-pointer acceleration."""
        target = self.start
        stop = self.stop
        step = self.step
        dir_mask = self.direction

        while (dir_mask > 0 and target < stop) or (dir_mask < 0 and target > stop):
            # Align all input sub-ranges and collect their active values
            active_in: Dict[RangePointer, int] = {}
            for p in self.in_ranges:
                val = p.align_to_or_past(target, dir_mask)
                if val is not None:
                    active_in[p] = val

            if not active_in:
                break

            # Align all exclusion sub-ranges
            active_ex: Dict[RangePointer, int] = {}
            for p in self.ex_ranges:
                val = p.align_to_or_past(target, dir_mask)
                if val is not None:
                    active_ex[p] = val

            # Dynamic Execution Strategy Optimization Lookup
            is_included = False
            if len(active_in) <= len(active_ex):
                if any(val == target for val in active_in.values()):
                    if not any(val == target for val in active_ex.values()):
                        is_included = True
            else:
                if not any(val == target for val in active_ex.values()):
                    if any(val == target for val in active_in.values()):
                        is_included = True

            if is_included:
                yield target
            target += step


class sparse_range:
    """
    A factory pattern configuration setup that caches structural filter limits.
    Instances can be directly invoked exactly like the standard range() construct.
    """
    def __init__(self, s_in: Optional[List[range]] = None, s_ex: Optional[List[range]] = None) -> None:
        """Instantiates the filtering template for subsequent executions."""
        self._s_in_raw: List[range] = s_in or []
        self._s_ex_raw: List[range] = s_ex or []

    def __repr__(self) -> str:
        """Returns structural serialization details for troubleshooting."""
        return f"sparse_range(s_in={self._s_in_raw}, s_ex={self._s_ex_raw})"

    def __call__(self, start: int, stop: Optional[int] = None, step: int = 1) -> SparseRangeIterable:
        """Mimics the signature properties of Python's default range() function."""
        if stop is None:
            stop = start
            start = 0
        return SparseRangeIterable(start, stop, step, self._s_in_raw, self._s_ex_raw)


def _expensive_sparse_range(start: int, stop: int, step: int, s_in: List[range], s_ex: List[range]) -> List[int]:
    """Reference high-memory execution harness to generate baseline validation benchmarks."""
    allowed: Set[int] = set()
    for r in (s_in or []):
        allowed.update(r)
    for r in (s_ex or []):
        allowed.difference_update(r)
    target = start
    direction = 1 if step > 0 else -1
    result: List[int] = []
    while (direction > 0 and target < stop) or (direction < 0 and target > stop):
        if target in allowed:
            result.append(target)
        target += step
    return result


def run_tests() -> None:
    """Runs structural verification configurations spanning cross-boundary constraints."""
    print("Running Factory Pattern Limits Verification Suite (-1000 to +1000)...\n")
    # 1. Main dynamic test setup
    filter_config = sparse_range(
        s_in=[range(-500, 500, 3)],
        s_ex=[range(-50, 50, 2), range(200, 300, 5)]
    )
    test_windows = [
        ("Standard positive crawl", (0, 200, 2)),
        ("Negative direction crawl", (400, -400, -5)),
        ("Large step slice jump", (-1000, 1000, 25)),
        ("Completely out of bounds window", (600, 900, 1))
    ]
    # 2. Explicit Empty Input Edge Case Verification Setup
    empty_filter_config = sparse_range(s_in=[range(0, 0), range(10, 5)], s_ex=[range(0, 100)])
    test_windows.append(("Empty S_in Edge Case", (0, 50, 1)))
    passed = 0
    for name, args in test_windows:
        start, stop, step = args
        # Check which config to evaluate against
        cfg = empty_filter_config if "Empty" in name else filter_config
        actual_output = list(cfg(start, stop, step))
        expected_output = _expensive_sparse_range(start, stop, step, cfg._s_in_raw, cfg._s_ex_raw)
        if actual_output == expected_output:
            print(f"✅ PASS: {name:<35} -> args: ({start}, {stop}, {step})")
            passed += 1
        else:
            print(f"❌ FAIL: {name:<35}")
            print(f" Expected: {expected_output}")
            print(f" Got: {actual_output}")
    print(f"\nTest Summary: {passed}/{len(test_windows)} passed successfully.")


if __name__ == "__main__":
    args = sys.argv[1:]
    if not args or "--test" in args:
        run_tests()
    elif "--help" in args or "-h" in args:
        print("""sparse_range Factory Engine\n---------------------------""")
    elif "--doc" in args:
        print(__doc__.strip())


A run of included tests

Running Factory Pattern Limits Verification Suite (-1000 to +1000)...

✅ PASS: Standard positive crawl -> args: (0, 200, 2)
✅ PASS: Negative direction crawl -> args: (400, -400, -5)
✅ PASS: Large step slice jump -> args: (-1000, 1000, 25)
✅ PASS: Completely out of bounds window -> args: (600, 900, 1)
✅ PASS: Empty S_in Edge Case -> args: (0, 50, 1)

Test Summary: 5/5 passed successfully.

The Ultimate Stress Test: A Declarative Sieve

It's easy to write a basic range filter that skips a static block of numbers. But to prove that this RangePointer inversion and alignment math is completely airtight across dozens of concurrent boundaries, I decided to implement a completely declarative Sieve of Eratosthenes.

Instead of allocating a large mutable array of booleans in memory and procedurally striking out composites, we can use pure sequence logic:

  1. The Permitted Range S_in: Our baseline pool of candidate ints starting at 2: [range(2, limit + 1)].

  2. The Exclusion ranges S_ex: A list of independent arithmetic progression ranges tracking the composites for every factor m up to sqrt(limit).

Each exclusion range/stream starts at m × m (instead of m × 2). Any composite smaller than m × m has a prime factor smaller than m, meaning it has already been intercepted by an earlier stream. For example, by the time m=5 runs, numbers like 10, 15, and 20 are already masked out by the ranges/streams for 2 and 3. The first unseen composite is 5 × 5 = 25.
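A small brute-force sketch makes the two lists tangible for limit = 30. It uses plain membership tests rather than the sparse_range engine, and math.isqrt in place of a floating-point square root; both are substitutions made for this illustration.

```python
from math import isqrt

limit = 30
s_in = [range(2, limit + 1)]
# One exclusion stream per factor m, each starting at m * m.
s_ex = [range(m * m, limit + 1, m) for m in range(2, isqrt(limit) + 1)]

primes = [
    n for n in s_in[0]
    if not any(n in stream for stream in s_ex)
]

print(s_ex[-1], list(s_ex[-1]))  # range(25, 31, 5) [25, 30]
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The stream for m = 5 only has to contribute 25; its other member, 30, is already covered by the streams for 2 and 3.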

Putting it to the Test

Although the module sparse_range.py has its own tests, to further verify the integrity of the design I put together a test module called sparse_range_sieve_test.py. It packages the sieve creation into an isolated factory function and validates the output against a traditional, trusted array sieve across both a forward sweep and a reversed backward pass.

sparse_range_sieve_test.py
#!/usr/bin/env python3
# Author: Donald "Paddy" McCarthy. paddy3118@gmail.com
# 27/06/2026

#%%
"""
sparse_range_sieve_test - Prime Generation Performance Test Harness
===================================================================

This module verifies the mathematical integrity of the `sparse_range` tracking
engine by utilizing it to implement a declarative Sieve of Eratosthenes.
It tests overlapping cross-boundary constraints by instantiating dozens of
simultaneous exclusion ranges running with diverse step intervals.
"""

from typing import List
from sparse_range import sparse_range

def traditional_sieve(n: int) -> List[int]:
    """Generates prime numbers up to N using a standard boolean array sieve."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n**0.5) + 1):
        if sieve[p]:
            for i in range(p * p, n + 1, p):
                sieve[i] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def build_prime_sieve(limit: int) -> sparse_range:
    """
    Constructs a sparse_range configured as a declarative Sieve of Eratosthenes.

    How S_in and S_ex constitute a "sieved" range:
    ----------------------------------------------
    Instead of tracking a mutable array of booleans in memory and crossing off
    composites procedurally, this factory models the Sieve of Eratosthenes
    declaratively using pure sequence logic.

    1. The Permitted Range (S_in):
       We start with the baseline domain of all potential prime candidates. Since
       1 is not prime, our domain is a single continuous range starting at 2:
           S_in = [range(2, limit + 1)]

    2. The Exclusion Streams (S_ex):
       To eliminate non-primes, we generate an independent exclusion range for
       every integer 'm' from 2 up to sqrt(limit). Each range tracks the periodic
       sequence of composites generated by that factor.
       - Step factor: The range increments by 'm' (e.g., for m=3, it targets
         every 3rd number).
       - Optimization (m * m): Rather than starting at m * 2, each exclusion stream
         starts at m * m. Any composite number smaller than m * m must possess a
         prime factor smaller than 'm', meaning it is already guaranteed to be
         intercepted by a previous exclusion stream.

    When the resulting factory is iterated, the sparse_range engine dynamically
    clashes these arithmetic progressions together, masking out composites instantly
    without storing any tables in RAM.
    """
    # Baseline candidate space
    s_in: List[range] = [range(2, limit + 1)]
    # Composite elimination progressions
    s_ex: List[range] = [
        range(m * m, limit + 1, m)
        for m in range(2, int((limit + 1)**0.5) + 1)
    ]
    return sparse_range(s_in=s_in, s_ex=s_ex)


def run_sieve_tests(limit: int = 200) -> None:
    """Executes forward and backward verification passes against traditional algorithms."""
    print(f"Beginning Sparse Range Sieve Evaluation Suite (N = {limit})...")
    print("----------------------------------------------------------------")

    # 1. Generate baseline ground-truth primes
    expected_forward: List[int] = traditional_sieve(limit)
    expected_backward: List[int] = list(reversed(expected_forward))

    # 2. Build the sieve factory via isolated generator function
    prime_sieve_factory = build_prime_sieve(limit)

    # 3. Test Forward Generation
    print("👉 Running Forward Traversal Validation...")
    actual_forward: List[int] = list(prime_sieve_factory(2, limit + 1, 1))
    if actual_forward == expected_forward:
        print(f" ✅ PASS: Forward matching. Found {len(actual_forward)} primes.")
    else:
        print(" ❌ FAIL: Forward mismatch.")
        print(f" Expected: {expected_forward}")
        print(f" Got: {actual_forward}")

    # 4. Test Backward Generation
    print("👉 Running Backward Traversal Validation...")
    actual_backward: List[int] = list(prime_sieve_factory(limit, 1, -1))
    if actual_backward == expected_backward:
        print(f" ✅ PASS: Backward matching. Reversed sequences match perfectly.")
    else:
        print(" ❌ FAIL: Backward mismatch.")
        print(f" Expected: {expected_backward}")
        print(f" Got: {actual_backward}")


if __name__ == "__main__":
    # Test with a limit of 200 elements to keep visualization digestible
    run_sieve_tests(limit=200)

The Verdict

When running forward, the generator's internal pointers march sequentially alongside the main consumer loop.

The true architectural challenge happens during the backward traversal pass (step = -1). When executing prime_sieve_factory(limit, 1, -1), the engine must manage 13 independent composite exclusion ranges simultaneously (for m = 2, 3, ..., 14). Each sub-range pointer must instantly compute its upper boundary, snap precisely to its maximum valid multiple less than or equal to 200, and step backward in perfect synchronization without dropping or skipping values.
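The figures in that paragraph can be checked with a few lines. This sketch rebuilds the exclusion list with the same expression used in build_prime_sieve and reads off each stream's highest member with r[-1]; the names s_ex and tops are illustrative only.

```python
limit = 200
# Same construction as build_prime_sieve in the test module above.
s_ex = [
    range(m * m, limit + 1, m)
    for m in range(2, int((limit + 1) ** 0.5) + 1)
]

print(len(s_ex))                 # 13
print([r.step for r in s_ex])    # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

# Highest member of each stream: where a backward pass has to start.
tops = {r.step: r[-1] for r in s_ex}
print(tops[7], tops[13])         # 196 195
```

Indexing a range with [-1] is itself an arithmetic operation, so finding these starting points costs no iteration.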

The execution checks out perfectly:


Beginning Sparse Range Sieve Evaluation Suite (N = 200)...
----------------------------------------------------------------
👉 Running Forward Traversal Validation...
   ✅ PASS: Forward matching. Found 46 primes.
👉 Running Backward Traversal Validation...
   ✅ PASS: Backward matching. Reversed sequences match perfectly.

If you are following the Python Discuss thread, this demonstrates that handling arbitrary sequence exclusions cleanly and statelessly via pure index-pointer stream math isn't just an elegant idea; it's highly reliable even under heavy algorithmic stress.

Disclosure

I used Gemini AI to help with this. I was able to state algorithms and get Gemini to fill in the implementations, then state changes and get Gemini to implement those too. I am used to creating algorithms without AI; this time I was able to state what I wanted in some detail and have Gemini add yet more detail. I could show Python, pseudo-code, or textual descriptions as needed and get Gemini to fill in. The tests went from my description to code by Gemini. The sieve example test idea, spec, and debugging were by me, with Gemini improving my starting point for each m from m*2 to m*m, for example, before implementing as told.

27 Jun 2026 4:19pm GMT

26 Jun 2026

feedPlanet Python

Talk Python to Me: #553: All of our tools

This episode is a fun crossover from our Python news and tips podcast, Python Bytes. We have had some big changes over there. Brian Okken has moved on and Calvin Hendryx-Parker has joined the show as the new co-host. To kick off this new era, we decided to do a longer and more personal episode called "All Our Tools". The idea is both of us talk about some of our most useful day-to-day developer and business owner tools that we think you all would find useful. It was so well received, that I'm bringing it to you all as a crossover episode. Enjoy and we hope you find something new and awesome to help you with your software and data science day to day.

Episode sponsors

Sentry Error Monitoring, Code talkpython26: https://talkpython.fm/sentry
Python in Production: https://talkpython.fm/devopsbook
Talk Python Courses: https://talkpython.fm/training

Links from the show

@calvinhp@sixfeetup.social: https://sixfeetup.social/@calvin?featured_on=talkpython
@calvinhp.com: https://bsky.app/profile/calvinhp.com?featured_on=talkpython
calvinhp.com: https://calvinhp.com?featured_on=talkpython

Original airing on Python Bytes: https://pythonbytes.fm/episodes/show/484/all-our-tools?featured_on=talkpython

pi: https://pi.dev/?featured_on=talkpython
superpowers: https://github.com/obra/superpowers/tree/main?featured_on=talkpython
Warp.dev: http://Warp.dev?featured_on=talkpython
OhMyZSH: https://ohmyz.sh/?featured_on=talkpython
Commandbookapp.com: http://Commandbookapp.com?featured_on=talkpython
Blink: https://blink.sh/?featured_on=talkpython
kitty: https://sw.kovidgoyal.net/kitty/?featured_on=talkpython
mosh: https://mosh.org/?featured_on=talkpython
tmux: https://github.com/tmux/tmux?featured_on=talkpython
Claude code: https://www.anthropic.com/product/claude-code?featured_on=talkpython
Claude.md: http://Claude.md?featured_on=talkpython
MacWhisper: https://goodsnooze.gumroad.com/l/macwhisper?featured_on=talkpython
Handy: https://handy.computer?featured_on=talkpython
Tailscale: https://tailscale.com/?featured_on=talkpython
Talk Python episode with Alex: https://talkpython.fm/episodes/show/546/self-hosting-apps-for-python-people
Telescopo: https://www.telescopo.app?featured_on=talkpython
Typora markdown: https://typora.io/?featured_on=talkpython
formal documentation for many of my open source packages: https://mkennedy.codes/docs/?featured_on=talkpython
Great Docs: https://posit-dev.github.io/great-docs/?featured_on=talkpython
Statement on the US government directive to suspend access to Fable 5 and Mythos 5: https://www.anthropic.com/news/fable-mythos-access?featured_on=talkpython
No second date: https://x.com/pr0grammerhum0r/status/2063078450311598430?s=12&featured_on=talkpython

Watch this episode on YouTube: https://www.youtube.com/watch?v=wgKF3yvpxPU
Episode #553 deep-dive: https://talkpython.fm/episodes/show/553/all-of-our-tools#takeaways-anchor
Episode transcripts: https://talkpython.fm/episodes/transcript/553/all-of-our-tools

Theme Song: Developer Rap
🥁 Served in a Flask 🎸: https://talkpython.fm/flasksong

---== Don't be a stranger ==---
YouTube: youtube.com/@talkpython (https://talkpython.fm/youtube)

Bluesky: @talkpython.fm (https://bsky.app/profile/talkpython.fm)
Mastodon: @talkpython@fosstodon.org (https://fosstodon.org/web/@talkpython)
X.com: @talkpython (https://x.com/talkpython)

Michael on Bluesky: @mkennedy.codes (https://bsky.app/profile/mkennedy.codes?featured_on=talkpython)
Michael on Mastodon: @mkennedy@fosstodon.org (https://fosstodon.org/web/@mkennedy)
Michael on X.com: @mkennedy (https://x.com/mkennedy?featured_on=talkpython)

26 Jun 2026 10:32pm GMT

The No Title® Tech Blog: Just updated - Optimize Images v2.1.0

Optimize Images 2.1.0 brings native WebP support, a generalized format-conversion system, on-demand image inspection with EXIF reporting, and a new in-memory API for working with image bytes directly. It is a focused, fully backwards-compatible step forward for the command-line tool and its public API.

26 Jun 2026 9:40pm GMT

feedDjango community aggregator: Community blog posts

Open Source Comes From People

I recently attended my first PG Data 2026 conference where keynote speaker Robert Haas delivered a talk that has stayed with me. His keynote focused on the people behind PostgreSQL, the growing challenges of sustaining open-source communities, and the urgent need to cultivate new contributors through mentorship and community engagement. While his remarks centered on PostgreSQL, they sparked broader reflections for me about the future of open source and communities like Django.

26 Jun 2026 7:00pm GMT

Issue 343: Django 6.1 beta 1 released

News

Django 6.1 beta 1 released

Django 6.1 beta 1 is now available, giving the community a chance to test upcoming features and improvements before the final release on August 5.

Djangonaut Space: Launching Contributors

Djangonaut Space shares the results from its first six mentorship sessions, showing how an 8-week cohort program helped launch 104 contributors from 40+ countries into long-term open source participation and leadership.


Django Software Foundation

How the Django Software Foundation Became a CNA

Learn how the Django Software Foundation became a CVE Numbering Authority, giving it the ability to assign CVE IDs directly and streamline Django's security advisory process.


Wagtail CMS News

Wagtail as Django admin on steroids

Think Wagtail is just a CMS? See why it can serve as a polished, modern replacement for Django's admin with a familiar API and powerful features that make client-facing backends shine.

Comparing open weight AI models and providers

Open weight AI models are closing the gap with proprietary LLMs, and this guide explains how to compare models and providers on performance, cost, energy use, and transparency.


Releases

Python 3.15.0 beta 3 is here!

Python 3.15 beta 3 is out with nearly 200 bug fixes plus major additions like lazy imports, frozendict, sentinel objects, a faster JIT, and UTF-8 as the default encoding.


Updates to Django

Today, "Updates to Django" is presented by Raffaella from Djangonaut Space! 🚀

Last week we had 24 pull requests merged into Django by 16 different contributors - including 2 first-time contributors! Congratulations to Margaret Fero and diaxoaine for having their first commits merged into Django - welcome on board!


Articles

Teach your linter your own rules

boa-restrictor is a Python/Django linter that now lets you register your own AST-based rule classes via pyproject.toml to enforce project-specific conventions. This is especially useful as a deterministic guardrail for keeping AI coding agents from repeating unwanted patterns.

Why I wrote PEP 832 -- virtual environment discovery

PEP 832 proposes a standard way for editors and AI tools to discover Python virtual environments, aiming to make project setup smoother regardless of your workflow tool.

Supporting Django's Next Chapter

Caktus Group has become a founding sponsor of the Django Software Foundation's new Executive Director position, investing in Django's long term sustainability and encouraging other companies to do the same.

Mitigated API authentication bypass for python.org download metadata

Python.org has disclosed and mitigated an authentication bypass that could have altered download metadata, with no evidence of exploitation after extensive audits and additional security hardening.

How I Architected Automatic Parking Detection in Django - Bluetooth Disconnects, Geofence Events, and a Strict State Machine

A deep dive into building a reliable Django parking detection system using Bluetooth events, geofencing, state machines, and optimistic locking to safely handle concurrency.

What I learned from two days of hanging out with AI experts

Five practical takeaways from an AI conference suggest the future belongs to model agnosticism, measurable ROI, and smaller open models instead of hype.


Videos

Learning Python in the Age of AI

In this short interview from PyCon US, Sheena O'Connell discusses one of the biggest questions facing developers today: how should people learn Python in the age of AI?

Paolo Melchiorre on AI-Assisted Development

Another PyCon US 2026 chat, this time with Paolo Melchiorre talking about Django, AI-assisted development, open-source maintainership, and how the Python community is adapting to AI.


Django Forum

Django 6.1 release - timeline and next steps

Notes and updates from Fellow Jacob Walls on the 6.1 release process.

Adding database backend methods to get hardcoded or nonexistent primary key values for tests

From Tim Graham, surfacing ticket #37175 "to see what our creative community can suggest."


Django Fellow Reports

Jacob Walls

Tended to a flurry of fixes before the non-release-blocker bugfix freeze for Django 6.1 in a few days. Also chipped away at some performance improvements for ASGI projects using sync middleware.

Natalia Bidart

Lots of preparation for the upcoming 6.1 βeta, with the goal of stabilizing recent changes and ensuring overall readiness 🚀. I also spent time digging into Django's async behavior, reviewing recent changes and following through on related optimizations and documentation updates 📒. I also looked more closely at packaging and reproducibility, especially around artifact builds, to improve our consistency in the release process 📦.


Django Job Board

Senior Python/Django Developer at Gryps

Founding ML/Data Scientist (Remote, UK) at MyDataValue


Projects

vintasoftware/django-ai-boost

An MCP server for Django applications, inspired by Laravel Boost.

Archmonger/ServeStatic

Production-grade Python static file server. Run as middleware or standalone.

26 Jun 2026 3:00pm GMT

24 Jun 2026

feedDjango community aggregator: Community blog posts

Supporting Django's Next Chapter

The path to hiring an Executive Director gained real momentum at DjangoCon US 2024, when Jacob Kaplan-Moss shared a vision for what dedicated resources could mean for the future of Django. In his blog post If We Had $1,000,000, he invited companies and supporters to help get the initiative off the ground. The response from the community was inspiring, and we're proud to see that vision become reality.

24 Jun 2026 7:00pm GMT

23 Jun 2026

feedPlanet Twisted

Glyph Lefkowitz: Adversarial Communication

As I have discussed in previous posts, "AIs" can make mistakes. In fact, they do make mistakes, and their mistake-making patterns are such that where and how they will make mistakes is both uncertain and constantly changing.

Thus, in any scenario where you want to attempt to make "productive" use of "AI", you must have a system in place for checking every result. Not checking some results; checking every result. If each result might have a consequence for you (and if it didn't have a consequence, why bother automating it?) and you cannot predict in advance which kinds of results will need verification, then verification is always required.

The verification often ends up being just as expensive as doing the work in the first place, which means that if you want your usage of "AI" to be personally profitable, you have to find someone else to externalize the cost of verification onto. This person becomes your adversary, and, if you are successful, your "AI's" victim.

The Ladder-Climber And Their Reverse-Centaur Rungs

One way that this constellation of facts can straightforwardly assemble themselves into a dystopian nightmare is the phenomenon, described by Cory Doctorow, of the reverse centaur. This is when your employer non-consensually turns you into the verification system. The "AI" does the fun part of initially performing the work, and then you do the boring part where you check if the robot is right and clean up its messes, even if everyone already knows that it would, in aggregate, be cheaper for you to do the work in the first place.

Reverse centaurs can be made from any automation, not only "AI" automation. I think that there is a reason that this term happens to have emerged in the "age of AI", though, and not with earlier automation technologies (even those which were considerably more viscerally horrific). That reason is: the wrongness of "AI" output is not merely a technical feature that must be compensated for, it is a generalized externality.

As I mentioned above, if you are responsible for the entirety of the work, both extruding the "AI" output and checking it, it's usually cheaper to have humans do the entirety of the work to begin with. When humans do the writing directly, we can check as we go, and thus verification doesn't need to be as comprehensive.

When "AI" coding advocates say "code review is the bottleneck", what they are observing is that the LLM is still rolling the dice for each PR, and a human is still necessary to verify that each of those rolls is a winner. But calling this process "code review" is a bit of a misnomer; it's not really "code review" in the traditional sense, it's human understanding.

Before the advent of "AI", the human understanding was implicit in the process of writing the code in the first place[1], and the code review was a way of diffusing and extending that understanding. Now that the code can be authored with no initial understanding taking place, that cost has not gone away, it has moved.

Human understanding was always the bottleneck.

However, this is taking a collaborative view of a software project, where satisfying the needs and solving the problems of your customers are the goals. We can see that "AI" is a bad tool to satisfy those goals, because all it's doing is converting the first half of the work, that of understanding the code as you write it, to understanding the agent's output as you read it.

What if, instead, we were to take the view that every software company is a Hobbesian nightmare, red in tooth and claw? In this view, the only goal of a software project is for the individual developers to make their promo cycles and get their bonuses. Given that there is only a certain amount of money to go around, this is a zero-sum game where each programmer wants to look more productive than their colleagues.

Pretty much every organization finds it easy to reward "productivity" as expressed by lines of code emitted, but finds the benefits of doing thorough and thoughtful design, analysis, and code review very difficult to reward. In this world, an LLM is an invaluable tool for the sociopathic ladder-climber, particularly if your legacy organization is still structuring their workflows as if the person prompting the bot is "writing" the code, and then they get to foist off the act of "reviewing" the code onto someone else.

Here, the prompter effectively externalizes the cost of the LLM's failures but internalizes any benefits. The prompter will vibe-code a big feature, so large that the assigned reviewer can't possibly comprehend it all effectively. When this happens, the reviewer will, eventually, be pressured to approve it, even if they can try to spot a few problems along the way. The reviewer has their own work to get back to, after all; the obligation to review the prompter's (read: the bot's) code is a drain on their time that they are not going to get rewarded for.

If this feature is a big success, the prompter gets a promotion. If it causes a big issue, well, the reviewer must not have been careful enough.

This is why LLMs are "good for coding", and also why their biggest promoters keep having outages.

The Generative Gish Galloper

Coding is the biggest "success story" of this type of adversarial communication, but it is by far not the only instance of such a thing. LLMs create a new form of leverage that can turn Brandolini's law from a linear advantage into an exponential one. If you are engaged in a political debate where you want to overwhelm the other side in nonsense, an LLM can generate bullshit faster than it is physically possible for a human being to type, let alone respond thoughtfully. There is an asymmetry to the utility of this weapon as well: only one side of the political spectrum wants to flood the zone and destroy trust in institutions and the concept of truth. There's a good reason that the fascists love it.

Straightforward Spam and Fraud

This is kind of obvious, but LLMs can generate lightly-customized, plausible-looking text much more quickly than any human being. This facilitates their use in fraud, spam, and scams. In a spamming or fraudulent interaction, once again, the costs are externalized onto the victim: the recipient of a spam message has to do all the work of "checking" the LLM's output. Spammers already expect very low hit rates from boilerplate, and if the LLM can increase those percentages from 1% to 5% the technology will pay for itself; they don't need anything like reliable accuracy.

Customer "Support"

If you have any kind of commercial relationship with a company, I probably don't even need to mention this: customer "support" bots are a misery. Everybody knows it at this point. But customer support is usually conceptualized by businesses as an adversarial interaction, because it is a cost center. They maintain internal metrics on time-to-resolution and try to optimize them. Implicitly, this creates a dynamic where the goal of the customer service agent's job is not to solve your problem, but to emit noise that will cause you to think your problem is resolved, or to give up, as fast as possible. Unsurprisingly, LLMs can emit this noise faster than humans can, getting those customers off the phone. But those customers will remember those interactions, and the story outside the TTR metrics is horrible.

Similarly to the situation in software development, LLMs can look very good on paper for customer support, but mostly what they are doing is illuminating the problems with the industry's existing metrics, by turning "winning the metrics battle against the customer" into a more obvious and immediate defeat for the company's long term reputation.

"Education"

In 2026 it is sadly a fact of life that students cheat all the time using "AI", and that this cheating is very successful, in that the teachers find it very hard to detect.

LLMs are great for cheating on schoolwork because the student is externalizing the work of the checking onto the teachers, who are often starting at a disadvantage to begin with, at least in the US.

My view is that this is happening because of a divergence in the way that students vs. teachers (or, more accurately, "the broader educational system") view grading.

When a student is asked to write an essay, the teachers see the effort as both intrinsically worthwhile for the student, as well as useful as a pedagogical tool to evaluate and react to the student's progress. The student, by contrast, sees a stumbling block designed to knock them off the path to success and into a permanent underclass. It is no wonder that the student sees "AI" as useful to their own goals and has no compunction about deploying it.

There is a bitter irony that the ability to understand the inherent value of actually writing the essay on their own is the sort of thing that students can really only learn by writing a bunch of essays. There's no way that I can think of which makes the benefit legible as long as a shortcut is available.

The net effect here is a downward spiral, where the already-wobbling educational system is sustaining an attack that it doesn't have the resources to recover from. The individual students' attacks against their teachers and their schools' grading systems might appear to momentarily succeed, but they will win the battle and lose the war.

Spamming "For Good"?

Usually when we talk about someone unilaterally choosing to enter into an adversarial relationship, that's an "attack" and for good reasons we have a negative impression of the attacker. However, I would be remiss if I did not point out that there are some cases where the relationship was already adversarial; just because you're the attacker doesn't mean that you are evil.

For example, we might imagine use-cases like automatically filing appeals for prior authorizations against health insurance. It's relatively well-known at this point that the main way for-profit insurers maintain their margins is by denying claims right up to the line of the policies themselves being fraud, so using a spamming tool to fight them might be entirely justifiable[2] in that case.

Similarly, using an LLM could be justified in a fight against a company refusing to honor a warranty. One could imagine using an LLM to immediately generate replies and escalations.

However, even in imagined cases like these, the underlying problem is that the insurers and the vendors already have a tremendous amount of structural power, so it is more likely that they will have the advantage in deploying a communications weapon like an LLM, as well as enacting policies to simply ignore any LLM-based communication that you might submit. Worse, if these strategies were to become widespread, they might provide an excuse to reject any communications by feeding them into an unreliable "LLM detector" and issuing an automated "computer says no" even to hand-written correspondence.

It is also worth stressing that these cases are imagined, as compared to the very real coworker-abuse, spam, scam, fraud, and disinformation campaigns being waged in real life today.

Therefore, while legitimate uses might exist, it's hard to imagine that there's anywhere they would be genuinely valuable and sustainable. In the best case, "AI" will provide a temporary advantage for underdogs that will provoke an arms race which the resource-advantaged adversaries will win in the long run; in the worst case, the arms race itself will cement permanent structural change that will make things worse.

"Search" By Stealing

Most of the adversarial utility of "AI" is on the "write" side, since write-amplification is more obviously aggressive than reading. But the "read" side of LLMs - summarization and question-answering - can be a form of attack as well.

To begin with, the act of reading itself is currently enormously destructive, but that's arguably not a fundamental aspect of this technology. They could set reasonable rate-limits and respect things like robots.txt, as search engines have for decades now. They could also refrain from committing criminal levels of copyright infringement. But, today, using "AI" tools does suborn this sort of out-of-control crawling.

More insidiously, consider the scenario described in this YouTube video. The LTT Bros decided to try Linux again, and in the course of so doing, they had problems. When trying to solve these problems, they were faced with a choice: they could consult Reddit, or they could ask an LLM. Asking an LLM would "gaslight the heck out of" them, but they still found it preferable, because they would at least get an answer without getting yelled at.

Initially this sounds great. But it also means that you want to extract knowledge from a community, while mechanically eliding any values or norms that the community may want to impart as part of offering that knowledge. As someone who spent many years in a community tech support role, this is worrying. Many requests for support are people asking how to do things that will momentarily solve a superficial problem but create a long-term reliability problem or even an immediate security risk, that the question-asker doesn't want to hear about. Consider the question "I'm tired of entering my password so much, how do I make it so my laptop unlocks automatically". An obsequious chatbot will helpfully tell you how to do this without pushback.

But, this is also a sort of ethically murky area. The Linux community is somewhat famously, for many years now, a toxic cesspool of general hostility, misogyny, etc. It is certainly a good thing that people can get access to this knowledge without subjecting themselves to abuse. But it also means that the people with the power and the privilege to change the community for the better can just quietly withdraw, rather than fixing the problems. It also means that the positive elements of culture cannot be transmitted, and people will have no opportunity to learn about unknown unknowns.

In this case, the "adversarial" communication is with society. The thing that using an LLM for search lets you do is withdraw from society and avoid forming any personal connections. There are some personal connections which are painful and annoying, and so that can feel like a momentary balm. But the need to make connections in general is, like, the concept of society itself.

Who Am I Hurting?

LLMs are good at adversarial communication. They are so good at it, relative to their other benefits, that they will tend to make communications adversarial if you are not remaining vigilant about the possibility that they might do so. My request to you, dear reader, if you are going to use such tools, is to always ask yourself, "who might I be hurting, if I use an LLM for this?"

If you're using an "AI", who is its adversary? If you haven't given it one yet, who might the "AI" turn into an adversary? Who might you overwhelm with an asymmetric amount of output, or, if you're receiving information and not sending it, who are you taking that information from without consulting?

Figure out the answers to these questions and conduct yourself accordingly; the answer might be "yourself".

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor!


  1. One of the reasons that software developers tend to prefer greenfield development is that when you are given a blank page, you can project your own specific understanding onto it. You can structure the codebase in a way that works for your brain, down to the variable naming conventions and the module layouts. LLM-assisted development makes everything into instant brownfield work, which makes developers instantly miserable; even those who are excited about the technology will frequently complain about how it feels like their agency has been stolen and their joy in the work has been diminished. But I digress.

  2. Modulo the massive amount of other externalities involved in using LLMs, of course, but I don't have the time or energy to get into those here.

23 Jun 2026 8:06pm GMT

09 Jun 2026

feedPlanet Twisted

Hynek Schlawack: How to Ditch Codecov for Python Projects

Codecov's unreliability breaking CI on my open source projects has been a constant source of frustration for me for years. I have found a way to enforce coverage over a whole GitHub Actions build matrix that doesn't rely on third-party services.

09 Jun 2026 12:00am GMT

22 May 2026

feedPlanet Twisted

Glyph Lefkowitz: Opaque Types in Python

Let's say you're writing a Python library.

In this library, you have some collection of state that represents "options" or "configuration" for a bunch of operations. Such a set of options is a bundle of potentially ever-increasing complexity. Thus, you will want it to have an extremely minimal compatibility surface, with a very carefully chosen public interface, that is either small, or perhaps nothing at all. Such an object conveys state and might have some private behavior, but all you want consumers to be able to do is build it in very constrained, specific ways, and then pass it along as a parameter to your own APIs.

By way of example, imagine that you're wrapping a library that handles shipping physical packages.

There are a zillion ways to ship a package. There are different carriers who can ship it for you. There's air freight, and ground freight, and sea freight. There's overnight shipping. There's the option to require a signature. There's package tracking and certified mail. Suffice it to say, lots of stuff.

If you are starting out to implement such a library, you might need an object called something like ShippingOptions that encapsulates some of this. At the core of your library you might have a function like this:

async def shipPackage(
        how: ShippingOptions,
        where: Address,
    ) -> ShippingStatus:
    ...

If you are starting out implementing such a library, you know that you're going to get the initial implementation of ShippingOptions wrong; or, at the very least, if not "wrong", then "incomplete". You should not want to commit to an expansive public API with a ton of different attributes until you really understand the problem domain pretty well.

Yet, ShippingOptions is absolutely vital to the rest of your library. You'll need to construct it and pass it to various methods like estimateShippingCost and shipPackage. So you're not going to want a ton of complexity and churn as you evolve it to be more complex.

Worse yet, this object has to hold a ton of state. It's got attributes, maybe even quite complex internal attributes that relate to different shipping services.

Right now, today, you need to add something so you can have "no rush", "standard" and "expedited" options. You can't just put off implementing that indefinitely until you can come up with the perfect shape. What to do?

The tool you want here is the opaque data type design pattern. C is lousy with such things (FILE, pthread_*_t, fd_set, etc). A typedef in a header file can easily achieve this.

But in Python, if you expose a dataclass - or any class, really - even if you keep all your fields private, the constructor is still, inherently, public. You can make it raise an exception or something, but your type checker still won't help your users; it'll still look like it's a normal class.
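To make the problem concrete, here is a minimal sketch of a class whose only field is private by convention. The class name PublicShipOpts is hypothetical and chosen only for this illustration; it is not part of the library being described.

```python
from dataclasses import dataclass


@dataclass
class PublicShipOpts:  # hypothetical name, for illustration only
    _speed: str  # "private" by naming convention only


# Nothing stops client code from calling the generated constructor directly.
# A type checker accepts this call too, because __init__ is part of the
# class's public surface no matter how the fields are named.
opts = PublicShipOpts("fast")
print(opts)  # PublicShipOpts(_speed='fast')
```

Once callers depend on that constructor signature, changing the fields becomes a compatibility break.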

Luckily, Python typing provides a tool for this: typing.NewType.

Let's review our requirements:

  1. We need a type that our client code can use in its type annotations; it needs to be public.
  2. They need to be able to construct it somehow, even if they shouldn't be able to see its attributes or its internal constructor arguments.
  3. To express high-level things (like "ship fast") that should stay supported as we add more nuanced and complex configurations in the future (like "ship with the fastest possible option provided by the lowest-cost carrier that supports signature verification").

In order to solve these problems respectively, we will use:

  1. a public NewType, which gives us our public name...
  2. which wraps a private class with entirely private attributes, to give us an actual data structure, while not exposing the constructor,
  3. a set of public constructor functions, which returns our NewType.

When we put that all together, it looks like this:

from dataclasses import dataclass
from typing import Literal, NewType

@dataclass
class _RealShipOpts:
    _speed: Literal["fast", "normal", "slow"]

ShippingOptions = NewType("ShippingOptions", _RealShipOpts)

def shipFast() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("fast"))

def shipNormal() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("normal"))

def shipSlow() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("slow"))

As a snapshot in time, this is not all that interesting; we could have just exposed _RealShipOpts as a public class and saved ourselves some time. The fact that this exposes a constructor that takes a string is not a big deal for the present moment. For an initial quick and dirty implementation, we can just do checks like if options._speed == "fast" in our shipping and estimation code.
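As a rough sketch of what such a quick and dirty check could look like inside the library, the example below repeats the definitions above so that it runs on its own. The signature of estimateShippingCost, the weightKg parameter, and the rates are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Literal, NewType


@dataclass
class _RealShipOpts:
    _speed: Literal["fast", "normal", "slow"]


ShippingOptions = NewType("ShippingOptions", _RealShipOpts)


def shipFast() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("fast"))


def shipSlow() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts("slow"))


def estimateShippingCost(how: ShippingOptions, weightKg: float) -> float:
    # Hypothetical signature and made-up rates, purely for illustration.
    # Code inside the library may read the private attribute directly.
    if how._speed == "fast":
        rate = 10.0
    elif how._speed == "normal":
        rate = 5.0
    else:
        rate = 2.0
    return rate * weightKg


print(estimateShippingCost(shipFast(), 2.0))  # 20.0
print(estimateShippingCost(shipSlow(), 2.0))  # 4.0
```

Client code only ever sees shipFast, shipSlow, and the ShippingOptions name, so the private _speed check can be replaced later without touching callers.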

However, the main thing we are doing here is preserving our flexibility to evolve the related APIs into the future, so let's see how we might do that. For example, let's allow the shipping options to contain a concrete and specific carrier and freight method:

from dataclasses import dataclass
from enum import Enum, auto
from typing import NewType

class Carrier(Enum):
    FedEx = auto()
    USPS = auto()
    DHL = auto()
    UPS = auto()

class Conveyance(Enum):
    air = auto()
    truck = auto()
    train = auto()

@dataclass
class _RealShipOpts:
    _carrier: Carrier
    _freight: Conveyance

ShippingOptions = NewType("ShippingOptions", _RealShipOpts)

def shipFast() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.FedEx, Conveyance.air))

def shipNormal() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.UPS, Conveyance.truck))

def shipSlow() -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(Carrier.USPS, Conveyance.train))

def shippingDetailed(
    carrier: Carrier, conveyance: Conveyance
) -> ShippingOptions:
    return ShippingOptions(_RealShipOpts(carrier, conveyance))

As a NewType, our public ShippingOptions type doesn't have a constructor. Since _RealShipOpts is private, and all its attributes are private, we can completely remove the old _speed field and the old constructor signature without breaking any client code.

Anything within our shipping library can still access the private variables on ShippingOptions; as a NewType, it's the same type as its base at runtime, so it presents minimal[1] overhead.

Clients outside our shipping library can still call all of our public constructors: shipFast, shipNormal, and shipSlow all still work with the same (as far as calling code knows) signature and behavior.

If you need to build and convey some state within your public API, while avoiding breakages associated with compatibility churn, hopefully this technique can help you do that!


Acknowledgments

Thanks for reading, and thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor.


  1. The overhead is minimal, but it is not completely zero. The suggested idiom for converting to a NewType is to call it like a function, as I've done in these examples, but if you are wanting to use this pattern inside of a hot loop, you can use # type: ignore[return-value] comments to avoid that small cost.

22 May 2026 12:33am GMT