04 Feb 2025
Planet Python
Łukasz Langa: Generating visual art every day in January
There's a small but dedicated community of crazy artists that gather every January to generate stunning visual art for no apparent reason other than Genuary being a clever pun on January. I joined them this year to get better at PyScript and WebGL. Here's how it went.
04 Feb 2025 8:29pm GMT
PyCoder’s Weekly: Issue #667: String Templates, Missing Data, Dynamic Forms, and More (Feb. 4, 2025)
#667 - FEBRUARY 4, 2025
View in Browser »
A Revamped Python String-Formatting Proposal
PEP 750 proposes template strings, and the PEP has been through a lot of modifications since it was originally introduced. This article covers the latest and why the changes have been made to the proposal.
JAKE EDGE
How to Deal With Missing Data in Polars
In this tutorial, you'll learn how to deal with missing data in Polars to ensure it doesn't interfere with your data analysis. You'll discover how to check for missing values, update them, and remove them.
REAL PYTHON
Build AI Agents in Just Hours not Months
Design and deploy multi-agent systems at zero cost using our free plan, no credit card required. Skip complex configurations with pre-integrated setups for top LLMs. Speed up development with pre-built agent templates, and unlock advanced workflow functionality using our open-source Python package →
DYNAMIQ sponsor
Dynamic Forms With Flask
This post shows you how to create dynamic web forms, where fields are added on the fly, when coding within the Flask web framework.
MIGUEL GRINBERG
Python Jobs
Backend Software Engineer (Anywhere)
Articles & Tutorials
Python Rgonomics: 2025 Update
"Switching languages is about switching mindsets, not just syntax. New developments in Python data science tooling, like polars and seaborn's objects interface, can capture the 'feel' that R/tidyverse converts love while opening the door to truly Pythonic workflows." (Updated for 2025 with new tools.)
EMILY RIEDERER
Portable Python Bundles on Windows
Packaging and distributing a Python environment on MS Windows can be tricky. What if neither venvs nor tools like PyInstaller meet your needs? Here's one take on it: Portable Python Bundles on Windows. Bundles are like venvs - but are self-contained, path independent and "just work" on any Windows machine.
DEV.TO • Shared by Chris Korneck
Learn How to Build GenAI Projects
Join a FREE 6-week virtual bootcamp and get hands-on experience building GenAI projects. It's open to all skill levels with live, instructor-led classes guiding you every step of the way. Spots are filling fast, so register today!
INTEL CORPORATION sponsor
3D Printing Giant Things With a Python Jigsaw Generator
This is a long, detailed article on 3D printing objects too large for the printer bed. The author has created dovetail joints to assemble pieces together. He wrote a Python program to automatically split up the larger model files into the jigsaw pieces needed to build a final result.
CAL BRYANT
Creating a Scalable Flask Web Application From Scratch
In this video course, you'll explore the process of creating a boilerplate for a Flask web project. It's a great starting point for any scalable Flask web app that you wish to develop in the future, from basic web pages to complex web applications.
REAL PYTHON course
My First Steps With Playwright
Playwright is a browser-based automation tool that can be used for web scraping or testing. This intro article shows you how to use the Python interface to access a page, including how to use cookies.
NICOLAS FRÄNKEL
Looking at Django Task Runners and Queues
There are a lot of different ways of running asynchronous tasks in Django. This article talks about two mechanisms used by the author, as well as the extra challenge of one-off jobs.
KEVIN RENSKERS
PyPI Now Supports Project Archival
The ability to mark a project as archived has been added to the Python Package Index. This article covers what archival can be used for as well as hinting at future improvements.
FACUNDO TUESCA
Re-Creating Async in 10 Lines Using Generators
This article outlines a simple re-implementation of the concepts available in the async library so you can better learn how it works under the covers.
LUCAS SEIKI OSHIRO
Building Cython (Or C) Extensions Using uv
Developing Python libraries with C extensions can be tricky. Learn how uv and setuptools can work together to build Cython-powered projects.
SIDDHANT GOEL
Projects & Code
Events
Weekly Real Python Office Hours Q&A (Virtual)
February 5, 2025
REALPYTHON.COM
Fine-Grained Authorization in Python (Webinar)
February 6, 2025
OSO
Canberra Python Meetup
February 6, 2025
MEETUP.COM
Sydney Python User Group (SyPy)
February 6, 2025
SYPY.ORG
PyCascades 2025
February 8 to February 10, 2025
PYCASCADES.COM
PyDelhi User Group Meetup
February 8, 2025
MEETUP.COM
DFW Pythoneers 2nd Saturday Teaching Meeting
February 8, 2025
MEETUP.COM
Happy Pythoning!
This was PyCoder's Weekly Issue #667.
04 Feb 2025 7:30pm GMT
Hugo van Kemenade: How to delay a Python release
Prologue #
This was a Twitter thread from 15th January 2022 about my first CPython bug. Eight days from report to fix to merge, not bad!
Delay #
I helped delay the release of Python 3.11.0a4! But in a good way! 😇
Python 3.11 is due out in October, but they make early alpha, beta and release candidates available for people to help test Python itself and their own code before the big release.
So I tested Pillow…
Tests #
The Pillow test suite passed with 3.11 ✅
Next I tried building the documentation with 3.11 ❌
The docs program, Sphinx, emitted a couple of warnings. Warnings are often missed because they don't error. But luckily we use the "-W" option to turn warnings into hard errors.
Sphinx #
Maybe Sphinx isn't ready for Python 3.11?
Rather than submitting a report with the full Pillow documentation (lots of files) I made a new, "minimal" example with just enough stuff to reproduce it.
This makes it easier to investigate what's up.
Report 1 #
I reported this to Sphinx. The problem was that a page in a subdirectory could load an image from one directory, but not from another, further away directory.
It occurs for the Python 3.11.0a3 alpha, but not 3.7-3.10.
CPython #
A few hours later the Sphinx maintainer Takeshi said it looked like a change in a part of Python itself, os.path.normpath(), since 3.11.0a3; and as it wasn't mentioned on the "What's New in Python 3.11" page, it could be a bug in Python.
He asked me to report it to Python.
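For context on what was at stake here: os.path.normpath() is purely lexical, so a behavioral change in it can silently alter which paths tools like Sphinx resolve. A quick sketch of its documented behavior (using posixpath for platform-independent output; the actual bug involved different paths than this illustration):

```python
import posixpath

# normpath collapses redundant separators ("//"), current-directory
# references (".") and up-level references ("..") lexically, without
# ever touching the filesystem.
path = "docs//images/./icons/../logo.png"
print(posixpath.normpath(path))  # docs/images/logo.png
```

Sphinx relies on this kind of path normalization when resolving image references relative to a document, which is why a regression here surfaced as images failing to load.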
Report 2 #
I reported it to Python with Takeshi's even more minimal example.
Half an hour later Christian pointed out a change which may have caused this.
I tested and confirmed.
The next day Steve confirmed it was a bug and set it as a "release blocker".
Fix #
Steve also said it would need tests, because this bug slipped out due to a gap in testing.
I didn't know how to fix the bug, but I could write some test cases!
neonene then took the tests and fixed the bug! In doing so they found even more bugs!
Merge #
These extra bugs also existed in earlier versions.
But it turns out path handling can get pretty complicated in places, so Steve decided to fix only my bug for now to get it released; the others can be sorted out later.
The fix was merged and I confirmed it also worked with Sphinx ✅
Conclusion #
And that's about it!
It's now fixed in 3.11.0a4; much better to find these before 3.11.0 final is released to the world in October. Along the way we found more issues to address.
Short version: test your code with 3.11 now, you may find issues in your code or in Python itself 🚀
Epilogue #
Back to 2025: Please test and delay Python 3.14 alpha - but in a good way! 😇
04 Feb 2025 6:55pm GMT
Python Insider: Python 3.13.2 and 3.12.9 now available!
A small release day today! That is to say the releases are relatively small; the day itself was of average size, as most days are.
Python 3.13.2
Python 3.13's second maintenance release. About 250 changes went into this update, and they can be yours for free if you just upgrade now.
https://www.python.org/downloads/release/python-3132/
Python 3.12.9
Python 3.12's ninth maintenance release already. Just 180 changes for 3.12, but it's still worth upgrading.
https://www.python.org/downloads/release/python-3129/
Enjoy the new releases!
Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation.
Regards from your tireless, tireless release team,
Thomas Wouters
Ned Deily
Steve Dower
Łukasz Langa
04 Feb 2025 2:58pm GMT
Real Python: NumPy Techniques and Practical Examples
The NumPy library is a Python library used for scientific computing. It provides you with a multidimensional array object for storing and analyzing data in a wide variety of ways. In this video course, you'll see examples of some features NumPy provides that aren't always highlighted in other tutorials.
In this video course, you'll learn how to:
- Create multidimensional arrays from data stored in files
- Identify and remove duplicate data from a NumPy array
- Use structured NumPy arrays to reconcile the differences between datasets
- Analyze and chart specific parts of hierarchical data
- Create vectorized versions of your own functions
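As a small taste of that last point, NumPy's np.vectorize wraps a scalar Python function so it maps element-wise over arrays (it's a convenience wrapper rather than a performance tool; this example function is made up for illustration):

```python
import numpy as np

def clip_and_double(x):
    # A plain scalar function with no awareness of arrays.
    return min(x, 10) * 2

# np.vectorize returns a callable that broadcasts over array inputs.
vectorized = np.vectorize(clip_and_double)
result = vectorized(np.array([3, 10, 25]))
print(result)  # [ 6 20 20]
```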
04 Feb 2025 2:00pm GMT
Python Circle: Installing Python3.13 on Ubuntu 22.04
python3.13 installation, building python from source code, python 3.13, latest python installation, solving sqlite not found error, python stable version installation on linux, ubuntu python 3.13 installation, No module named '_sqlite3'
04 Feb 2025 4:41am GMT
HoloViz: HoloViz and Bokeh for Neuroscience
04 Feb 2025 12:00am GMT
Armin Ronacher: Fat Rand: How Many Lines Do You Need To Generate A Random Number?
I recently wrote about dependencies in Rust. The feedback, both within and outside the Rust community, was very different. A lot of people, particularly some of those I greatly admire, expressed support. The Rust community, on the other hand, was very dismissive on Reddit and Lobsters.
Last time, I focused on the terminal_size crate, but I also want to show you a different one that I came across once more: rand. It has a similarly out-of-whack value-to-dependency ratio, but in a slightly different way. More than terminal_size, you are quite likely to use it. If, for instance, you want to generate a random UUID, the uuid crate will depend on it. Due to its nature, it also has a high security exposure.
I don't want to frame this as "rand is a bad crate". It's not a bad crate at all! It is however a crate that does not appear very concerned about how many dependencies it has, and I want to put this in perspective: of all the dependencies and lines of codes it pulls in, how many does it actually use?
As the name implies, the rand crate is capable of calculating random numbers. The crate itself has seen a fair bit of churn: for instance 0.9 broke backwards compatibility with 0.8. So, as someone who used that crate, I did what a responsible developer is supposed to do, and upgraded the dependency. After all, I don't want to be the reason there are two versions of rand in the dependency tree. After the upgrade, I was surprised how fat that dependency tree has become over the last nine months.
Today, this is what the dependency tree looks like for the default feature set on macOS and Linux:
x v0.1.0 (/private/tmp/x)
└── rand v0.9.0
    ├── rand_chacha v0.9.0
    │   ├── ppv-lite86 v0.2.20
    │   │   └── zerocopy v0.7.35
    │   │       ├── byteorder v1.5.0
    │   │       └── zerocopy-derive v0.7.35 (proc-macro)
    │   │           ├── proc-macro2 v1.0.93
    │   │           │   └── unicode-ident v1.0.16
    │   │           ├── quote v1.0.38
    │   │           │   └── proc-macro2 v1.0.93 (*)
    │   │           └── syn v2.0.98
    │   │               ├── proc-macro2 v1.0.93 (*)
    │   │               ├── quote v1.0.38 (*)
    │   │               └── unicode-ident v1.0.16
    │   └── rand_core v0.9.0
    │       ├── getrandom v0.3.1
    │       │   ├── cfg-if v1.0.0
    │       │   └── libc v0.2.169
    │       └── zerocopy v0.8.14
    ├── rand_core v0.9.0 (*)
    └── zerocopy v0.8.14
About a year ago, it looked like this:
x v0.1.0 (/private/tmp/x)
└── rand v0.8.5
    ├── libc v0.2.169
    ├── rand_chacha v0.3.1
    │   ├── ppv-lite86 v0.2.17
    │   └── rand_core v0.6.4
    │       └── getrandom v0.2.10
    │           ├── cfg-if v1.0.0
    │           └── libc v0.2.169
    └── rand_core v0.6.4 (*)
Not perfect, but better.
So, let's investigate what all these dependencies do. The current version pulls in quite a lot.
Platform Dependencies
First there is the question of getting access to the system RNG. On Linux and Mac it uses libc, for Windows it uses the pretty heavy Microsoft crates (windows-targets). The irony is that the Rust standard library already implements a way to get a good seed from the system, but it does not expose it. Well, not really at least. There is a crate called fastrand which does not have any dependencies which seeds itself by funneling out seeds from the stdlib via the hasher system. That looks a bit like this:
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
fn random_seed() -> u64 {
RandomState::new().build_hasher().finish()
}
Now obviously that's a hack, but it will work because the hashmap's hasher is randomly seeded from good sources. There is a single-dependency crate too which can read from the system's entropy source and that's getrandom. So there at least could be a world where rand only depends on that.
Dependency Chain
If you want to audit the entire dependency chain, you end up with maintainers that form eight distinct groups:
- libc: rust core + various externals
- cfg-if: rust core + Alex Crichton
- windows-*: Microsoft
- rand_* and getrandom: rust nursery + rust-random
- ppv-lite86: Kaz Wesley
- zerocopy and zerocopy-derive: Google (via two ICs there, Google does not publish)
- byteorder: Andrew Gallant
- syn, quote, proc-macro2, unicode-ident: David Tolnay
If I also cared about WASM targets, I'd have to consider even more dependencies.
Code Size
So let's vendor it. How much code is there? After removing all tests, we end up with 29 individual crates vendored taking up 62MB disk space. Tokei reports 209,150 lines of code.
Now this is a bit misleading, because like many times most of this is within windows-*. But how much of windows-* does getrandom need? A single function:
extern "system" fn ProcessPrng(pbdata: *mut u8, cbdata: usize) -> i32
For that single function (and the information about which DLL it needs to link against), we are compiling and downloading megabytes of windows-targets. Longer term this might not be necessary, but today it is.
On Unix, it's harder to avoid libc because it tries multiple APIs. These are mostly single-function APIs, but some non-portable constants make libc difficult to avoid.
Beyond the platform dependencies, what else is there?
- ppv-lite86 (rand's default random number generator) alone comes to 3,587 lines of code, including 168 unsafe blocks. If the goal of using zerocopy was to avoid unsafe, there is still a ton of unsafe remaining.
- The combination of proc-macro2, quote, syn, and unicode-ident comes to 49,114 lines of code.
- byteorder clocks in at 3,000 lines of code.
- The pair of zerocopy and zerocopy-derive together? 14,004 lines of code.
All of these are great crates, but do I need all of this just to generate a random number?
Compilation Times
Then there are compile times. How long does it take to compile? 4.3 seconds on my high-end M1 Max. A lot of dependencies block each other, particularly the part that waits for the derives to finish.
- rand depends on rand_chacha,
- which depends on ppv-lite86,
- which depends on zerocopy (with the derive feature),
- which depends on zerocopy-derive
- which pulls compiler plugins crates.
Only after all the code generation has finished can the rest make meaningful progress. In total, a release build produces 36MB of compiler artifacts. 12 months ago, it took just under 2 seconds.
Final Thoughts
The Rust developer community on Reddit doesn't seem very concerned. The main sentiment is that rand now uses less unsafe, so that's benefit enough. While the total amount of unsafe probably did not go down, the moved unsafe now lives in a common crate written by people who know how to use unsafe (zerocopy). There is also the sentiment that none of this matters anyway, because we will all soon depend on zerocopy everywhere, as more and more dependencies switch over to it.
Maybe this points to Rust not having a large enough standard library. Perhaps features like terminal size detection and random number generation should be included. That at least is what people pointed out on Twitter.
We already treat crates like regex, rand, and serde as if they were part of the standard library. The difference is that I can trust the standard library as a whole: it comes from a single set of authors, which makes auditing easier. If these external, but almost standard, crates were more cautious about dependencies and made auditability more of a goal, we would all benefit.
Or maybe this is just how Rust works now. That would make me quite sad.
Update: it looks like there is some appetite in rand to improve on this.
- zerocopy might be removed in the core library: issue #1574 and PR #1575.
- a stripped down version of chacha20 (which does not require zerocopy or most of the rust-crypto ecosystem) might replace ppv-lite86: PR #934.
- if you use Rust 1.71 or later, windows-targets becomes mostly a no-op if you compile with --cfg=windows_raw_dylib.
Edit: This post originally incorrectly said that getrandom depends on windows-sys. That is incorrect, it only depends on windows-targets.
04 Feb 2025 12:00am GMT
03 Feb 2025
Planet Python
Eli Bendersky: Decorator JITs - Python as a DSL
Spend enough time looking at Python programs and packages for machine learning, and you'll notice that the "JIT decorator" pattern is pretty popular. For example, this JAX snippet:
import jax.numpy as jnp
import jax
@jax.jit
def add(a, b):
return jnp.add(a, b)
# Use "add" as a regular Python function
... = add(...)
Or the Triton language for writing GPU kernels directly in Python:
import triton
import triton.language as tl
@triton.jit
def add_kernel(x_ptr,
y_ptr,
output_ptr,
n_elements,
BLOCK_SIZE: tl.constexpr):
pid = tl.program_id(axis=0)
block_start = pid * BLOCK_SIZE
offsets = block_start + tl.arange(0, BLOCK_SIZE)
mask = offsets < n_elements
x = tl.load(x_ptr + offsets, mask=mask)
y = tl.load(y_ptr + offsets, mask=mask)
output = x + y
tl.store(output_ptr + offsets, output, mask=mask)
In both cases, the function decorated with jit doesn't get executed by the Python interpreter in the normal sense. Instead, the code inside is more like a DSL (Domain Specific Language) processed by a special purpose compiler built into the library (JAX or Triton). Another way to think about it is that Python is used as a meta language to describe computations.
In this post I will describe some implementation strategies used by libraries to make this possible.
Preface - where we're going
The goal is to explain how different kinds of jit decorators work by using a simplified, educational example that implements several approaches from scratch. All the approaches featured in this post will be using this flow:
[Diagram: Python function --> Expr IR --> LLVM IR --> Execution]
These are the steps that happen when a Python function wrapped with our educational jit decorator is called:
- The function is translated to an "expression IR" - Expr.
- This expression IR is converted to LLVM IR.
- Finally, the LLVM IR is JIT-executed.
Steps (2) and (3) use llvmlite; I've written about llvmlite before, see this post and also the pykaleidoscope project. For an introduction to JIT compilation, be sure to read this and maybe also the series of posts starting here.
First, let's look at the Expr IR. Here we'll make a big simplification - only supporting functions that define a single expression, e.g.:
def expr2(a, b, c, d):
return (a + d) * (10 - c) + b + d / c
Naturally, this can be easily generalized - after all, LLVM IR can be used to express fully general computations.
Here are the Expr data structures:
class Expr:
pass
@dataclass
class ConstantExpr(Expr):
value: float
@dataclass
class VarExpr(Expr):
name: str
arg_idx: int
class Op(Enum):
ADD = "+"
SUB = "-"
MUL = "*"
DIV = "/"
@dataclass
class BinOpExpr(Expr):
left: Expr
right: Expr
op: Op
To convert an Expr into LLVM IR and JIT-execute it, we'll use this function:
def llvm_jit_evaluate(expr: Expr, *args: float) -> float:
"""Use LLVM JIT to evaluate the given expression with *args.
expr is an instance of Expr. *args are the arguments to the expression, each
a float. The arguments must match the arguments the expression expects.
Returns the result of evaluating the expression.
"""
llvm.initialize()
llvm.initialize_native_target()
llvm.initialize_native_asmprinter()
llvm.initialize_native_asmparser()
cg = _LLVMCodeGenerator()
modref = llvm.parse_assembly(str(cg.codegen(expr, len(args))))
target = llvm.Target.from_default_triple()
target_machine = target.create_target_machine()
with llvm.create_mcjit_compiler(modref, target_machine) as ee:
ee.finalize_object()
cfptr = ee.get_function_address("func")
cfunc = CFUNCTYPE(c_double, *([c_double] * len(args)))(cfptr)
return cfunc(*args)
It uses the _LLVMCodeGenerator class to actually generate LLVM IR from Expr. This process is straightforward and covered extensively in the resources I linked to earlier; take a look at the full code here.
My goal with this architecture is to make things simple, but not too simple. On one hand, there are several simplifications: only single expressions are supported, only a very limited set of operators is available, etc. It's very easy to extend this! On the other hand, we could have just trivially evaluated the Expr without resorting to LLVM IR; I do want to show a more complete compilation pipeline, though, to demonstrate that an arbitrary amount of complexity can be hidden behind these simple interfaces.
With these building blocks in hand, we can review the strategies used by jit decorators to convert Python functions into Exprs.
AST-based JIT
Python comes with powerful code reflection and introspection capabilities out of the box. Here's the astjit decorator:
def astjit(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
if kwargs:
raise ASTJITError("Keyword arguments are not supported")
source = inspect.getsource(func)
tree = ast.parse(source)
emitter = _ExprCodeEmitter()
emitter.visit(tree)
return llvm_jit_evaluate(emitter.return_expr, *args)
return wrapper
This is a standard Python decorator. It takes a function and returns another function that will be used in its place (functools.wraps ensures that function attributes like the name and docstring of the wrapper match the wrapped function).
Here's how it's used:
from astjit import astjit
@astjit
def some_expr(a, b, c):
return b / (a + 2) - c * (b - a)
print(some_expr(2, 16, 3))
After astjit is applied to some_expr, what some_expr holds is the wrapper. When some_expr(2, 16, 3) is called, the wrapper is invoked with *args = [2, 16, 3].
The wrapper obtains the AST of the wrapped function, and then uses _ExprCodeEmitter to convert this AST into an Expr:
class _ExprCodeEmitter(ast.NodeVisitor):
def __init__(self):
self.args = []
self.return_expr = None
self.op_map = {
ast.Add: Op.ADD,
ast.Sub: Op.SUB,
ast.Mult: Op.MUL,
ast.Div: Op.DIV,
}
def visit_FunctionDef(self, node):
self.args = [arg.arg for arg in node.args.args]
if len(node.body) != 1 or not isinstance(node.body[0], ast.Return):
raise ASTJITError("Function must consist of a single return statement")
self.visit(node.body[0])
def visit_Return(self, node):
self.return_expr = self.visit(node.value)
def visit_Name(self, node):
try:
idx = self.args.index(node.id)
except ValueError:
raise ASTJITError(f"Unknown variable {node.id}")
return VarExpr(node.id, idx)
def visit_Constant(self, node):
return ConstantExpr(node.value)
def visit_BinOp(self, node):
left = self.visit(node.left)
right = self.visit(node.right)
try:
op = self.op_map[type(node.op)]
return BinOpExpr(left, right, op)
except KeyError:
raise ASTJITError(f"Unsupported operator {node.op}")
When _ExprCodeEmitter finishes visiting the AST it's given, its return_expr field will contain the Expr representing the function's return value. The wrapper then invokes llvm_jit_evaluate with this Expr.
Note how our decorator interjects into the regular Python execution process. When some_expr is called, instead of the standard Python compilation and execution process (code is compiled into bytecode, which is then executed by the VM), we translate its code to our own representation and emit LLVM from it, and then JIT execute the LLVM IR. While it seems kinda pointless in this artificial example, in reality this means we can execute the function's code in any way we like.
AST JIT case study: Triton
This approach is almost exactly how the Triton language works. The body of a function decorated with @triton.jit gets parsed to a Python AST, which then - through a series of internal IRs - ends up in LLVM IR; this in turn is lowered to PTX by the NVPTX LLVM backend. Then, the code runs on a GPU using a standard CUDA pipeline.
Naturally, the subset of Python that can be compiled down to a GPU is limited; but it's sufficient to run performant kernels, in a language that's much friendlier than CUDA and - more importantly - lives in the same file with the "host" part written in regular Python. For example, for testing and debugging, you can run Triton in "interpreter mode", which just runs the same kernels locally on a CPU.
Note that Triton lets us import names from the triton.language package and use them inside kernels; these serve as the intrinsics for the language - special calls the compiler handles directly.
Bytecode-based JIT
Python is a fairly complicated language with a lot of features. Therefore, if our JIT has to support some large portion of Python semantics, it may make sense to leverage more of Python's own compiler. Concretely, we can have it compile the wrapped function all the way to bytecode, and start our translation from there.
Here's the bytecodejit decorator that does just this [1]:
def bytecodejit(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
if kwargs:
raise BytecodeJITError("Keyword arguments are not supported")
expr = _emit_exprcode(func)
return llvm_jit_evaluate(expr, *args)
return wrapper
def _emit_exprcode(func):
bc = func.__code__
stack = []
for inst in dis.get_instructions(func):
match inst.opname:
case "LOAD_FAST":
idx = inst.arg
stack.append(VarExpr(bc.co_varnames[idx], idx))
case "LOAD_CONST":
stack.append(ConstantExpr(inst.argval))
case "BINARY_OP":
right = stack.pop()
left = stack.pop()
match inst.argrepr:
case "+":
stack.append(BinOpExpr(left, right, Op.ADD))
case "-":
stack.append(BinOpExpr(left, right, Op.SUB))
case "*":
stack.append(BinOpExpr(left, right, Op.MUL))
case "/":
stack.append(BinOpExpr(left, right, Op.DIV))
case _:
raise BytecodeJITError(f"Unsupported operator {inst.argval}")
case "RETURN_VALUE":
if len(stack) != 1:
raise BytecodeJITError("Invalid stack state")
return stack.pop()
case "RESUME" | "CACHE":
# Skip nops
pass
case _:
raise BytecodeJITError(f"Unsupported opcode {inst.opname}")
The Python VM is a stack machine; so we emulate a stack to convert the function's bytecode to Expr IR (a bit like an RPN evaluator). As before, we then use our llvm_jit_evaluate utility function to lower Expr to LLVM IR and JIT execute it.
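To see the stack machine that the loop above walks, you can disassemble a one-expression function with the dis module (exact opcode names vary across CPython versions; BINARY_OP, for instance, only appears in 3.11+, with BINARY_ADD and friends in earlier releases):

```python
import dis

def some_expr(a, b):
    return a + b

# Print each instruction's opcode name and a readable argument.
for inst in dis.get_instructions(some_expr):
    print(inst.opname, inst.argrepr)
```

On CPython 3.11+, this lists a LOAD_FAST for each argument, a BINARY_OP for the addition, and a RETURN_VALUE, which is exactly the shape the bytecodejit match statement handles.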
Using this JIT is as simple as the previous one - just swap astjit for bytecodejit:
from bytecodejit import bytecodejit
@bytecodejit
def some_expr(a, b, c):
return b / (a + 2) - c * (b - a)
print(some_expr(2, 16, 3))
Bytecode JIT case study: Numba
Numba is a compiler for Python itself. The idea is that you can speed up specific functions in your code by slapping a numba.njit decorator on them. What happens next is similar in spirit to our simple bytecodejit, but of course much more complicated because it supports a very large portion of Python semantics.
Numba uses the Python compiler to emit bytecode, just as we did; it then converts it into its own IR, and then to LLVM using llvmlite [2].
By starting with the bytecode, Numba makes its life easier (no need to rewrite the entire Python compiler). On the other hand, it also makes some analyses harder, because by the time we're in bytecode, a lot of semantic information existing in higher-level representations is lost. For example, Numba has to sweat a bit to recover control flow information from the bytecode (by running it through a special interpreter first).
Tracing-based JIT
The two approaches we've seen so far are similar in many ways - both rely on Python's introspection capabilities to compile the source code of the JIT-ed function to some extent (one to AST, the other all the way to bytecode), and then work on this lowered representation.
The tracing strategy is very different. It doesn't analyze the source code of the wrapped function at all - instead, it traces its execution by means of specially-boxed arguments, leveraging overloaded operators and functions, and then works on the generated trace.
The code implementing this for our simple demo is surprisingly compact:
def tracejit(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
if kwargs:
raise TraceJITError("Keyword arguments are not supported")
argspec = inspect.getfullargspec(func)
argboxes = []
for i, arg in enumerate(args):
if i >= len(argspec.args):
raise TraceJITError("Too many arguments")
argboxes.append(_Box(VarExpr(argspec.args[i], i)))
out_box = func(*argboxes)
return llvm_jit_evaluate(out_box.expr, *args)
return wrapper
Each runtime argument of the wrapped function is assigned a VarExpr, and that is placed in a _Box, a placeholder class which lets us do operator overloading:
@dataclass
class _Box:
expr: Expr
_Box.__add__ = _Box.__radd__ = _register_binary_op(Op.ADD)
_Box.__sub__ = _register_binary_op(Op.SUB)
_Box.__rsub__ = _register_binary_op(Op.SUB, reverse=True)
_Box.__mul__ = _Box.__rmul__ = _register_binary_op(Op.MUL)
_Box.__truediv__ = _register_binary_op(Op.DIV)
_Box.__rtruediv__ = _register_binary_op(Op.DIV, reverse=True)
The remaining key function is _register_binary_op:
def _register_binary_op(opcode, reverse=False):
"""Registers a binary opcode for Boxes.
If reverse is True, the operation is registered as arg2 <op> arg1,
instead of arg1 <op> arg2.
"""
def _op(arg1, arg2):
if reverse:
arg1, arg2 = arg2, arg1
box1 = arg1 if isinstance(arg1, _Box) else _Box(ConstantExpr(arg1))
box2 = arg2 if isinstance(arg2, _Box) else _Box(ConstantExpr(arg2))
return _Box(BinOpExpr(box1.expr, box2.expr, opcode))
return _op
To understand how this works, consider this trivial example:
@tracejit
def add(a, b):
return a + b
print(add(1, 2))
After the decorated function is defined, add holds the wrapper function defined inside tracejit. When add(1, 2) is called, the wrapper runs:
- For each argument of add itself (that is a and b), it creates a new _Box holding a VarExpr. This denotes a named variable in the Expr IR.
- It then calls the wrapped function, passing it the boxes as runtime parameters.
- When (the wrapped) add runs, it invokes a + b. This is caught by the overloaded __add__ operator of _Box, and it creates a new BinOpExpr with the VarExprs representing a and b as children. This BinOpExpr is then returned [3].
- The wrapper unboxes the returned Expr and passes it to llvm_jit_evaluate to emit LLVM IR from it and JIT execute it with the actual runtime arguments of the call: 1, 2.
This might be a little mind-bending at first, because there are two different executions that happen:
- The first is calling the wrapped add function itself, letting the Python interpreter run it as usual, but with special arguments that build up the IR instead of doing any computations. This is the tracing step.
- The second is lowering this IR our tracing step built into LLVM IR and then JIT executing it with the actual runtime argument values 1, 2; this is the execution step.
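To make the two phases concrete, here is a minimal, self-contained sketch of the tracing idea. The names Var/Const/BinOp/Box are simplified stand-ins for the post's VarExpr/ConstantExpr/BinOpExpr/_Box, and a plain tree-walking evaluator stands in for the LLVM JIT backend:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class BinOp:
    op: str
    lhs: object
    rhs: object

class Box:
    def __init__(self, expr):
        self.expr = expr

    def _bin(self, other, op):
        # Unbox the other operand, wrapping plain numbers as constants.
        rhs = other.expr if isinstance(other, Box) else Const(other)
        return Box(BinOp(op, self.expr, rhs))

    def __add__(self, other):
        return self._bin(other, "+")

    def __mul__(self, other):
        return self._bin(other, "*")

def evaluate(expr, env):
    # Replay the recorded computation with concrete argument values.
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Const):
        return expr.value
    lhs = evaluate(expr.lhs, env)
    rhs = evaluate(expr.rhs, env)
    return lhs + rhs if expr.op == "+" else lhs * rhs

def trace(func, argnames):
    # The tracing step: run func once with boxed variables to record its IR.
    return func(*(Box(Var(n)) for n in argnames)).expr

ir = trace(lambda a, b: a + b * 2, ["a", "b"])
print(evaluate(ir, {"a": 1, "b": 2}))  # 5
```

The point is the same as in the real implementation: running the function once with boxed arguments records the computation as a tree; evaluating (or JIT-compiling) that tree with concrete values replays it.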
This tracing approach has some interesting characteristics. Since we don't have to analyze the source of the wrapped functions but only trace through the execution, we can "magically" support a much richer set of programs, e.g.:
@tracejit
def use_locals(a, b, c):
    x = a + 2
    y = b - a
    z = c * x
    return y / x - z

print(use_locals(2, 8, 11))
This just works with our basic tracejit. Since Python variables are placeholders (references) for values, our tracing step is oblivious to them - it follows the flow of values. Another example:
@tracejit
def use_loop(a, b, c):
    result = 0
    for i in range(1, 11):
        result += i
    return result + b * c

print(use_loop(10, 2, 3))
This also just works! The created Expr will be a long chain of BinOpExpr additions of i's values through the loop, added to the BinOpExpr for b * c.
This last example also leads us to a limitation of the tracing approach: the loop cannot be data-dependent. It cannot depend on the function's arguments, because the tracing step has no concept of runtime values and wouldn't know how many iterations to run through; or at least, it doesn't know this unless we want to perform the tracing run for every runtime execution [4].
The tracing approach is useful in several domains, most notably automatic differentiation (AD). For a slightly deeper taste, check out my radgrad project.
Tracing JIT case study: JAX
The JAX ML framework uses a tracing approach very similar to the one described here. The first code sample in this post shows the JAX notation. JAX cleverly wraps Numpy with its own version which is traced (similar to our _Box, but JAX calls these boxes "tracers"), letting you write regular-feeling Numpy code that can be JIT optimized and executed on accelerators like GPUs and TPUs via XLA. JAX's tracer builds up an underlying IR (called jaxpr) which can then be emitted to XLA ops and passed to XLA for further lowering and execution.
For a fairly deep overview of how JAX works, I recommend reading the autodidax doc.
As mentioned earlier, JAX has some limitations with things like data-dependent control flow in native Python. This won't work, because there's control flow that depends on a runtime value (count):
import jax

@jax.jit
def sum_datadep(a, b, count):
    total = a
    for i in range(count):
        total += b
    return total

print(sum_datadep(10, 3, 3))
When sum_datadep is executed, JAX will throw an exception, saying something like:
This concrete value was not available in Python because it depends on the value of the argument count.
As a remedy, JAX has its own built-in intrinsics from the jax.lax package. Here's the example rewritten in a way that actually works:
import jax
from jax import lax

@jax.jit
def sum_datadep_fori(a, b, count):
    def body(i, total):
        return total + b
    return lax.fori_loop(0, count, body, a)
fori_loop (and many other built-ins in the lax package) is something JAX can trace through, generating a corresponding XLA operation (XLA has support for While loops, to which this lax.fori_loop can be lowered).
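Per JAX's documentation, lax.fori_loop(lower, upper, body, init) has the semantics of this pure-Python reference loop (a sketch of the semantics only; the real primitive is traced and lowered to an XLA While op):

```python
def fori_loop(lower, upper, body, init):
    # Reference semantics of lax.fori_loop: fold body over the index range.
    val = init
    for i in range(lower, upper):
        val = body(i, val)
    return val

# The sum_datadep_fori example above, run with plain Python numbers:
print(fori_loop(0, 3, lambda i, total: total + 3, 10))  # 19
```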
The tracing approach has clear benefits for JAX as well; because it only cares about the flow of values, it can handle arbitrarily complicated Python code, as long as the flow of values can be traced. Just like the local variables and data-independent loops shown earlier, but also things like closures. This makes meta-programming and templating easy.
Code
The full code for this post is available on GitHub.
[1] Once again, this is a very simplified example. A more realistic translator would have to support many, many more Python bytecode instructions.
[2] In fact, llvmlite itself is a Numba sub-project and is maintained by the Numba team, for which I'm grateful!
[3] For a fun exercise, try adding constant folding to the wrapped _op: when both its arguments are constants (not boxes), instead of placing each in a _Box(ConstantExpr(...)), it could perform the mathematical operation on them and return a single constant box. This is a common optimization in compilers!
[4] In all the JIT approaches shown in this post, the expectation is that compilation happens once, but the compiled function can be executed many times (perhaps in a loop). This means that the compilation step cannot depend on the runtime values of the function's arguments, because it has no access to them. You could say that it does, but that's just for the very first time the function is run (in the tracing approach); it has no way of knowing their values the next times the function will run. JAX has some provisions for cases where a function is invoked with a small set of runtime values and we want to separately JIT each of them.
03 Feb 2025 10:22pm GMT
PyBites: The Mutable Trap: Avoiding Unintended Side Effects in Python
Ever had a Python function behave strangely, remembering values between calls when it shouldn't? You're not alone! This is one of Python's sneakiest pitfalls: mutable default parameters.
Recently someone asked for help in our Pybites Circle Community with a Bite exercise that seemed to be behaving unexpectedly.
It turned out that this was a result of modifying a mutable parameter passed to a function.
For folks new to programming it is not obvious why modifying a variable inside a function might cause a change outside of that function. Let's have a closer look at the underlying issue.
What is a Python Variable?
When considering variables in Python it is a good idea to differentiate between a variable's name and the object that it represents.
Think of a variable like a name tag on an object. An object can have more than one name tag, and modifying the object affects everyone holding a reference to it.
# The inspiration variable is pointing to the Singleton None
>>> inspiration = None
>>> id(inspiration)
140713396607856
# The inspiration variable is now pointing to a string
>>> inspiration = "Read Atomic Habits"
>>> id(inspiration)
3034242491760
# The inspiration variable is now pointing to a different string,
# and as strings are immutable, the id has changed.
# It's a different object.
>>> inspiration = "Bob and Julian"
>>> id(inspiration)
3034242497712
>>> nums = list(range(10))
# The nums variable is pointing to a list
>>> nums
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> id(nums)
3034242497984
>>> nums.append(10)
# The list that nums is pointing to has been modified,
# but the id is the same because lists are mutable.
>>> nums
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> id(nums)
3034242497984
>>> me = "Tarzan"
>>> me_too = me
>>> id(me)
2546636178128
>>> id(me_too)
2546636178128
# Variables me and me_too are both pointing to the same string.
>>> me is me_too
True
Python passes parameters by Reference
Python passes parameters to functions by reference (also referred to as call by sharing). This results in multiple names being bound to the same object.
Consider this simple case where a global variable is passed into a function:
# Our global variable
FAB_FOUR = ["John", "Paul", "George", "Ringo"]

# Our positional (mutable) parameter
def meet_the_beatles(members) -> None:
    # Sorting our local variable
    members.sort()
    print(f" ... {members}")

def main():
    print(f"Before: {FAB_FOUR}")
    meet_the_beatles(FAB_FOUR)
    print(f" After: {FAB_FOUR}")

if __name__ == "__main__":
    main()
Running the above code results in the following output:
Before: ['John', 'Paul', 'George', 'Ringo']
... ['George', 'John', 'Paul', 'Ringo']
After: ['George', 'John', 'Paul', 'Ringo']
Which shows that our global variable FAB_FOUR has indeed been modified. This is because our function variable members is really just an alias for the global variable FAB_FOUR - they both point to the same object.
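You can verify the aliasing directly with id() (a quick standalone check, not part of the original post):

```python
FAB_FOUR = ["John", "Paul", "George", "Ringo"]

def show_alias(members):
    # members is just another name tag on the list FAB_FOUR points to.
    return id(members)

print(show_alias(FAB_FOUR) == id(FAB_FOUR))  # True
```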
The excellent site Python Tutor can be used to provide a nice visualisation:
Take Care when programming with Mutable Parameters
Functions that mutate their input values or modify state in other parts of the program behind the scenes are said to have side effects, and as a general rule this is best avoided. However, it is not uncommon to encounter such behaviour in real-world applications, and it is something we need to be aware of.
At the very least, you should consider carefully whether the caller expects the argument to be changed.
If you want to protect your code from such side effects, consider using immutable types where possible/practical.
If it is not clear to you whether it is safe to modify a passed mutable parameter, create a copy of the parameter and modify that instead. Comprehensions provide a nice Pythonic way to create new objects, as does the copy module with its copy and deepcopy functions.
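For example, a defensive version of meet_the_beatles might copy its input first (a sketch, not code from the original post):

```python
import copy

FAB_FOUR = ["John", "Paul", "George", "Ringo"]

def meet_the_beatles(members):
    # Work on a copy so the caller's list is left untouched.
    members = members.copy()  # or list(members), or members[:]
    members.sort()
    return members

print(meet_the_beatles(FAB_FOUR))  # ['George', 'John', 'Paul', 'Ringo']
print(FAB_FOUR)                    # unchanged

# A shallow copy still shares nested objects; copy.deepcopy()
# duplicates those too.
nested = [[1, 2], [3, 4]]
deep = copy.deepcopy(nested)
deep[0].append(99)
print(nested[0])  # still [1, 2]
```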
Mutable Types as Parameter Defaults
Python allows us to provide default values for function parameters - making them optional.
For example, when we call members.sort() in our code above, the sort method has an optional keyword argument reverse which defaults to False. We can pass it with the value True to override the default behaviour:
>>> FAB_FOUR = ["John", "Paul", "George", "Ringo"]
>>> FAB_FOUR.sort()
>>> FAB_FOUR
['George', 'John', 'Paul', 'Ringo']
>>> FAB_FOUR.sort(reverse=True)
>>> FAB_FOUR
['Ringo', 'Paul', 'John', 'George']
The default values are evaluated once, at the point of function definition in the defining scope.
Using a mutable type as a parameter default should be avoided if possible because it can lead to unexpected and inconsistent behaviour.
Consider the following code:
def enroll_student(name, students=[]):
    students.append(name)
    return students

def main():
    print(enroll_student("Biffa"))
    print(enroll_student("Moose"))
    print(enroll_student("Cheeseman"))

if __name__ == "__main__":
    main()
Running this code produces the following output:
['Biffa']
['Biffa', 'Moose']
['Biffa', 'Moose', 'Cheeseman']
Our function seems to be retaining information from previous calls.
This is because default values are stored in the function attribute __defaults__ and, if mutable, can be changed by the function code.
Let's modify our code to show this:
def enroll_student(name, students=[]):
    # id of function local variable students
    print(f"ID of students: {id(students)}")
    students.append(name)
    return students

def main():
    enroll_student("Biffa")

    # Function enroll_student parameter default values
    print(f"Function Default Values: {enroll_student.__defaults__}")

    # id of students default value
    print(f"ID of students default value: {id(enroll_student.__defaults__[0])}")

if __name__ == "__main__":
    main()
Running this code produces the following output:
ID of students: 2667038141888
Function Default Values: (['Biffa'],)
ID of students default value: 2667038141888
This shows that our local students variable and its default value both point to the same object. Modifying the local variable will cause the default value to be updated on the function object!
To prevent this behaviour, we can specify None as the default value and handle that case inside our code. None is immutable. Here is our new version of the function:
def enroll_student(name, students=None):
    if students is None:
        students = []
    students.append(name)
    return students
Looking at the output, we can see that the immutable None is stored as the function default for students and is no longer coupled to our local students variable:
ID of students: 1545913462208
Function Default Values: (None,)
ID of students default value: 140713967338128
And our code works consistently:
def main():
    print(enroll_student("Biffa"))
    print(enroll_student("Moose"))
    print(enroll_student("Cheeseman"))
Which produces:
['Biffa']
['Moose']
['Cheeseman']
Key Takeaways
- Python passes objects by reference, meaning multiple variables can point to the same object.
- Mutable parameters can cause side effects by modifying variables in enclosing scopes.
- Mutable default parameters persist across function calls, which can cause unexpected behaviour.
- To avoid issues, use None as a default and create a new object inside the function.
03 Feb 2025 6:26pm GMT
Real Python: Python for Loops: The Pythonic Way
Python's for loop allows you to iterate over the items in a collection, such as lists, tuples, strings, and dictionaries. The for loop syntax declares a loop variable that takes each item from the collection in each iteration. This loop is ideal for repeatedly executing a block of code on each item in the collection. You can also tweak for loops further with features like break, continue, and else.
By the end of this tutorial, you'll understand that:
- Python's for loop iterates over items in a data collection, allowing you to execute code for each item.
- To iterate from 0 to 10, you use the for index in range(11): construct.
- To repeat code a number of times without processing the data of an iterable, use the for _ in range(times): construct.
- To do index-based iteration, you can use for index, value in enumerate(iterable): to access both index and item.
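The patterns from the list above, side by side:

```python
# Iterate over a range of indices:
for index in range(3):
    print(index)  # 0, 1, 2

# Repeat without using the data (the underscore signals "unused"):
for _ in range(2):
    print("hello")

# Index-based iteration with enumerate():
for index, value in enumerate(["a", "b"]):
    print(index, value)
```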
In this tutorial, you'll gain practical knowledge of using for loops to traverse various collections and learn Pythonic looping techniques. Additionally, you'll learn how to handle exceptions and how to use asynchronous iterations to make your Python code more robust and efficient.
Get Your Code: Click here to download the free sample code that shows you how to use for loops in Python.
Take the Quiz: Test your knowledge with our interactive "The Python for Loop" quiz. You'll receive a score upon completion to help you track your learning progress:
Interactive Quiz
The Python for LoopIn this quiz, you'll test your understanding of Python's for loop and the concepts of definite iteration, iterables, and iterators. With this knowledge, you'll be able to perform repetitive tasks in Python more efficiently.
Getting Started With the Python for Loop
In programming, loops are control flow statements that allow you to repeat a given set of operations a number of times. In practice, you'll find two main types of loops:
- for loops are mostly used to iterate a known number of times, which is common when you're processing data collections with a specific number of data items.
- while loops are commonly used to iterate an unknown number of times, which is useful when the number of iterations depends on a given condition.
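A quick contrast of the two loop types (a small illustration, not from the original tutorial):

```python
# A for loop runs once per item of a known collection:
for n in [3, 2, 1]:
    print(n)

# A while loop runs until its condition becomes false:
n = 3
while n > 0:
    print(n)
    n -= 1
```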
Python has both of these loops, and in this tutorial, you'll learn about for loops. In Python, you'll generally use for loops when you need to iterate over the items in a data collection. This type of loop lets you traverse different data collections and run a specific group of statements on or with each item in the input collection.
In Python, for loops are compound statements with a header and a code block that runs a predefined number of times. The basic syntax of a for loop is shown below:
for variable in iterable:
    <body>
In this syntax, variable is the loop variable. In each iteration, this variable takes the value of the current item in iterable, which represents the data collection you need to iterate over. The loop body can consist of one or more statements that must be indented properly.
Here's a more detailed breakdown of this syntax:
- for is the keyword that initiates the loop header.
- variable is a variable that holds the current item in the input iterable.
- in is a keyword that connects the loop variable with the iterable.
- iterable is a data collection that can be iterated over.
- <body> consists of one or more statements to execute in each iteration.
Here's a quick example of how you can use a for loop to iterate over a list:
>>> colors = ["red", "green", "blue", "yellow"]
>>> for color in colors:
... print(color)
...
red
green
blue
yellow
In this example, color is the loop variable, while the colors list is the target collection. Each time through the loop, color takes on a successive item from colors. In this loop, the body consists of a call to print() that displays the value on the screen. This loop runs once for each item in the target iterable. The way the code above is written is the Pythonic way to write it.
However, what's an iterable anyway? In Python, an iterable is an object, often a data collection, that can be iterated over. Common examples of iterables in Python include lists, tuples, strings, dictionaries, and sets, which are all built-in data types. You can also have custom classes that support iteration.
Note: Python has both iterables and iterators. Iterables support the iterable protocol, which consists of the .__iter__() special method. Similarly, iterators support the iterator protocol that's based on the .__iter__() and .__next__() special methods.
Both iterables and iterators can be iterated over. All iterators are iterables, but not all iterables are iterators. Python iterators play a fundamental role in for loops because they drive the iteration process.
A deeper discussion on iterables and iterators is beyond the scope of this tutorial. However, to learn more about them, check out the Iterators and Iterables in Python: Run Efficient Iterations tutorial.
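Still, a two-line look at the protocol a for loop relies on under the hood:

```python
colors = ["red", "green", "blue"]

iterator = iter(colors)   # the for loop calls .__iter__() like this
print(next(iterator))     # red -- then .__next__() on each iteration
print(next(iterator))     # green
# The loop ends when next() raises StopIteration.
```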
You can also have a loop with multiple loop variables:
>>> points = [(1, 4), (3, 6), (7, 3)]
>>> for x, y in points:
... print(f"{x = } and {y = }")
...
x = 1 and y = 4
x = 3 and y = 6
x = 7 and y = 3
In this loop, you have two loop variables, x and y. Note that to use this syntax, you just need to provide a tuple of loop variables. Also, you can have as many loop variables as you need as long as you have the correct number of items to unpack into them. You'll also find this pattern useful when iterating over dictionary items or when you need to do parallel iteration.
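For instance, the same tuple-unpacking syntax covers dictionary items and parallel iteration with zip():

```python
inventory = {"apples": 3, "pears": 5}
for fruit, count in inventory.items():
    print(fruit, count)

names = ["Alice", "Bob"]
ages = [30, 25]
for name, age in zip(names, ages):
    print(name, age)
```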
Sometimes, the input iterable may be empty. In that case, the loop will run its header once but won't execute its body:
>>> for item in []:
... print(item)
...
Read the full article at https://realpython.com/python-for-loop/ »
[ Improve Your Python With 🐍 Python Tricks 💌 - Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
03 Feb 2025 2:00pm GMT
Zato Blog: LDAP and Active Directory as Python API Services
LDAP and Active Directory as Python API Services
LDAP and Active Directory often play a key role in the management of a company's network resources, yet it is not always convenient to query a directory directly using the LDAP syntax and protocol that few people truly specialize in. This is why in this article we are using Zato to offer a REST API on top of directory services, so that API clients can use REST and JSON instead.
Installing Zato
Start off by installing Zato - if you are not sure what to choose, pick the Docker Quickstart option and this will set up a working environment in a few minutes.
Creating connections
Once Zato is running, connections can be easily created in its Dashboard (by default, http://127.0.0.1:8183). Navigate to Connections -> Outgoing -> LDAP ..
.. and then click Create a new connection which will open a form as below:
The same form works for both regular LDAP and Active Directory - in the latter case, make sure that Auth type is set to NTLM.
The most important information is:
- User credentials
- Authentication type
- Server or servers to connect to
Note that if authentication type is not NTLM, user credentials can be provided using the LDAP syntax, e.g. uid=MyUser,ou=users,o=MyOrganization,dc=example,dc=com.
Right after creating a connection, be sure to set its password too - the password assigned by default is a randomly generated one.
Pinging
It is always prudent to ping a newly created connection to ensure that all the information entered was correct.
Note that if you have more than one server in a pool then the first available one of them will be pinged - it is the whole pool that is pinged, not a particular part of it.
Active Directory as a REST service
As the first usage example, let's create a service that will translate JSON queries into LDAP lookups - given a username or email, the service will return basic information about the person's account, such as first and last name.
Note that the conn object returned by client.get() below is capable of running any commands that its underlying Python library offers - in this case we are only using searches but any other operation can also be used, e.g. add or modify as well.
# -*- coding: utf-8 -*-

# stdlib
from json import loads

# Bunch
from bunch import bunchify

# Zato
from zato.server.service import Service

# Where in the directory we expect to find the user
search_base = 'cn=users, dc=example, dc=com'

# On input, we are looking users up by either username or email
search_filter = '(&(|(uid={user_info})(mail={user_info})))'

# On output, we are interested in username, first name, last name and the person's email
query_attributes = ['uid', 'givenName', 'sn', 'mail']

class ADService(Service):
    """ Looks up users in AD by their username or email.
    """
    class SimpleIO:
        input_required = 'user_info'
        output_optional = 'message', 'username', 'first_name', 'last_name', 'email'
        response_elem = None
        skip_empty_keys = True

    def handle(self):

        # Connection name to use
        conn_name = 'My AD Connection'

        # Get a handle to the connection pool
        with self.out.ldap[conn_name].conn.client() as client:

            # Get a handle to a particular connection
            with client.get() as conn:

                # Build a filter to find a user by
                user_info = self.request.input['user_info']
                user_filter = search_filter.format(user_info=user_info)

                # Returns True if the query succeeds and has any information on output
                if conn.search(search_base, user_filter, attributes=query_attributes):

                    # This is where the actual response can be found
                    response = conn.entries

                    # In this case, we expect at most one user matching input criteria
                    entry = response[0]

                    # Convert it to JSON for easier handling ..
                    entry = entry.entry_to_json()

                    # .. and load it from JSON to a Python dict
                    entry = loads(entry)

                    # Convert to a Bunch instance to get dot access to dictionary keys
                    entry = bunchify(entry['attributes'])

                    # Now, actually produce a JSON response. For simplicity's sake,
                    # assume that users have only one of email or other attributes.
                    self.response.payload.message = 'User found'
                    self.response.payload.username = entry.uid[0]
                    self.response.payload.first_name = entry.givenName[0]
                    self.response.payload.last_name = entry.sn[0]
                    self.response.payload.email = entry.mail[0]

                else:
                    # No business response = no such user found
                    self.response.payload.message = 'No such user'
After creating a REST channel, we can invoke the service from command line, thus confirming that we can offer the directory as a REST service:
$ curl "localhost:11223/api/get-user?user_info=MyOrganization\\MyUser" ; echo
{
  "message": "User found",
  "username": "MyOrganization\\MyUser",
  "first_name": "First",
  "last_name": "Last",
  "email": "address@example.com"
}
$
More resources
➤ Python API integration tutorial
➤ What is a Network Packet Broker? How to automate networks in Python?
➤ What is an integration platform?
➤ Python Integration platform as a Service (iPaaS)
➤ What is an Enterprise Service Bus (ESB)? What is SOA?
➤ Open-source iPaaS in Python
03 Feb 2025 8:00am GMT
Seth Michael Larson: Connection without Connectivity (#1: Space)
This is the first article in a 7-part series about software for connection.
Feeling connected to others is a basic human need, so it is no surprise we want software to enable human connection. The surprise is that despite computing device ownership and internet usage being at an all-time high, feelings of loneliness have also never been higher.
The shape of today's "software for connection" uses centralized servers in the cloud, algorithmic curation, and incentives optimized for users to connect with the platform (aka "engagement" and "parasociality"), not necessarily with other humans. Software for connection has followed in the same rut created by big tech, as can be seen in protocols, browsers, and infrastructure.
The software we have today feels like a tiny fraction of what should be possible in a world full of people with personal computing devices in their pockets. There's no problem with a well-traveled road (it's the road that led you here, after all ❤️), but maybe you'll join me for a walk down a less-traveled path to explore how software can connect people outside the common paradigms and what assumptions and restrictions we can drop to enable different methods of connection.
Dōbutsu no Mori+ cover art (archive.org)
This 7-part series explores the feeling of "connection without connectivity" through the design of an offline video game: Animal Crossing
Animal Crossing, known in Japan as "Dōbutsu no Mori+" (どうぶつの森+), was released on the GameCube in December 2001. According to director Katsuya Eguchi (江口 勝也), Animal Crossing features three themes:
"family, friendship, and community"
Animal Crossing is able to fulfill these themes without internet-connectivity, LAN-play, or even concurrent local multiplayer.
Sharing spaces, not concurrently
This first article is about space, a place where people can convene and feel togetherness. Spaces can be large or small, public or private, or somewhere in between.
Sharing a space with others, not necessarily at the same time, can evoke feelings of connection by experiencing changes to the space that others have made. By making your own changes to a shared space, you are connecting to others in the future.
Animal Crossing didn't support any kind of concurrent multiplayer, but that didn't stop the game from feeling like a multiplayer experience. In the game, players would collect bugs and fish, decorate their house, and plant trees and flowers. The many animal villagers that lived in the town would "remember" past conversations with other players, making the world feel more alive.
Because each player shared the same space with other players, everyone can see the changes made to the town over time. Katsuya Eguchi remarked on Animal Crossing being a shared space for his family to connect across time:
"[My family is] playing games, and I'm playing games, but we're not really doing it together. It'd be nice to have a play experience where even though we're not playing at the same time, we're still sharing things together. ...[I wanted] to create a space where my family and I could interact more, even if we weren't playing together."
What does this mean for modern software? From this we learn that concurrency is not needed to feel connected. Today's internet-connected software typically requires a persistent connection, often because data doesn't exist on the same hardware that's running the software.
Requiring a persistent connection presents accessibility problems: internet access isn't distributed evenly throughout the world. Even in places where internet connectivity is common, a persistent connection isn't always possible (airplanes, subways, tunnels, inside a building, infrastructure outage).
Removing the need for real-time synchronization and data access means that internet-connectivity becomes optional. Users can then engage and create with the software at any time, regardless of connectivity. This empowers the user to weave the software into their own life and schedule at an engagement level that works for them.
Local-First Software is a software design paradigm that brings the benefits of colocated data and user interface to software. More details about the benefits of Local-First Software are outlined by Ink and Switch.
When adopting this paradigm, the onus is on the software to create a connected experience with the data that the software has on hand. Compare this to software demanding users be online and using the software as often as possible to feel connected to a space.
Local-First Software logo. The SVG is impressively compact.
Future articles will discuss more implications for software for connection where connectivity is optional. If you've enjoyed this article it's likely you'll enjoy the others, I hope you'll follow along for more. Thanks for reading!
03 Feb 2025 12:00am GMT
Quansight Labs Blog: From napari to the world: how we generalized the `conda/constructor` stack for distributing Python applications
Our work for the napari project resulted in multiple beneficial side effects for the conda packaging ecosystem.
03 Feb 2025 12:00am GMT
02 Feb 2025
Planet Python
Real Python: Develop Data Visualization Interfaces in Python With Dash
Dash is a popular Python framework for creating interactive data visualization interfaces. With Dash, you build web applications using only Python, without needing advanced web development skills. It integrates seamlessly with technologies like Flask, React.js, and Plotly.js to render user interfaces and generate charts.
By the end of this tutorial, you'll understand that:
- Dash is an open-source framework for building data visualization interfaces using Python.
- Good use cases for Dash include interactive dashboards for data analysis and visualization tasks.
- You can customize the style of a Dash app using CSS, either inline or with external files.
- You can deploy Dash applications on PythonAnywhere, a platform offering free hosting for Python web apps.
Dash gives data scientists the ability to showcase their results in interactive web applications. You don't need to be an expert in web development. In this tutorial, you'll explore how to create, style, and deploy a Dash application, transforming a basic dashboard into a fully interactive tool.
You can download the source code, data, and resources for the sample application that you'll make in this tutorial by clicking the link below:
Get the Source Code: Click here to get the source code you'll use to learn about creating data visualization interfaces in Python with Dash in this tutorial.
What Is Dash?
Dash is an open-source framework for building data visualization interfaces. Released in 2017 as a Python library, it's grown to include implementations for R, Julia, and F#. Dash helps data scientists build analytical web applications without requiring advanced web development knowledge.
Three technologies constitute the core of Dash:
- Flask supplies the web server functionality.
- React.js renders the user interface of the web page.
- Plotly.js generates the charts used in your application.
But you don't have to worry about making all these technologies work together. Dash will do that for you. You just need to write Python, R, Julia, or F# and sprinkle in a bit of CSS.
Plotly, a Canada-based company, built Dash and supports its development. You may know the company from the popular graphing libraries that share its name. The company released Dash as open source under an MIT license, so you can use Dash at no cost.
Plotly also offers a commercial companion to Dash called Dash Enterprise. This paid service provides companies with support services such as hosting, deploying, and handling authentication on Dash applications. But these features live outside of Dash's open-source ecosystem.
Dash will help you build dashboards quickly. If you're used to analyzing data or building data visualizations using Python, then Dash will be a useful addition to your toolbox. Here are a few examples of what you can make with Dash:
- A dashboard showing object detection for self-driving cars
- A visualization of millions of Uber rides
- An interactive tool for analyzing soccer match data
This is just a tiny sample. If you'd like to see other interesting use cases, then go check out the Dash App Gallery.
Note: You don't need advanced knowledge of web development to follow this tutorial, but some familiarity with HTML and CSS won't hurt.
You should know the basics of the following topics, though:
- Python graphing libraries such as Plotly, Bokeh, and Matplotlib
- HTML and the structure of an HTML file
- CSS and style sheets
If you feel comfortable with the requirements and want to learn how to use Dash in your next project, then continue to the following section!
Get Started With Dash in Python
In this tutorial, you'll go through the end-to-end process of building a dashboard using Dash. If you follow along with the examples, then you'll go from a bare-bones dashboard on your local machine to a styled dashboard deployed on PythonAnywhere.
To build the dashboard, you'll use a dataset of sales and prices of avocados in the United States between 2015 and 2018. Justin Kiggins compiled this dataset using data from the Hass Avocado Board.
How to Set Up Your Local Environment
To develop your app, you'll need a new directory to store your code and data. You'll also need a clean Python virtual environment. To create those, execute the commands below, choosing the version that matches your operating system:
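On macOS or Linux, the setup could look like the following (the directory name `avocado_analytics` is just an example; on Windows, you'd activate with `venv\Scripts\activate` instead):

```shell
# Create a project directory and move into it.
mkdir avocado_analytics && cd avocado_analytics

# Create and activate a clean virtual environment.
python3 -m venv venv
source venv/bin/activate
```

With the environment active, anything you install with pip stays isolated to this project.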
Read the full article at https://realpython.com/python-dash/ »
[ Improve Your Python With 🐍 Python Tricks 💌 - Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
02 Feb 2025 2:00pm GMT
Real Python: Build a Dice-Rolling Application With Python
In this tutorial, you'll learn how to create a Python dice roll simulator. The tutorial guides you through building a text-based user interface (TUI) application that simulates rolling dice using Python's random module. You'll learn to gather and validate user input, use random.randint() for dice rolling, and display results with ASCII art.
By the end of this tutorial, you'll understand that:
- To simulate dice-rolling events, you can use random.randint().
- To get the user's input, you use the built-in input() function.
- To display dice in Python, you generate ASCII art representations of dice faces and use print().
- To manipulate strings, you use methods such as .center() and .join().
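As a rough illustration of the string methods in play (not the tutorial's exact code), here's how .center() and .join() can assemble one ASCII dice face; the width and helper name are assumptions:

```python
# Sketch: build an ASCII face for a roll of 3 using str.center() and str.join().
DIE_WIDTH = 11  # total width of the face, including the side borders

def face_for_three() -> str:
    # Hypothetical helper: three pips arranged on a diagonal.
    rows = ["*    ", "  *  ", "    *"]
    # Center each row inside the borders, then stack everything with join().
    centered = [f"|{row.center(DIE_WIDTH - 2)}|" for row in rows]
    border = "-" * DIE_WIDTH
    return "\n".join([border, *centered, border])

print(face_for_three())
```

Every line comes out the same width, so several faces can later be printed side by side.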
Building small projects, like a text-based user interface (TUI) dice-rolling application, will help you level up your Python programming skills. You'll learn how to gather and validate the user's input, import code from modules and packages, write functions, use for loops and conditionals, and neatly display output by using strings and the print() function.
Click the link below to download the entire code for this dice-rolling application and follow along while you build the project yourself:
Get Source Code: Click here to get the source code you'll use to build your Python dice-rolling app.
Demo
In this step-by-step project, you'll build an application that runs dice-rolling simulations. The app will be able to roll up to six dice, with each die having six faces. After every roll, the application will generate an ASCII diagram of dice faces and display it on the screen. The following video demonstrates how the app works:
When you run your dice-rolling simulator app, you get a prompt asking for the number of dice you want to roll. Once you provide a valid integer from 1 to 6, inclusive, then the application simulates the rolling event and displays a diagram of dice faces on the screen.
Project Overview
Your dice-rolling simulator app will have a minimal yet user-friendly text-based user interface (TUI), which will allow you to specify the number of six-sided dice that you'd like to roll. You'll use this TUI to roll the dice at home without having to fly to Las Vegas.
Here's a description of how the app will work internally:
| Tasks to Run | Tools to Use | Code to Write |
|---|---|---|
| Prompt the user to choose how many six-sided dice to roll, then read the user's input | Python's built-in input() function | A call to input() with appropriate arguments |
| Parse and validate the user's input | String methods, comparison operators, and conditional statements | A user-defined function called parse_input() |
| Run the dice-rolling simulation | Python's random module, specifically the randint() function | A user-defined function called roll_dice() |
| Generate an ASCII diagram with the resulting dice faces | Loops, list.append(), and str.join() | A user-defined function called generate_dice_faces_diagram() |
| Display the diagram of dice faces on the screen | Python's built-in print() function | A call to print() with appropriate arguments |
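For instance, roll_dice() could be as small as the following sketch. The function name and its role come from the table above; the exact signature and docstring are assumptions:

```python
# Sketch of roll_dice(): simulate rolling `num_dice` six-sided dice.
import random

def roll_dice(num_dice: int) -> list[int]:
    """Return a list of length `num_dice` with face values from 1 to 6."""
    return [random.randint(1, 6) for _ in range(num_dice)]

print(roll_dice(3))  # three random face values, e.g. [4, 1, 6]
```

Returning a plain list keeps the simulation decoupled from the display code, which can consume the face values later.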
Keeping these internal workings in mind, you'll code three custom functions to provide the app's main features and functionalities. These functions will define your code's public API, which you'll call to bring the app to life.
To organize the code of your dice-rolling simulator project, you'll create a single file called dice.py in a directory of your choice in your file system. Go ahead and create the file to get started!
Prerequisites
You should be comfortable with the following concepts and skills before you start building this dice-rolling simulation project:
- Ways to run scripts in Python
- Python's import mechanism
- The basics of Python data types, mainly strings and integer numbers
- Basic data structures, especially lists
- Python variables and constants
- Python comparison operators
- Boolean values and logical expressions
- Conditional statements
- Python for loops
- The basics of input, output, and string formatting in Python
If you don't have all of the prerequisite knowledge before starting this coding adventure, then that's okay! You might learn more by going ahead and getting started! You can always stop and review the resources linked here if you get stuck.
Step 1: Code the TUI of Your Python Dice-Rolling App
In this step, you'll write the required code to ask for the user's input of how many dice they want to roll in the simulation. You'll also code a Python function that takes the user's input, validates it, and returns it as an integer number if the validation was successful. Otherwise, the function will ask for the user's input again.
To download the code for this step, click the following link and navigate to the source_code_step_1/ folder:
Get Source Code: Click here to get the source code you'll use to build your Python dice-rolling app.
Take the User's Input at the Command Line
Read the full article at https://realpython.com/python-dice-roll/ »
02 Feb 2025 2:00pm GMT