20 May 2025
Planet Python
PyCoder’s Weekly: Issue #682: pylock.toml, NumPy, t-strings, and More (May 20, 2025)
#682 - MAY 20, 2025
Going Beyond requirements.txt With pylock.toml
What is the best way to record the Python dependencies for the reproducibility of your projects? What advantages will lock files provide for those projects? This week on the show, we welcome back Python Core Developer Brett Cannon to discuss his journey to bring PEP 751 and the pylock.toml file format to the community.
REAL PYTHON podcast
I Don't Like NumPy
This opinion piece talks about why NumPy gets difficult fast. Two dimensions to your array? No problem, the calculation is mostly self-evident. Add a couple more dimensions and it gets messy fast. See also the associated HN discussion, which includes possible solutions.
DYNOMIGHT
Webinar: Inside Dust's AI-Powered Vision for the Future of Work
Join us on June 5 for a behind-the-scenes look at how Dust is building the future of work with AI and why Temporal is at the heart of their platform. We'll explore how Dust is transforming those everyday tasks with intelligent agents that are deeply contextual, resilient, and production-ready →
TEMPORAL sponsor
Unravelling t-strings
PEP 750 introduced t-strings for Python 3.14. These are a template string mechanism similar to f-strings. Although they are in 3.14.0b1, there isn't any documentation yet, so this post explains what they are and how they can be used.
BRETT CANNON
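As a rough illustration of the mechanism (a sketch based on PEP 750 and the 3.14.0b1 preview, so details may still shift): a t-string literal evaluates to a Template object rather than a str, and its static text and interpolations can be processed separately.

from string.templatelib import Interpolation

name = "world"
template = t"Hello {name}!"  # a Template object, not a str

for part in template:  # yields static strings and Interpolation objects
    if isinstance(part, Interpolation):
        print("interpolation:", part.expression, "=", part.value)
    else:
        print("static text:", repr(part))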
Python Jobs
Senior Software Engineer - Quant Investment Platform (LA or Dallas) (Los Angeles, CA, USA)
Causeway Capital Management LLC
Articles & Tutorials
Working With Missing Data in Polars
In this video course, you'll learn how to deal with missing data in Polars to ensure it doesn't interfere with your data analysis. You'll discover how to check for missing values, update them, and remove them.
REAL PYTHON course
Django Security Best Practices
This guide contains a long list of best practices for securing your Django site. It includes topics such as ensuring HTTPS everywhere, hardening your database, using key built-in security features, and more.
AHMAD SADEDDIN
Read My Lips: Automate Lip Syncing with Wav2Lip
Don't let your message be misunderstood! Use Wav2Lip for automated and AI-powered lip syncing. Get the code now.
INTEL CORPORATION sponsor
t-strings in Python (PEP 750)
Python 3.14 will include the new t-string concept, an f-string-like mechanism for string templating. Talk Python interviews Paul Everitt, David Peck, and Jim Baker about the upcoming feature.
KENNEDY ET AL podcast
The Database Row That Did and Didn't Exist
Misty ran into a confounding bug where loading and saving certain Django ORM objects failed, but not others. Read on to understand what went wrong.
MISTY DE MÉO
The Appearing Built-In
Rodrigo discovered that examining the builtins module in the REPL changes the size of the builtins module. Read on to find out what causes this oddity.
RODRIGO GIRÃO SERRÃO
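A plausible way to reproduce the effect (this snippet is illustrative, not taken from the post): the REPL's display hook stores the last evaluated result in builtins._, so merely looking at the module grows it.

>>> import builtins
>>> len(dir(builtins))   # some count, e.g. 159 (exact counts vary by version)
159
>>> len(dir(builtins))   # one larger: the REPL bound the previous result to _
160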
What Does @Slot() Do in Qt?
When working with Qt slots and signals in PySide6 you might discover the @Slot decorator. This post tells you what it does and why you might use it.
MARTIN FITZPATRICK
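A minimal sketch of the decorator in use (hypothetical names, not taken from the post):

from PySide6.QtCore import QObject, Signal, Slot

class Counter(QObject):
    value_changed = Signal(int)

    @Slot(int)  # registers this method as an int-taking slot with Qt's meta-object system
    def on_value_changed(self, value: int) -> None:
        print(f"value is now {value}")

Connecting a signal, e.g. counter.value_changed.connect(counter.on_value_changed), then dispatches through the registered slot rather than a generic Python callable.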
5 Steps to N-Body Simulation
This article shows you how to simulate gravity in an N-body problem. It starts with a simple simulation and then adds optimizations as it goes along.
ALVIN NG
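To make the setup concrete, here is a rough sketch of a single naive O(n^2) gravity step with NumPy (illustrative only, not the article's code; G, eps, and dt are arbitrary values):

import numpy as np

def step(pos, vel, mass, dt=0.01, G=1.0, eps=1e-3):
    # diff[i, j] = pos[j] - pos[i], the vector from body i to body j.
    diff = pos[np.newaxis, :, :] - pos[:, np.newaxis, :]
    # Softened |r|^3; eps avoids division by zero on the diagonal (i == j).
    dist3 = (np.sum(diff**2, axis=-1) + eps**2) ** 1.5
    # a_i = sum_j G * m_j * r_ij / |r_ij|^3 (the i == j term contributes 0).
    acc = G * np.sum(mass[None, :, None] * diff / dist3[:, :, None], axis=1)
    return pos + vel * dt, vel + acc * dt

rng = np.random.default_rng(0)
pos, vel, mass = rng.random((5, 3)), np.zeros((5, 3)), np.ones(5)
pos, vel = step(pos, vel, mass)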
PEPs & Co.
A little bit of Python history: where the term "PEP" came from, along with a list of similar types of documents for other languages.
HUGO VAN KEMENADE
The First Year of Free-Threaded Python
A recap of the first year of work on enabling support for the free-threaded build of CPython in community packages.
NATHAN GOLDBAUM
Pandas: Drop Columns by Name in DataFrames
A quick overview of the variety of ways to drop columns from a pandas DataFrame, referencing the columns by name.
ERIK MARSJA
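The core patterns look like this (throwaway data, not the article's examples):

import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})

df2 = df.drop(columns=["b", "c"])     # by name; returns a new DataFrame
df3 = df.drop(["b", "c"], axis=1)     # equivalent axis-based spelling
df.drop(columns=["b"], inplace=True)  # or mutate in place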
Working on Complex Systems: What I Learned at Google
Explore the key characteristics of complex systems and learn practical patterns for tackling complex problems.
TEIVA HARSANYI
Building a Real-Time Dashboard With Flask and Svelte
Learn how to build a real-time analytics dashboard using Flask and Svelte with server-sent events.
AMIR TADRISI
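The server side of such a dashboard can be surprisingly small. Here is a minimal sketch of a server-sent-events endpoint (assumed names, not the article's code), which a Svelte client would consume with new EventSource("/events"):

import json
import random
import time

from flask import Flask, Response

app = Flask(__name__)

@app.route("/events")
def events():
    def stream():
        while True:
            payload = json.dumps({"value": random.random()})
            yield f"data: {payload}\n\n"  # the SSE wire format
            time.sleep(1)
    return Response(stream(), mimetype="text/event-stream")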
Projects & Code
Events
Python Leiden User Group
May 22, 2025
PYTHONLEIDEN.NL
Python Sheffield
May 27, 2025
GOOGLE.COM
PyCon Italia 2025
May 28 to June 1, 2025
PYCON.IT
PyKla Monthly Meetup
May 28, 2025
MEETUP.COM
Happy Pythoning!
This was PyCoder's Weekly Issue #682.
20 May 2025 7:30pm GMT
LXer Linux News
United States Federal Government's Digital Analytics Program (DAP): GNU/Linux Users Represent Close to 6% of Visitors This Year
The first occurrence of GNU/Linux was owing to my older colleague and it happened here in Manchester, not in Finland or in Portland (Oregon). That's just the real history!
20 May 2025 6:34pm GMT
Red Hat Enterprise Linux 10 Released, This Is What’s New
RHEL 10 debuts with built-in AI guidance, post-quantum cryptography, and streamlined OS-container management for modern hybrid infrastructure.
20 May 2025 5:03pm GMT
DZone Java Zone
Tired of Spring Overhead? Try Dropwizard for Your Next Java Microservice
Instead of a monolith, build your first Java microservice with Dropwizard.
Hello, my fellow programmers! I'm positive you do not want to read another complex article on how to build Java microservices. We are going to take a look at Dropwizard today. It is fairly convenient, as it comes with everything bundled in (Jetty, Jersey, Jackson, etc.) and lets you write your business logic without the boilerplate.
20 May 2025 4:00pm GMT
LXer Linux News
Big News for Linux on Windows: Windows Subsystem for Linux (WSL) is Now Officially Open Source!
After years of work, Microsoft has made the WSL code publicly available. Yes, Windows Subsystem for Linux is now officially Open Source!
20 May 2025 3:31pm GMT
Planet Python
Ari Lamstein: Course Review: Build AI Agents with Python
I was slow to jump on the LLM bandwagon.
In some ways this is surprising. Although it was a while ago, I did take several AI courses in grad school. So I am probably in a good position to dive into new developments in the field.
But when the LLM wave started I was busy trying to develop software engineering expertise in Python. I wanted to finish that "project" before moving onto something new. It was only a few months ago, when the PR of my multiyear module was merged into the censusdis package (link), that I felt I could turn my attention elsewhere.
Coincidentally, around that time Kevin Markham created his first course on LLMs. I wrote a review of that course here. In this post I review his followup course: Build AI Agents with Python.
The Problem AI Agents Solve
Something I learned in this course is that LLMs like GPT-4o mini are called "foundational models". Foundational models have very real limitations. Consider this conversation I just had with GPT-4o mini:
User: What is your cutoff date?
AI: My knowledge cutoff date is October 2021. This means I do not have information on events or developments that occurred after that date. If you have questions based on information available up to then, feel free to ask!
User: Do you have access to the web?
AI: No, I do not have access to the web or the ability to browse the internet. My responses are generated based on a mixture of licensed data, data created by human trainers, and publicly available information up to my knowledge cutoff date in October 2021. If you have any questions within that scope, feel free to ask!
User: What is the latest version of Python?
AI: As of my last knowledge update in October 2021, the latest stable release of Python was Python 3.10, which was released on October 4, 2021. For any updates or newer versions released after that date, I recommend checking the official Python website at https://www.python.org/.
In short, the answers that foundational models give are limited to the data they were trained on. For reference, the latest version of Python is 3.13.3, which was released in April 2025.
Augmenting Foundational Models
You can think of AI Agents as augmenting foundational models with tools, data and prompts that you provide them. This allows them to provide better answers to certain questions.
As an example, Build AI Agents with Python walks you through the process of giving GPT-4o mini the "tool" of web search using Tavily. This resulted in the following conversation:
User: What is your cutoff date?
AI: My knowledge cutoff date is in October 2021. I do not have information on events or developments that occurred after that date. However, I can access real-time information through specific tools. If you have a particular query, feel free to ask!
User: Do you have access to the web?
AI: Yes, I can access real-time information through a specific tool to search the web for current events and updates. If you have a particular topic or question in mind, let me know, and I can look it up for you!
User: What is the latest version of python?
AI: The latest stable version of Python is 3.13.3. You can download it from the official Python website https://www.python.org/downloads/.
I had not heard of Tavily prior to taking this course. It turns out that there is an entire ecosystem of tools around LLMs!
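For a rough idea of what such a tool hookup looks like in code, here is a sketch using LangChain's Tavily integration (not the course's actual code; it assumes OPENAI_API_KEY and TAVILY_API_KEY are set in the environment):

from langchain_openai import ChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults(max_results=3)
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([search])

response = llm.invoke("What is the latest version of Python?")
# If the model decides it needs the web, response.tool_calls lists the
# searches to run; an agent loop executes them and feeds the results back.
print(response.tool_calls)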
In addition to adding in a tool for web search, the course guides you through integrating custom data sources using Retrieval Augmented Generation (RAG) and building a Q&A system that can interact with a SQL database. For a course that only contains two hours of videos, I felt that it covered a lot.
Like with all online courses, there is an argument that I could have learned all this on my own. But "get your hands dirty with LLMs" had been on my todo list for a while. It was only taking this course that made it happen. If you are looking for a similar push, then you might enjoy this course too.
Wish List
Kevin is gifted at coming up with good examples and clearly explaining his solutions. If there was one change I could make to the course it would be for him to add exercises to it.
Longtime readers might remember that I have created and taught several courses myself. My initial courses did not have exercises. It was only when I read Telling Ain't Training that I learned how valuable they are for students. That book also presents a framework for creating exercises that I still use today.
Next Steps
After taking an online course, I often want to apply what I learned to a project of my own. In this case, I can now imagine adding an LLM feature to my Covid Demographics Explorer.
The dataset that powers that app is in a single CSV file. I think it would be fun to create a chatbot that has access to that file and uses it to answer questions. That would allow users to ask questions about the dataset using natural language and get answers in natural language.
20 May 2025 3:21pm GMT
Planet Lisp
Joe Marshall: Management = Bullshit
The more I have to deal with management, the more I have to deal with bullshit. The higher up in the management chain, the denser the bullshit. Now I'm not going to tell you that all management is useless, but there is a lot more problem generation than problem solving.
Lately I've been exploring the potentials of LLMs as a tool in my day-to-day work. They have a number of technical limitations, but some things they excel at. One of those things is generating the kinds of bullshit that management loves to wallow in. Case in point: our disaster recovery plan.
Someone in management got it into their head that we should have a formal disaster recovery plan. Certainly this is a good idea, but there are tradeoffs to be made. After all, we have yearly fire drills, but we don't practice "duck and cover" or evacuation in case of flooding. We have a plan for what to do in case of a fire, but we don't have a plan for what to do in case of a zombie apocalypse. But management wants a plan for everything, no matter how unlikely.
Enter the LLM. It can generate plans like nobody's business. It can generate a plan for what to do in case of a fire, a meteor strike, or a zombie apocalypse. The plans are useless, naturally. They are just bullshit. But they satisfy management's jonesing for plans, and best of all, they require no work on my part. It saved me hours of work yesterday.
20 May 2025 2:14pm GMT
Planet Python
Real Python: First Steps With LangChain
You've likely interacted with large language models (LLMs), like the ones behind OpenAI's ChatGPT, and experienced their remarkable ability to answer questions, summarize documents, write code, and much more.
While LLMs are remarkable by themselves, with a little programming knowledge, you can leverage libraries like LangChain to create your own LLM-powered applications that can do just about anything.
In this video course, you'll learn how to:
- Use LangChain to build LLM-powered applications
- Create reusable instructions with prompt templates
- Create and extend LangChain chains
- Debug what happens when a chain executes
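For a flavor of the library, a prompt template piped into a model already forms a minimal chain (a sketch, not taken from the course; it assumes OPENAI_API_KEY is set):

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# A reusable prompt template piped into a model is the simplest chain.
prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
chain = prompt | ChatOpenAI(model="gpt-4o-mini")

print(chain.invoke({"text": "LangChain composes prompts, models, and parsers."}).content)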
20 May 2025 2:00pm GMT
Drupal.org aggregator
ImageX: Drupal Image Optimization, Part 2: Performance Booster Modules
Our Drupal image optimization mini-series is in full swing - and we're glad to have you along for the ride! In Part 1, we explored how to improve your Drupal images visually through smart cropping, artistic effects, and subtle enhancements like watermarks.
20 May 2025 1:57pm GMT
The Drop Times: “And Beyond” Starts Here: How Drupaljam 2025 Is Rewriting the Rules
Drupaljam 2025 returns to De Fabrique in Utrecht with the bold theme "And Beyond" - a call to reimagine Drupal's future through code, ethics, education, and community. This year's edition isn't just another meetup; it's a carefully curated, community-powered experience shaped by volunteers and leaders from Stichting Drupal Nederland. With insights from key organisers Esmeralda Tijhof and Jean-Paul Vosmeer, this feature dives deep into what makes Drupaljam 2025 stand apart: a post-growth keynote by Melanie Rieback, a forward-looking "Post-AI" closing by Jeroen van der Most, hands-on workshops like Drupal in a Day, a strong emphasis on inclusion, and an open-source spirit that goes far beyond software.
20 May 2025 12:52pm GMT
The Drop Times: Marine Gandy and Joris Vercammen Elected as NEDA Representatives to Drupal Working Group
Marine Gandy and Joris Vercammen have been elected to represent the Network of European Drupal Associations in the new Drupal Working Group. Their mission is to help define the structure of a European Drupal Federation alongside the Drupal Association and global partners.
20 May 2025 10:59am GMT
Django community aggregator: Community blog posts
Python: a quick cProfile recipe with pstats
Python comes with two built-in profilers for measuring the performance of your code: cProfile and profile. They have the same API, but cProfile is a C extension, while profile is implemented in Python. You nearly always want to use cProfile, as it's faster and doesn't skew measurements as much.
By default, cProfile's CLI profiles a command and displays its profile statistics afterwards. But that can be a bit limited, especially for reading large profiles or re-sorting the same data in different ways.
For more flexibility, cProfile can instead save the profile data to a file, which you can then read with the pstats module. This is my preferred way of using it, and this post covers a recipe for doing so, with a worked example.
The recipe
First, profile your script:
$ python -m cProfile -o profile <script> [args]
Replace <script> with the path to your Python file, and [args] with any arguments you want to pass to it. cProfile will run your script under its profiling machinery, saving the results to a file called profile, as specified by the -o option.
Second, view the profile file using pstats:
$ python -m pstats profile <<< $'sort cumtime\nstats 1000' | less
The pstats CLI provides a REPL for interacting with profile files, based on its Stats class. The CLI is oddly undocumented, but its help command lists the available commands.
The above command passes several commands to pstats in a string. The string uses the $'...' syntax, a Bash feature for C-style strings, allowing \n to represent a newline, passing two commands:
- sort cumtime: Sort the output by cumulative time, largest first. This means the time spent in a function and all its callees.
- stats 1000: Show the first 1,000 lines of the profile.
The output is passed to less, a common pager, allowing you to scroll through the results. Press q to quit when you're done!
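The same Stats class can also be driven from Python directly, if you'd rather skip the REPL (a small sketch; profile is the file written by the -o option above):

import pstats

stats = pstats.Stats("profile")
stats.sort_stats("cumtime").print_stats(1000)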
Profile a module
If you're running a module instead of a script, add -m like:
$ python -m cProfile -o profile -m <module> [args]
Replace <module> with the name of the module you want to profile, and [args] with any arguments you want to pass to it.
Multiple profiles
If you're profiling code before and after, consider using different profile file names instead of just profile. For example, for checking the results of some optimization, I often use the names before.profile and after.profile, like:
$ python -m cProfile -o before.profile example.py
$ git switch optimize_all_the_things
$ python -m cProfile -o after.profile example.py
Alternative sort orders
To sort by other metrics, swap cumtime in sort cumtime for one of these values, per the Stats.sort_stats() documentation:
- time: internal time, the time spent in the function itself, excluding calls to other functions. This is useful for finding the slowest functions in your code.
- calls: number of calls to the function. This is useful for finding functions that are called many times and may be candidates for optimization, such as caching.
A Djangoey example
Here's a worked example showing how to apply this recipe to a Django management command. Say you are testing a database migration locally:
$ ./manage.py migrate example 0002
Operations to perform:
Target specific migration: 0002_complexito, from example
Running migrations:
Applying example.0002_complexito... OK
While it did pass, it was unexpectedly slow. To profile it, you would first reverse the migration to reset your test database:
$ ./manage.py migrate example 0001
...
Then you could apply the recipe to profile the migration.
First, stick the cProfile command in front of the migration command:
$ python -m cProfile -o profile ./manage.py migrate example 0002
Operations to perform:
Target specific migration: 0002_complexito, from example
Running migrations:
Applying example.0002_complexito... OK
Then, run the second pstats command to view the results:
$ python -m pstats profile <<< $'sort cumtime\nstats 1000' | less
This opens less with a long table, starting:
Welcome to the profile statistics browser.
profile% profile% Mon May 19 23:52:37 2025 profile
213287 function calls (206021 primitive calls) in 1.150 seconds
Ordered by: cumulative time
List reduced from 3576 to 1000 due to restriction <1000>
ncalls tottime percall cumtime percall filename:lineno(function)
425/1 0.001 0.000 1.150 1.150 {built-in method builtins.exec}
1 0.000 0.000 1.150 1.150 ./manage.py:1(<module>)
1 0.000 0.000 1.150 1.150 ./manage.py:7(main)
1 0.000 0.000 1.109 1.109 /.../django/core/management/__init__.py:439(execute_from_command_line)
...
The header tells us how many function calls were made, how many were primitive calls, and how long the code took to run. Then there's the table of all function calls, limited to 1,000 entries.
Since we're sorting by cumtime, the cumulative time spent in each function, the first line shows the total time spent in all functions. That exec is cProfile running your code, and the later lines represent the top-level wrappers from Django.
Generally, it's best to find the first listed function within your code base. In this profile, you would search for example/ and find this entry:
ncalls tottime percall cumtime percall filename:lineno(function)
...
1 0.000 0.000 1.005 1.005 /.../example/migrations/0002_complexito.py:4(forward)
...
One call to the forward() function in the migration file took 1.005 seconds, nearly all of the 1.150 seconds total runtime. That's a bit suspicious!
Right above that entry, you might also spot the time spent running queries:
ncalls tottime percall cumtime percall filename:lineno(function)
...
13 0.000 0.000 1.007 0.077 /.../django/db/backends/utils.py:78(execute)
13 0.000 0.000 1.007 0.077 /.../django/db/backends/utils.py:88(_execute_with_wrappers)
13 0.000 0.000 1.007 0.077 /.../django/db/backends/utils.py:94(_execute)
13 0.000 0.000 1.007 0.077 /.../django/db/backends/sqlite3/base.py:354(execute)
13 1.006 0.077 1.006 0.077 {function SQLiteCursorWrapper.execute at 0x1054f7f60}
...
These stacked functions all show 13 calls, with a cumulative time of 1.007 or 1.006 seconds. They represent Django's database backend wrappers, which eventually pass the query to SQLiteCursorWrapper.execute(), which is displayed differently because it's implemented in C.
So, we can tell that the migration ran 13 queries in total, and at least one of them was slow and was run by forward(). At this point, you might look at the source of forward() to see if you can find the slow query. But first, you might want to re-display the profile to show only the forward() function and its callees (the functions it called), which might shed some light on what it was doing.
To show only forward() and its callees, you can use the pstats callees command. This takes a regular expression to match the function names you want to show:
$ python -m pstats profile <<< $'sort cumtime\ncallees \\bforward\\b' | less
Welcome to the profile statistics browser.
profile% profile% Ordered by: cumulative time
List reduced from 3576 to 1 due to restriction <'\\bforward\\b'>
Function
called...
ncalls tottime cumtime
/.../example/migrations/0002_complexito.py:4(forward)
-> 1 0.000 0.000 /.../django/db/backends/utils.py:41(__enter__)
1 0.000 0.000 /.../django/db/backends/utils.py:44(__exit__)
1 0.000 1.005 /.../django/db/backends/utils.py:78(execute)
1 0.000 0.000 /.../django/utils/asyncio.py:15(inner)
1 0.000 0.000 {method 'create_function' of 'sqlite3.Connection' objects}
profile%
Goodbye.
(Output wrapped.)
This has revealed:
- forward() only calls execute() once, so there's only one slow query.
- There's also a call to SQLite's create_function(). It's fast, rounding down to 0.000 seconds, but it may have something to do with the slow query.
Okay, time to look at the source:
def forward(apps, schema_editor):
    import time

    schema_editor.connection.connection.create_function(
        "sleep",
        1,
        time.sleep,
    )
    with schema_editor.connection.cursor() as cursor:
        cursor.execute("SELECT sleep(1)")
Ah, it's a deliberate pause that I added to show you this example. Well, that solves that mystery.
20 May 2025 4:00am GMT
19 May 2025
Linux Today
Final Bookworm-Based Raspberry Pi OS Released Ahead of Debian Trixie
Discover the final release of the Bookworm-based Raspberry Pi OS, launched just before Debian Trixie. Explore new features and enhancements today!
19 May 2025 2:40pm GMT
Ubuntu 25.10 Will Default to Rust-Powered sudo-rs
Learn about Ubuntu 25.10's shift to Rust-powered sudo-rs, improving system security and efficiency. Stay updated on this exciting development!
19 May 2025 2:37pm GMT
Slimbook Launches Kymera Black Linux Desktop Computer for Gamers and Creators
Discover the Kymera Black Linux Desktop by Slimbook, designed for gamers and creators. Experience powerful performance and sleek design for your projects.
19 May 2025 2:34pm GMT
16 May 2025
DZone Java Zone
Using Python Libraries in Java
Advantages and Disadvantages of Python and Java
Java and Python are among the most widely used languages in the world (see Figure 1 below).
Both languages have their strengths and weaknesses and are popular in different fields. Unlike other articles, such as those from Tom Radcliffe, which analyze which language is technically better implemented, this article focuses on presenting practical use cases with explicit examples. The following table provides a brief overview of which language is best suited for which field (✅ = advantage, ❌ = disadvantage).
16 May 2025 10:00pm GMT
Django community aggregator: Community blog posts
Django News - Django News is at PyCon US this weekend! - May 16th 2025
Introduction
Django News is at PyCon US this weekend!
Jeff and Will are at PyCon US in Pittsburgh this weekend and would love to meet fellow Django enthusiasts. Drop by the DSF or JetBrains booth to say hello and connect with the many Django community members and DSF folks who will be around all weekend.
Django Newsletter
News
Google Summer of Code 2025 - Django Projects
Three projects out of many worthy proposals were accepted: improvements to the Django admin, adding django-template-partials to core, and automating processes in the Django contribution workflow.
Waiting for Postgres 18: Accelerating Disk Reads with Asynchronous I/O
Postgres 18 introduces asynchronous I/O with new io_method options (worker and io_uring), which can double or triple read performance in high-latency cloud environments.
Django Software Foundation
Simon Charette is the DSF member of the month
Simon Charette is a longtime Django contributor and community member. He served on the Django 5.x Steering Council and is part of the Security team and the Triage and Review team.
Updates to Django
Today 'Updates to Django' is presented by Abigail Afi Gbadago from the DSF Board and Djangonaut Space!🚀
Last week we had 10 pull requests merged into Django by 7 different contributors - including a first-time contributor! Congratulations to Safrone for having their first commits merged into Django - welcome on board!🎉
This week's Django highlights 🌟
- Security release of Django 5.2.1, 5.1.9 and 4.2.21.
- Field names have been added to hints in admin duplicated fields errors.
- Maximum bulk size for SQLite bulk_create and bulk_update methods now respect SQLITE_LIMIT_VARIABLE_NUMBER.
Django Newsletter
Wagtail CMS
Our four contributors for Google Summer of Code 2025
Four GSoC 2025 contributors will extend Wagtail with grid-aware sustainability, strict CSP compatibility, improved media listings, and enhanced keyboard shortcut accessibility.
Sponsored Link 1
Hire Django developers without the hassle!
Building a team of skilled Django developers has never been easier. Trust HackSoft to help you with strategic Django team augmentation. Learn more!
Articles
Django Security Best Practices: A Comprehensive Guide for Software Engineers
Enforce up-to-date Django versions, HTTPS, strong SECRET_KEY, ORM usage, built-in security middleware, XSS/CSRF defenses, robust authentication, dependency auditing, logging, and monitoring.
18 Years of REVSYS
Revsys marks 18 years offering Python and Django expertise, including code reviews, architectural design, cloud migrations, Kubernetes, CI/CD, AI integration, and team training.
Django: model field choices that can change without a database migration
Use Django 5.0 callable choices to avoid no-op migrations when updating model field choices, though database constraints still require migrations for data integrity.
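The idea in miniature (a hypothetical model; Django 5.0+ accepts a callable for choices):

from django.db import models

def status_choices():
    # Edit this list freely: migrations store only a reference to the
    # callable, so changing the values doesn't trigger a new migration.
    return [("draft", "Draft"), ("published", "Published")]

class Article(models.Model):
    status = models.CharField(max_length=20, choices=status_choices)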
Algorithms: Learning One's Learnings
Use Big O notation to choose efficient sorting in Django apps, leveraging Python's built-in Timsort or Quick Sort instead of Bubble Sort to improve performance.
Birds and Angles: Dabbling in Django Components
Combining django-bird and dj-angles enables Web-component style reusable Django template components for cleaner syntax and improved readability, despite limited filter parsing for props.
Setting up NGINX Unit (and switching from uWSGI)
Switch Django apps from uWSGI to NGINX Unit using JSON configuration, add SECURE_PROXY_SSL_HEADER, adjust socket proxy_pass, and enable ASGI/WSGI deployments.
My DjangoCon Europe 2025
Paolo Melchiorre recaps his DjangoCon Europe 2025 experience in Dublin through Mastodon posts covering keynotes, talks on testing, migrations, community events, and mentoring.
Tutorials
Rapid AI-powered applications with Django MongoDB and Voyage API
Learn how to build an LLM-powered recipe recommendation website with Django and MongoDB.
Podcasts
Django Chat #182: Event Sourcing with Chris May
Chris is a Senior Staff Engineer at WellSky, a software company in the health industry. We discuss his background as a graphic designer, learning Python (and Django) as an adult, his multiple conference talks on HTMX, why he's a fan of event sourcing, and more.
Talk Python #505: t-strings in Python (PEP 750)
A panel discussion of PEP 750 on t-strings, scheduled for Python 3.14, which build on the idea of f-strings to produce a template object rather than a standard string.
Django News Jobs
Python / Django Software Developer - full-time at Off Duty Management 🆕
Backend Python Developer (Django/DRF) at Paytree
Django Newsletter
Projects
astral-sh/ty
An extremely fast Python type checker and language server, written in Rust.
pydantic/pydantic-ai
Agent Framework / shim to use Pydantic with LLMs.
This RSS feed is published on https://django-news.com/. You can also subscribe via email.
16 May 2025 3:00pm GMT
My second DjangoCon Europe
Well I have been meaning to write this for over 2 weeks now, but better late than never! Towards the end of April 2025 I attended the DjangoCon Europe conference and Sprints and it was brilliant and exhausting all in one go.
Let's begin with the travel there, I decided to join those doing the SailRail for a relaxed train ride and crossing the sea to Dublin. This was great as I managed to make some use of the day (work and a blog post) while travelling as well as having some travel companions in the form of Thibaud, Sage, Tom & Daniele.
The next day kicked off the conference with an excellent keynote from Sarah Boyce, and other talks followed throughout the next 2 days. Databases were a big theme, along with community engagement and HTMX. However, for me the highlight was walking into the room and meeting folks from the community in person that I have interacted with online for the past couple of years. This was also coupled with great conversations with friends new & old (mostly around making Django better). I also plucked up the courage and gave a lightning talk on the last day about my year of 100 words.
The evening socials again were excellent! Django Social on Wednesday and the official party on Friday, with a more chill evening going climbing with a couple of interested attendees. The weekend brought the Sprints which were just perfect. I managed to crack on with an open ticket/PR I have for the messages app in Django and also make some good progress on django-prodserver.
It was sad to leave, but it reminds me that I want to go next year (if I am allowed by the family!). I am also excited by the energy I felt across the week, reminding me that Django is going strong as ever and the community has a bright future. I could write more, but I am aware that I need to crack on with today's work, so I will leave you with the recommendation of getting to a DjangoCon if you use Django in any form - you will not be disappointed.
16 May 2025 5:00am GMT
15 May 2025
DZone Java Zone
Designing a Java Connector for Software Integrations
IT systems need integration to achieve seamless data flow alongside enhanced operational efficiency. As businesses expand, they accumulate independent operational systems whose isolation restricts performance and delays access to time-sensitive information. Integration orients system functions toward better workflow coordination, which minimizes repetition and creates unified operations.
The creation of such connections necessitates comprehensive knowledge about technical and business needs, which include format patterns along with protocol standards. Java provides robust libraries alongside outstanding frameworks that lead organizations to choose Java when building scalable, customized solutions. A strategically developed connector fulfills present requirements while enabling future technology adjustments, thus becoming a sustainable factor that supports successful IT system integration.
15 May 2025 5:00pm GMT
Planet Lisp
Gábor Melis: PAX PDF Output
Thanks to Paul A. Patience, PAX now has PDF support. See pax-manual-v0.4.1.pdf and dref-manual-v0.4.1.pdf. The PDF is very similar to the HTML, even down to the locative types (e.g. [function]) being linked to the sources on GitHub, but cross-linking between PDFs doesn't work reliably on most viewers, so that's disabled. Also, for reading PDFs so heavy on internal linking to be enjoyable, one needs a viewer that supports going back within the PDF (not the case with Chrome at the moment). Here is a blurry screenshot to entice:
There is a bit of a Christmas tree effect due to syntax highlighting and the colouring of the links. Blue links are internal to the PDF, maroon links are external. I might want to change that to make it look more like the HTML, but I have not found a way in LaTeX to underline text without breaking automatic hyphenation.
15 May 2025 12:00am GMT
14 May 2025
Kernel Planet
Linux Plumbers Conference: Submission time for Linux Plumbers 2025
Submissions for the Refereed Track and Microconferences are now open. Linux Plumbers will be held this year in Tokyo from December 11th - 13th (Note, the 13th is on a Saturday).
The Refereed presentations are 45 minutes in length and should focus on a specific aspect of the "plumbing" in a Linux ecosystem. Examples of Linux plumbing include core kernel subsystems, init systems, core libraries, toolchains, windowing systems, management tools, device support, media creation/playback, testing, and so on. The best presentations are not about finished work, but rather problem statements, proposals, or proof-of-concept solutions that require face-to-face discussions and debate.
The Microconferences are 3 and a half hours of technical discussion, broken up into 15 to 30 minute subtopics. The only presentations allowed are those that are needed to bring the audience up to speed and should not last more than half the allotted time for the subtopic. To submit a Microconference, provide a topic, some examples of subtopics to be discussed and a list of key people that should be present to have meaningful discussions. For Microconferences that have been to Linux Plumbers in the past, they should provide a list of accomplishments that were a direct result of the discussions from their previous sessions (with links to patches and such).
Presentations and Microconference subtopic leads should ideally be physically present at the conference. Remote presentations may be available but are strongly discouraged.
The Refereed submissions end at 11:59PM UTC on Wednesday, September 10, 2025.
The Microconference submissions end at 11:59PM UTC on Sunday, June 29, 2025.
Go ahead and submit your Refereed track presentation or Microconference topic. We are looking forward to seeing the great content that is submitted that makes Linux Plumbers the best technical conference there is.
14 May 2025 8:44pm GMT
13 May 2025
Planet Lisp
Joe Marshall: Purchasing White Elephants
As a software engineer, I'm constantly trying to persuade management to avoid doing stupid things. Management is of the opinion that because they are paying the engineers anyway, the software is essentially free. In my experience, bespoke software is one of the most expensive things you can waste money on. You're usually better off setting your money on fire than writing custom software.
But managers get ideas in their heads and it falls upon us engineers to puncture them. I wish I were less ethical. I'd just take the money and spend it as long as it kept flowing. But I wouldn't be able to live with myself. I have to at least try to persuade them to avoid the most egregious boondoggles. If they still insist on doing the project, well, so be it.
I'm absolutely delighted to find that these LLMs are very good at making plausible sounding proposals for software projects. I was asked about a project recently and I just fed the parameters into the LLM and asked it for an outline of the project, estimated headcount, time, and cost. It suggested we could do it in 6 months with 15 engineers at a cost of $3M. (I think it was more than a bit optimistic, frankly, but it was a good start.) It provided a phased breakdown of the project and the burn rate. Management was curious about how long it would take 1 engineer and the LLM suggested 3-6 years.
Management was suitably horrified.
I've been trying to persuade them that the status quo has been satisfying our needs, costs nothing, needs no engineers, and is ready today, but they didn't want to hear it. But now they are starting to see the light.
13 May 2025 8:05pm GMT
06 May 2025
Planet PHP
Additional parameters on a PHP interface method
On the Roave Discord recently, there was a discussion about not breaking BC in interfaces inspired by this post by Jérôme Tamarelle: It's clearly true that if you add a new parameter to a method on an interface, then that's a BC break as every concrete implementation of the interface needs to change their signature. However, Gina commented that you don't need to use func_get_arg() as concrete implementations can add additional optional arguments. WHAT?!!! I… continue reading.
06 May 2025 10:00am GMT
30 Apr 2025
Kernel Planet
Brendan Gregg: Doom GPU Flame Graphs
AI Flame Graphs are now open source and include Intel Battlemage GPU support, which means it can also generate full-stack GPU flame graphs for providing new insights into gaming performance, especially when coupled with FlameScope (an older open source project of mine). Here's an example of GZDoom, and I'll start with flame scopes for both CPU and GPU utilization, with details annotated:
(Here are the raw CPU and GPU versions.) FlameScope shows a subsecond-offset heatmap of profile samples, where each column is one second (in this example, made up of 50 x 20ms blocks) and the color depth represents the number of samples, revealing variance and perturbation that you can select to generate a flame graph just for that time range.
Putting these CPU and GPU flame scopes side by side has enabled your eyes to do pattern matching to solve what would otherwise be a time-consuming task of performance correlation. The gaps in the GPU flame scope on the right - where the GPU was not doing much work - match the heavier periods of CPU work on the left.
CPU Analysis
FlameScope lets us click on the interesting periods. By selecting one of the CPU shader compilation stripes we get the flame graph just for that range:
This is brilliant, and we can see exactly why the CPUs were busy for about 180 ms (the vertical length of the red stripe): it's doing compilation of GPU shaders and some NIR preprocessing (optimizations to the NIR intermediate representation that Mesa uses internally). If you are new to flame graphs, you look for the widest towers and optimize them first. Here is the interactive SVG.
CPU flame graphs and CPU flame scope aren't new (from 2011 and 2018, both open source). What is new is full-stack GPU flame graphs and GPU flame scope.
GPU Analysis

Interesting details can also be selected in the GPU FlameScope for generating GPU flame graphs. This example selects the "room 3" range, which is a room in the Doom map that contains hundreds of enemies. The green frames are the actual instructions running on the GPU, aqua shows the source for these functions, and red (C) and yellow (C++) show the CPU code paths that initiated the GPU programs. The gray "-" frames just help highlight the boundary between CPU and GPU code. (This is similar to what I described in the AI flame graphs post, which included extra frames for kernel code.) The x-axis is proportional to cost, so you look for the widest things and find ways to reduce them.
I've included the interactive SVG version of this flame graph so you can mouse-over elements and click to zoom. (PNG version.)
The GPU flame graph is split between stalls coming from rendering walls (41.4%), postprocessing effects (35.7%), stenciling (17.2%), and sprites (4.95%). The CPU stacks are further differentiated by the individual shaders that are causing stalls, along with the reasons for those stalls.
GZDoom
We picked GZDoom to try since it's an open source version of a well known game that runs on Linux (our profiler does not support Windows yet). Intel Battlemage makes light work of GZDoom, however, and since the GPU profile is stall-based we weren't getting many samples. We could have switched to a more modern and GPU-demanding game, but didn't have any great open source ideas, so I figured we'd just make GZDoom more demanding. We built GPU demanding maps for GZDoom (I can't believe I have found a work-related reason to be using Slade), and also set some Battlemage tunables to limit resources, magnifying the utilization of remaining resources.
Our GZDoom test map has three rooms: room 1 is empty, room 2 is filled with torches, and room 3 is open with a large skybox and filled with enemies, including spawnpoints for Sergeants. This gave us a few different workloads to examine by walking between the rooms.
Using iaprof: Intel's open source accelerator profiler
The AI Flame Graph project is pioneering work, and has needed various changes to graphics compilers, libraries, and kernel drivers, not just the code but also how they are built. Since Intel has its own public cloud (the Intel® Tiber™ AI Cloud) we can fix the software stack in advance so that for customers it "just works." Check the available releases. It currently supports the Intel Max Series GPU.
If you aren't on the Intel cloud, or you wish to try this with Intel Battlemage, then it can require a lot of work to get the system ready to be profiled. Requirements include:
- A Linux system with superuser (root) access, so that eBPF and Intel eustalls can be used.
- A newer Linux kernel with the latest Intel GPU drivers. For Intel Battlemage this means Linux 6.15+ with the Xe driver; For the Intel Max Series GPU it's Linux 5.15 with the i915 driver.
- The Linux kernel built with Intel driver-specific eustall and eudebug interfaces (see the github docs for details). Some of these modifications are upstreamed in the latest versions of Linux and others are currently in progress. (These interfaces are made available by default on the Intel® Tiber™ AI Cloud.)
- All system libraries or programs that are being profiled need to include frame pointers so that the full stacks are visible, including Intel's oneAPI and graphics libraries. For this example, GZDoom itself needed to be compiled with frame pointers and also all libraries used by GZDoom (glibc, etc.). This is getting easier in the latest versions of Fedora and Ubuntu (e.g., Ubuntu 24.04 LTS) which are shipping system libraries with frame pointers by default. But I'd expect there will be applications and dependencies that don't have frame pointers yet, and need recompilation. If your flame graph has areas that are very short, one or two frames deep, this is why.
If you are new to custom kernel builds and library tinkering, then getting this all working may feel like Nightmare! difficulty. Over time things will improve and gradually get easier: check the github docs. Intel can also develop a much easier version of this tool as part of a broader product offering and get it working on more than just Linux and Battlemage (either watch this space or, if you have an Intel rep, ask them to make it a priority).
Once you have it all working, you can run the iaprof command to profile the GPU. E.g.:
git clone https://github.com/intel/iaprof
cd iaprof
make deps
make
iaprof record > profile.txt
cat profile.txt | iaprof flame > flame.svg
iaprof is modeled on the Linux perf command. (Maybe one day it'll become included in perf directly.) Thanks to Gabriel Muñoz for getting the work done to get this open sourced.
FAQ and Future Work
From the launch of AI flame graphs last year, I can guess what FAQ #1 will be: "What about NVIDIA?". They do have flame graphs in Nsight Graphics for GPU workloads, although their flame graphs are currently shallow as it is GPU code only, and onerous to use as I believe it requires an interposer; on the plus side they have click-to-source. The new GPU profiling method we've been developing allows for easy, everything, anytime profiling, like you expect from CPU profilers.
Future work will include github releases, more hardware support, and overhead reduction. We're the first to use eustalls in this way, and we need to add more optimization to reach our target of <5% overhead, especially with the i915 driver.
Conclusion
We've open sourced AI flame graphs and tested it on new hardware, Intel Battlemage, and a non-AI workload: GZDoom (gaming). It's great to see a view of both CPU and GPU resources down to millisecond resolution, where we can see visual patterns in the flame scope heat maps that can be selected to produce flame graphs to show the code. We applied these new tools to GZDoom and explained GPU pauses by selecting the corresponding CPU burst and reading the flame graph, as well as GPU code use for arbitrary time windows.
While we have open sourced this, getting it all running requires Intel hardware and Linux kernel and library tinkering - which can be a lot of work. (Actually playing Doom on Nightmare! difficulty may be easier.) This will get better over time. We look forward to seeing if anyone can fight their way through this work in the meantime and what new performance issues they can solve.
Authors: Brendan Gregg, Ben Olson, Brandon Kammerdiener, Gabriel Muñoz.
30 Apr 2025 2:00pm GMT
17 Apr 2025
Planet Twisted
Glyph Lefkowitz: Stop Writing `__init__` Methods
The History
Before dataclasses were added to Python in version 3.7 - in June of 2018 - the __init__ special method had an important use. If you had a class representing a data structure - for example a 2DCoordinate, with x and y attributes - you would want to be able to construct it as 2DCoordinate(x=1, y=2), which would require you to add an __init__ method with x and y parameters.
The other options available at the time all had pretty bad problems:
- You could remove 2DCoordinate from your public API and instead expose a make_2d_coordinate function and make it non-importable, but then how would you document your return or parameter types?
- You could document the x and y attributes and make the user assign each one themselves, but then 2DCoordinate() would return an invalid object.
- You could default your coordinates to 0 with class attributes, and while that would fix the problem with option 2, this would now require all 2DCoordinate objects to be not just mutable, but mutated at every call site.
- You could fix the problems with option 1 by adding a new abstract class that you could expose in your public API, but this would explode the complexity of every new public class, no matter how simple. To make matters worse, typing.Protocol didn't even arrive until Python 3.8, so, in the pre-3.7 world this would condemn you to using concrete inheritance and declaring multiple classes even for the most basic data structure imaginable.
Also, an __init__ method that does nothing but assign a few attributes doesn't have any significant problems, so it is an obvious choice in this case. Given all the problems that I just described with the alternatives, it makes sense that it became the obvious default choice, in most cases.
However, by accepting "define a custom __init__" as the default way to allow users to create your objects, we make a habit of beginning every class with a pile of arbitrary code that gets executed every time it is instantiated.
Wherever there is arbitrary code, there are arbitrary problems.
The Problems
Let's consider a data structure more complex than one that simply holds a couple of attributes. We will create one that represents a reference to some I/O in the external world: a FileReader.
Of course Python has its own open-file object abstraction, but I will be ignoring that for the purposes of the example.
Let's assume a world where we have the following functions, in an imaginary fileio module:
open(path: str) -> int
read(fileno: int, length: int)
close(fileno: int)
Our hypothetical fileio.open returns an integer representing a file descriptor[1], fileio.read allows us to read length bytes from an open file descriptor, and fileio.close closes that file descriptor, invalidating it for future use.
With the habit that we have built from writing thousands of __init__ methods, we might want to write our FileReader class like this:
class FileReader:
    def __init__(self, path: str) -> None:
        self._fd = fileio.open(path)

    def read(self, length: int) -> bytes:
        return fileio.read(self._fd, length)

    def close(self) -> None:
        fileio.close(self._fd)
For our initial use-case, this is fine. Client code creates a FileReader by doing something like FileReader("./config.json"), which always creates a FileReader that maintains its file descriptor int internally as private state. This is as it should be; we don't want user code to see or mess with _fd, as that might violate FileReader's invariants. All the necessary work to construct a valid FileReader - i.e. the call to open - is always taken care of for you by FileReader.__init__.
However, additional requirements will creep in, and as they do, FileReader.__init__ becomes increasingly awkward.
Initially we only care about fileio.open, but later, we may have to deal with a library that has its own reasons for managing the call to fileio.open by itself, and wants to give us an int that we use as our _fd. We then have to resort to weird workarounds like:
def reader_from_fd(fd: int) -> FileReader:
    fr = FileReader.__new__(FileReader)  # bypass __init__ entirely
    fr._fd = fd
    return fr
Now, all those nice properties that we got from trying to force object construction to give us a valid object are gone. reader_from_fd's type signature, which takes a plain int, has no way of even suggesting to client code how to ensure that it has passed in the right kind of int.
Testing is much more of a hassle, because we have to patch in our own copy of fileio.open any time we want an instance of a FileReader in a test without doing any real-life file I/O, even if we could (for example) share a single file descriptor among many FileReaders for testing purposes.
All of this also assumes a fileio.open that is synchronous. Although for literal file I/O this is more of a hypothetical concern, there are many types of networked resource which are really only available via an asynchronous (and thus: potentially slow, potentially error-prone) API. If you've ever found yourself wanting to type async def __init__(self): ... then you have seen this limitation in practice.
Comprehensively describing all the possible problems with this approach would end up being a book-length treatise on a philosophy of object oriented design, so I will sum up by saying that the cause of all these problems is the same: we are inextricably linking the act of creating a data structure with whatever side-effects are most often associated with that data structure. If they are "often" associated with it, then by definition they are not "always" associated with it, and all the cases where they aren't associated become unwieldy and potentially broken.
Defining an __init__ is an anti-pattern, and we need a replacement for it.
The Solutions
I believe this tripartite assemblage of design techniques will address the problems raised above:
- using dataclass to define attributes,
- replacing behavior that would previously have been in __init__ with a new classmethod that does the same thing, and
- using precise types to describe what a valid instance looks like.
Using dataclass attributes to create an __init__ for you
To begin, let's refactor FileReader into a dataclass. This does get us an __init__ method, but it won't be an arbitrary one we define ourselves; it will get the useful constraint enforced on it that it will just assign attributes.
from dataclasses import dataclass

@dataclass
class FileReader:
    _fd: int

    def read(self, length: int) -> bytes:
        return fileio.read(self._fd, length)

    def close(self) -> None:
        fileio.close(self._fd)
Except... oops. In fixing the problems that we created with our custom __init__ that calls fileio.open, we have re-introduced several problems that it solved:
- We have removed all the convenience of FileReader("path"). Now the user needs to import the low-level fileio.open again, making the most common type of construction both more verbose and less discoverable; if we want users to know how to build a FileReader in a practical scenario, we will have to add something in our documentation to point at a separate module entirely.
- There's no enforcement of the validity of _fd as a file descriptor; it's just some integer, which the user could easily pass an incorrect instance of, with no error.
In isolation, dataclass by itself can't solve all our problems, so let's add in the second technique.
Using classmethod factories to create objects
We don't want to require any additional imports, or require users to go looking at any other modules - or indeed anything other than FileReader itself - to figure out how to create a FileReader for its intended usage.
Luckily we have a tool that can easily address all of these concerns at once: @classmethod. Let's define a FileReader.open class method:
from dataclasses import dataclass

@dataclass
class FileReader:
    _fd: int

    @classmethod
    def open(cls, path: str) -> "FileReader":
        return cls(fileio.open(path))

    def read(self, length: int) -> bytes:
        return fileio.read(self._fd, length)

    def close(self) -> None:
        fileio.close(self._fd)
Now, your callers can replace FileReader("path") with FileReader.open("path"), and get all the same benefits.
Additionally, if we needed to await fileio.open(...), and thus we needed its signature to be @classmethod async def open, we are freed from the constraint of __init__ as a special method. There is nothing that would prevent a @classmethod from being async, or indeed, from having any other modification to its return value, such as returning a tuple of related values rather than just the object being constructed.
Using NewType to address object validity
Next, let's address the slightly trickier issue of enforcing object validity.
Our type signature calls this thing an int, and indeed, that is unfortunately what the lower-level fileio.open gives us, and that's beyond our control. But for our own purposes, we can be more precise in our definitions, using NewType:
from typing import NewType

FileDescriptor = NewType("FileDescriptor", int)
There are a few different ways to address the underlying library, but for the sake of brevity and to illustrate that this can be done with zero run-time overhead, let's just insist to Mypy that we have versions of fileio.open, fileio.read, and fileio.close which actually already take FileDescriptor integers rather than regular ones.
# Stub signatures for the imaginary fileio module, as seen by Mypy:
def open(path: str) -> FileDescriptor: ...
def read(fileno: FileDescriptor, length: int) -> bytes: ...
def close(fileno: FileDescriptor) -> None: ...
We do of course have to slightly adjust FileReader, too, but the changes are very small. Putting it all together, we get:
from dataclasses import dataclass
from typing import NewType

FileDescriptor = NewType("FileDescriptor", int)

@dataclass
class FileReader:
    _fd: FileDescriptor

    @classmethod
    def open(cls, path: str) -> "FileReader":
        return cls(fileio.open(path))

    def read(self, length: int) -> bytes:
        return fileio.read(self._fd, length)

    def close(self) -> None:
        fileio.close(self._fd)
Note that the main technique here is not necessarily using NewType specifically, but rather aligning an instance's property of "has all attributes set" as closely as possible with an instance's property of "fully valid instance of its class"; NewType is just a handy tool to enforce any necessary constraints on the places where you need to use a primitive type like int, str or bytes.
In Summary - The New Best Practice
From now on, when you're defining a new Python class:
- Make it a dataclass.[2]
- Use its default __init__ method.[3]
- Add @classmethods to provide your users convenient and discoverable ways to build your objects.
- Require that all dependencies be satisfied by attributes, so you always start with a valid object.
- Use typing.NewType to enforce any constraints on primitive data types (like int and str) which might have magical external attributes, like needing to come from a particular library, needing to be random, and so on.
If you define all your classes this way, you will get all the benefits of a custom __init__ method:
- All consumers of your data structures will receive valid objects, because an object with all its attributes populated correctly is inherently valid.
- Users of your library will be presented with convenient ways to create your objects that do as much work as is necessary to make them easy to use, and they can discover these just by looking at the methods on your class itself.
Along with some nice new benefits:
- You will be future-proofed against new requirements for different ways that users may need to construct your object.
- If there are already multiple ways to instantiate your class, you can now give each of them a meaningful name; no need to have monstrosities like def __init__(self, maybe_a_filename: int | str | None = None):
- Your test suite can always construct an object by satisfying all its dependencies; no need to monkey-patch anything when you can always call the type and never do any I/O or generate any side effects.
Before dataclasses, it was always a bit weird that such a basic feature of the Python language - giving data to a data structure to make it valid - required overriding a method with 4 underscores in its name. __init__ stuck out like a sore thumb. Other such methods like __add__ or even __repr__ were inherently customizing esoteric attributes of classes.
For many years now, that historical language wart has been resolved. @dataclass, @classmethod, and NewType give you everything you need to build classes which are convenient, idiomatic, flexible, testable, and robust.
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics like "but what is a 'class', really?".
1. If you aren't already familiar, a "file descriptor" is an integer which has meaning only within your program; you tell the operating system to open a file, it says "I have opened file 7 for you", and then whenever you refer to "7" it is that file, until you close(7). ↩
2. Or an attrs class, if you're nasty. ↩
3. Unless you have a really good reason to, of course. Backwards compatibility, or compatibility with another library, might be good reasons to do that. Or certain types of data-consistency validation which cannot be expressed within the type system. The most common example of these would be a class that requires consistency between two different fields, such as a "range" object where start must always be less than end. There are always exceptions to these types of rules. Still, it's pretty much never a good idea to do any I/O in __init__, and nearly all of the remaining stuff that may sometimes be a good idea in edge-cases can be achieved with a __post_init__ rather than writing a literal __init__. ↩
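To make that last footnote concrete, here is a minimal sketch of the __post_init__ approach it describes, using a hypothetical Range dataclass (the names are illustrative):

from dataclasses import dataclass

@dataclass
class Range:
    start: int
    end: int

    def __post_init__(self) -> None:
        # cross-field consistency can't be expressed in the type system,
        # so validate it here instead of hand-writing an __init__
        if self.start >= self.end:
            raise ValueError(f"start ({self.start}) must be less than end ({self.end})")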
17 Apr 2025 10:35pm GMT
16 Apr 2025
Planet PHP
Simplifying Python applications by using named connection pools
Python-oracledb 3.0 allows you to name pools when they are created. You can access them later by that name instead of having to pass around a pool handle. This feature is very helpful when your application spans many code files, or consists of independent libraries.
Multi-user (and some single-user) database applications should almost always use a driver connection pool. This has performance, scalability, and reliability benefits. Check out my previous posts on this topic: A driver connection pool, a DRCP pool, or an Implicit Connection Pool? and Always Use Connection Pools - and How.
But when your application spans multiple files, it can be tricky to pass the pool handle between your code modules. In python-oracledb 3.0 we introduced a driver connection pool cache to simplify your life. You can use the pool cache in both python-oracledb Thin and Thick modes with all the Oracle Database versions that python-oracledb supports. The same cache concept has already proven itself in our node-oracledb driver.
To put pool caching into practice, consider the new code connection_pool_pc.py which is a variant of the sample connection_pool.py. (Follow those links to see the full files).
The original connection_pool.py code creates a pool and returns its handle to the rest of the application:
pool = oracledb.create_pool(
user=sample_env.get_main_user(),
password=sample_env.get_main_password(),
dsn=sample_env.get_connect_string(),
params=sample_env.get_pool_params(),
min=pool_min,
max=pool_max,
increment=pool_inc,
session_callback=init_session,
)
return pool
The new code in connection_pool_pc.py adds a pool_alias=my_pool_alias parameter to create_pool(). It doesn't retain, or use, the pool handle returned by create_pool():
my_pool_alias = 'mypool'
oracledb.create_pool(
pool_alias=my_pool_alias,
user=sample_env.get_main_user(),
password=sample_env.get_main_password(),
dsn=sample_env.get_connect_string(),
params=sample_env.get_pool_params(),
min=pool_min,
max=pool_max,
increment=pool_inc,
session_callback=init_session,
)
Every time a connection is needed from the pool, the old code:
with pool.acquire() as connection:
is replaced with a call that accesses the pool directly from the oracledb module:
with oracledb.connect(pool_alias=my_pool_alias) as connection:
The full diff between the files is:
71a72,73
> my_pool_alias = 'mypool'
>
88c90,91
< pool = oracledb.create_pool(
---
> oracledb.create_pool(
> pool_alias=my_pool_alias,
99d101
< return pool
101d102
<
128c129
< with pool.acquire() as connection:
---
> with oracledb.connect(pool_alias=my_pool_alias) as connection:
172c173
< with pool.acquire() as connection:
---
> with oracledb.connect(pool_alias=my_pool_alias) as connection:
190c191
< with pool.acquire() as connection:
---
> with oracledb.connect(pool_alias=my_pool_alias) as connection:
201c202
< pool = start_pool()
---
> start_pool()
The files run identically.
The benefit of pool caching is that modules and libraries that access a pool only need to agree on a name (or names - if you have multiple pools). After importing oracledb, each part of the code can access a pool directly off the imported oracledb module by using the agreed name.
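For example (a sketch - the module and table names here are hypothetical), any other file in the application can get a connection without ever seeing a pool handle:

# some_other_module.py
import oracledb

def count_employees() -> int:
    # assumes oracledb.create_pool(pool_alias="mypool", ...) already ran at startup
    with oracledb.connect(pool_alias="mypool") as connection:
        with connection.cursor() as cursor:
            cursor.execute("select count(*) from employees")
            return cursor.fetchone()[0]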
You can also pass options to oracledb.connect() that you might have previously passed to pool.acquire(). The documented example is when you are using a heterogeneous pool where each connection could be a different user. In
Truncated by Planet PHP, read more at the original (another 1879 bytes)
16 Apr 2025 9:31pm GMT
Updated python-oracledb tutorial
The self-paced python-oracledb tutorial has been refreshed.
Our Python and Oracle Database: The New Wave of Scripting tutorial has had a refresh. This self-paced tutorial shows you how to use the python-oracledb driver to access Oracle Database. It has exercises and solutions. We have run this at conferences and had positive feedback from people new to the Python world. You can try it out on your own computer.
Once you have done the tutorial, you may also be interested in the container buildable from https://github.com/oracle/python-oracledb/tree/main/samples/containers/samples_and_db that installs the general python-oracledb sample scripts in a container with Oracle Database.
If you're new to this world, check out my colleague's post and video Exploring python-oracledb: A Simplified Approach to Oracle Database Connectivity.
Python-oracledb Resources
Python-oracledb is an open source package implementing the Python Database API specification, with many additions to support advanced Oracle Database features. By default, it is a 'Thin' driver that is immediately usable without needing any additional install, e.g. no Instant Client is required. It is used by many frameworks, ORMs, and libraries.
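As a minimal example of how little is needed to get going (the credentials and DSN below are placeholders):

import oracledb

# Thin mode by default: no Oracle Client libraries required
connection = oracledb.connect(user="scott", password="tiger",
                              dsn="localhost/freepdb1")
with connection.cursor() as cursor:
    cursor.execute("select sysdate from dual")
    print(cursor.fetchone())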
Links:
16 Apr 2025 12:31am GMT
01 Apr 2025
Planet Twisted
Glyph Lefkowitz: A Bigger Database
A Database File
When I was 10 years old, and going through a fairly difficult time, I was lucky enough to come into the possession of a piece of software called Claris FileMaker Pro™.
FileMaker allowed its users to construct arbitrary databases, and to associate their tables with a customized visual presentation. FileMaker also had a rudimentary scripting language, which would allow users to imbue these databases with behavior.
As a mentally ill pre-teen, lacking a sense of control over anything or anyone in my own life, including myself, I began building a personalized database to catalogue the various objects and people in my immediate vicinity. If one were inclined to be generous, one might assess this behavior and say I was systematically taxonomizing the objects in my life and recording schematized information about them.
As I saw it at the time, if I collected the information, I could always use it later, to answer questions that I might have. If I didn't collect it, then what if I needed it? Surely I would regret it! Thus I developed a categorical imperative to spend as much of my time as possible collecting and entering data about everything that I could reasonably arrange into a common schema.
Having thus summoned this specter of regret for all lost data-entry opportunities, it was hard to dismiss. We might label it "Claris's Basilisk", for obvious reasons.
Therefore, a less-generous (or more clinically-minded) observer might have replaced the word "systematically" with "obsessively" in the assessment above.
I also began writing what scripts were within my marginal programming abilities at the time, just because I could: things like computing the sum of every street number of every person in my address book. Why was this useful? Wrong question: the right question is "was it possible" to which my answer was "yes".
If I was obliged to collect all the information which I could observe - in case it later became interesting - I was similarly obliged to write and run every program I could. It might, after all, emit some other interesting information.
I was an avid reader of science fiction as well.
I had this vague sense that computers could kind of think. This resulted in a chain of reasoning that went something like this:
- human brains are kinda like computers,
- the software running in the human brain is very complex,
- I could only write simple computer programs, but,
- when you really think about it, a "complex" program is just a collection of simpler programs
Therefore: if I just kept collecting data, collecting smaller programs that could solve specific problems, and connecting them all together in one big file, eventually the database as a whole would become self-aware and could solve whatever problem I wanted. I just needed to be patient; to "keep grinding" as the kids would put it today.
I still feel like this is an understandable way to think - if you are a highly depressed and anxious 10-year-old in 1990.
Anyway.
35 Years Later
OpenAI is a company that produces transformer architecture machine learning generative AI models; their current generation was trained on about 10 trillion words, obtained in a variety of different ways from a large variety of different, unrelated sources.
A few days ago, on March 26, 2025 at 8:41 AM Pacific Time, Sam Altman took to "X™, The Everything App™," and described the trajectory of his career of the last decade at OpenAI as, and I quote, a "grind for a decade trying to help make super-intelligence to cure cancer or whatever" (emphasis mine).
I really, really don't want to become a full-time AI skeptic, and I am not an expert here, but I feel like I can identify a logically flawed premise when I see one.
This is not a system-design strategy. It is a trauma response.
You can't cure cancer "or whatever". If you want to build a computer system that does some thing, you actually need to hire experts in that thing, and have them work to both design and validate that the system is fit for the purpose of that thing.
Aside: But... are they, though?
I am not an oncologist; I do not particularly want to be writing about the specifics here, but, if I am going to make a claim like "you can't cure cancer this way" I need to back it up.
My first argument - and possibly my strongest - is that cancer is not cured.
QED.
But I guess, to Sam's credit, there is at least one other company partnering with OpenAI to do things that are specifically related to cancer. However, that company is still in a self-described "initial phase" and it's not entirely clear that it is going to work out very well.
Almost everything I can find about it online was from a PR push in the middle of last year, so it all reads like a press release. I can't easily find any independently-verified information.
A lot of AI hype is like this. A promising demo is delivered; claims are made that if the technology can solve this small part of the problem now, then surely within 5 years it will be able to solve everything else as well!
But even the light-on-content puff-pieces tend to hedge quite a lot. For example, as the Wall Street Journal quoted one of the users initially testing it (emphasis mine):
The most promising use of AI in healthcare right now is automating "mundane" tasks like paperwork and physician note-taking, he said. The tendency for AI models to "hallucinate" and contain bias presents serious risks for using AI to replace doctors. Both Color's Laraki and OpenAI's Lightcap are adamant that doctors be involved in any clinical decisions.
I would probably not personally characterize "'mundane' tasks like paperwork and … note-taking" as "curing cancer". Maybe an oncologist could use some code I developed too; even if it helped them, I wouldn't be stealing valor from them on the curing-cancer part of their job.
Even fully giving it the benefit of the doubt that it works great, and improves patient outcomes significantly, this is medical back-office software. It is not super-intelligence.
It would not even matter if it were "super-intelligence", whatever that means, because "intelligence" is not how you do medical care or medical research. It's called "lab work" not "lab think".
To put a fine point on it: biomedical research fundamentally cannot be done entirely by reading papers or processing existing information. It cannot even be done by testing drugs in computer simulations.
Biological systems are enormously complex, and medical research on new therapies inherently requires careful, repeated empirical testing to validate the correspondence of existing research with reality. Not "an experiment", but a series of coordinated experiments that all test the same theoretical model. The data (which, in an LLM context, is "training data") might just be wrong; it may not reflect reality, and the only way to tell is to continuously verify it against reality.
Previous observations can be tainted by methodological errors, by data fraud, and by operational mistakes by practitioners. If there were a way to do verifiable development of new disease therapies without the extremely expensive ladder going from cell cultures to animal models to human trials, we would already be doing it, and "AI" would just be an improvement to efficiency of that process. But there is no way to do that and nothing about the technologies involved in LLMs is going to change that fact.
Knowing Things
The practice of science - indeed any practice of the collection of meaningful information - must be done by intentionally and carefully selecting inclusion criteria, methodically and repeatedly curating our data, building a model that operates according to rules we understand and can verify, and verifying the data itself with repeated tests against nature. We cannot just hoover up whatever information happens to be conveniently available with no human intervention and hope it resolves to a correct model of reality by accident. We need to look where the keys are, not where the light is.
Piling up more and more information in a haphazard and increasingly precarious pile will not allow us to climb to the top of that pile, all the way to heaven, so that we can attack and dethrone God.
Eventually, we'll just run out of disk space, and then lose the database file when the family gets a new computer anyway.
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more of it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor! Special thanks also to Itamar Turner-Trauring and Thomas Grainger for pre-publication feedback on this article; any errors of course remain my own.
01 Apr 2025 12:47am GMT
18 Mar 2025
Kernel Planet
Matthew Garrett: Failing upwards: the Twitter encrypted DM failure
Almost two years ago, Twitter launched encrypted direct messages. I wrote about their technical implementation at the time, and to the best of my knowledge nothing has changed. The short story is that the actual encryption primitives used are entirely normal and fine - messages are encrypted using AES, and the AES keys are exchanged via NIST P-256 elliptic curve asymmetric keys. The asymmetric keys are each associated with a specific device or browser owned by a user, so when you send a message to someone you encrypt the AES key with all of their asymmetric keys and then each device or browser can decrypt the message again. As long as the keys are managed appropriately, this is infeasible to break.
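To make that shape concrete, here is a minimal sketch - emphatically not Twitter's actual code, with every name and parameter illustrative - of the general envelope-encryption pattern described above, using the Python cryptography library: one fresh AES key per message, wrapped for each recipient device's P-256 key via ECDH and HKDF:

import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# encrypt the message once with a fresh AES key
message_key = AESGCM.generate_key(bit_length=256)
msg_nonce = os.urandom(12)
ciphertext = AESGCM(message_key).encrypt(msg_nonce, b"the message", None)

# wrap that key for one recipient device (repeated per device/browser)
device_key = ec.generate_private_key(ec.SECP256R1())   # lives on the device
ephemeral = ec.generate_private_key(ec.SECP256R1())    # sender's, per message
shared = ephemeral.exchange(ec.ECDH(), device_key.public_key())
wrap_key = HKDF(hashes.SHA256(), 32, None, b"dm key wrap").derive(shared)
wrap_nonce = os.urandom(12)
wrapped_key = AESGCM(wrap_key).encrypt(wrap_nonce, message_key, None)

The security of such a scheme rests entirely on knowing that device_key.public_key() really belongs to your correspondent - which is exactly the key-distribution problem discussed next.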
But how do you know what a user's keys are? I also wrote about this last year - key distribution is a hard problem. In the Twitter DM case, you ask Twitter's server, and if Twitter wants to intercept your messages they replace your key. The documentation for the feature basically admits this - if people with guns showed up there, they could very much compromise the protection in such a way that all future messages you sent were readable. It's also impossible to prove that they're not already doing this without every user verifying that the public keys Twitter hands out to other users correspond to the private keys they hold, something that Twitter provides no mechanism to do.
This isn't the only weakness in the implementation. Twitter may not be able to read the messages, but every encrypted DM is sent through exactly the same infrastructure as the unencrypted ones, so Twitter can see the time a message was sent, who it was sent to, and roughly how big it was. And because pictures and other attachments in Twitter DMs aren't sent in-line but are instead replaced with links, the implementation would encrypt the links but not the attachments - this is "solved" by simply blocking attachments in encrypted DMs. There's no forward secrecy - if a key is compromised it allows access to not only all new messages created with that key, but also all previous messages. If you log out of Twitter the keys are still stored by the browser, so they can potentially be extracted and used to decrypt your communications. And there's no group chat support at all, which is more a functional restriction than a conceptual one.
To be fair, these are hard problems to solve! Signal solves all of them, but Signal is the product of a large number of highly skilled experts in cryptography, and even so it's taken years to achieve all of this. When Elon announced the launch of encrypted DMs he indicated that new features would be developed quickly - he's since publicly mentioned the feature a grand total of once, in which he mentioned further feature development that just didn't happen. None of the limitations mentioned in the documentation have been addressed in the 22 months since the feature was launched.
Why? Well, it turns out that the feature was developed by a total of two engineers, neither of whom is still employed at Twitter. The tech lead for the feature was Christopher Stanley, who was actually a SpaceX employee at the time. Since then he's ended up at DOGE, where he apparently set off alarms when attempting to install Starlink, and who today is apparently being appointed to the board of Fannie Mae, a government-backed mortgage company.
Anyway. Use Signal.
comments
18 Mar 2025 11:58pm GMT
15 Jan 2025
Planet Twisted
Glyph Lefkowitz: Small PINPal Update
Today on stream, I updated PINPal to fix the memorization algorithm.
If you haven't heard of PINPal before, it is a vault password memorization tool. For more detail on what that means, you can check out the README, and why not give it a ⭐ while you're at it.
As I started writing up an update post I realized that I wanted to contextualize it a bit more, because it's a tool I really wish were more popular. It solves one of those small security problems that you can mostly ignore, right up until the point where it's a huge problem and it's too late to do anything about it.
In brief, PINPal helps you memorize new secure passcodes for things you actually have to remember and can't simply put into your password manager, like the password to your password manager, your PC user account login, your email account[1], or the PIN code to your phone or debit card.
Too often, even if you're properly using a good password manager for your passwords, you'll be protecting it with a password optimized for memorability, which is to say, one that isn't random and thus isn't secure. But I have also seen folks veer too far in the other direction, trying to make a really secure password that they then forget right after switching to a password manager. Forgetting your vault password can also be a really big deal, making you do password resets across every app you've loaded into it so far, so having an opportunity to practice it periodically is important.
PINPal uses spaced repetition to ensure that you remember the codes it generates.
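The core scheduling idea is simple enough to sketch (this is a toy illustration, not PINPal's actual algorithm): every successful recall pushes the next prompt further into the future, and a failure pulls it back:

from datetime import datetime, timedelta

class Memorization:
    # toy spaced-repetition scheduler
    def __init__(self) -> None:
        self.interval = timedelta(minutes=5)
        self.due = datetime.now()

    def record(self, correct: bool) -> None:
        # double the delay on success; reset it on failure
        self.interval = self.interval * 2 if correct else timedelta(minutes=5)
        self.due = datetime.now() + self.interval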
While periodic forced password resets are a bad idea, if (and only if!) you can actually remember the new password, it is a good idea to get rid of old passwords eventually - like, let's say, when you get a new computer or phone. Doing so reduces the risk that a password stored somewhere on a very old hard drive or darkweb data dump is still floating around out there, forever haunting your current security posture. If you do a reset every 2 years or so, you know you've never got more than 2 years of history to worry about.
PINPal is also particularly secure in the way it incrementally generates your password; the computer you install it on only ever stores the entire password in memory when you type it in. Even the partial fragments that you are in the process of memorizing are stored using the secure keyring module, avoiding plain text whenever possible.
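The keyring pattern looks roughly like this (the service and entry names are made up for illustration, not PINPal's internals):

import keyring

# store a fragment in the OS keychain, read it back, then scrub it
keyring.set_password("pinpal-demo", "fragment-0", "hello")
assert keyring.get_password("pinpal-demo", "fragment-0") == "hello"
keyring.delete_password("pinpal-demo", "fragment-0")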
I've been using PINPal to generate and memorize new codes for a while, just in case[2], and the change I made today was because I encountered a recurring problem. The problem was, I'd forget a token after it had been hidden, and there was never any going back. The moment that a token was hidden from the user, it was removed from storage, so you could never get a reminder. While I've successfully memorized about 10 different passwords with it so far, I've had to delete 3 or 4.
So, in the updated algorithm, the visual presentation now hides tokens in the prompt several memorizations before they're removed. Previously, if the password you were generating was 'hello world', you'd see hello world 5 times or so, then •••• world; if you ever got it wrong past that point, too bad, start over. Now, you'll see hello world, then °°°° world, then, after you have gotten the prompt right without seeing the token a few times, you'll see •••• world after the backend has locked it in and it's properly erased from your computer.
If you get the prompt wrong, breaking your streak reveals the recently-hidden token until you get it right again. I also did a new release on that same livestream, so if this update sounds like it might make the memorization process more appealing, check it out via pip install pinpal today.
Right now this tool is still only for an extremely specific type of nerd - it's command-line only, and you probably need to hand-customize your shell prompt to invoke it periodically. But I'm working on making it more accessible to a broader audience. It's open source, of course, so you can feel free to contribute your own code!
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you've read here and you'd like to read more things like it, or you'd like to support my various open-source endeavors, you can support my work as a sponsor!
1. Your email account password can be stored in your password manager, of course, but given that email is the root-of-trust reset factor for so many things, being able to remember that password is very helpful in certain situations. ↩
2. Funny story: at one point, Apple had an outage which made it briefly appear as if a lot of people needed to reset their iCloud passwords, myself included. Because I'd been testing PINPal a bunch, I actually had several highly secure random passwords already memorized. It was a strange feeling to just respond to the scary password reset prompt with a new, highly secure password and just continue on with my day, secure in the knowledge I wouldn't forget it. ↩
15 Jan 2025 12:54am GMT
29 Nov 2024
Planet Plone - Where Developers And Integrators Write
Maurits van Rees: Lightning talks Friday
Bonnie Tyler Sprint
On 12 August 2026 there is a total solar eclipse that can be seen from Valencia, Spain. So we organise a sprint there.
This conference
We had 291 participants, 234 in person and 57 online. 13 Brazilian states (that is all of them), 14 countries.
24.5 percent women, up from 13% in 2013, so that has gone up, but we are not there yet. Thank you to PyLadies and Django Girls for making this happen.
We had more than 80 presenters, about 30 lightning talks, and lots of talk in the hallways.
Thanks also to the team!
Ramiro Luz: Yoga time
Yoga exercise.
Rikupekka: University case student portal
We have a student portal at the university. But mostly:
Welcome to Jyväskylä University in Finland for Plone conference 2025, October 13-19!
Jakob: Beethovensprint
26-30 May 2025 in Bonn, Germany.
Afterwards, on May 30 and June 1 there will be FedCon in Bonn, a SciFi convention.
Piero/Victor: BYOUI
Add-ons first development with @plone/registry. See https://plone-registry.readthedocs.io/
It allows for development that is framework agnostic, so it is not only for Plone. It is around configuration that can be extended and injected, which is tricky in most javascript frameworks.
Imagine it.
Ana Dulce: 3D printing
For a difficult model I had to trust the process; it took a week, but it worked.
Renan & Iza: Python Brasil
We organised the Python Brasil conference from 16 to 23 October this year in Rio de Janeiro.
Next year 21-27 October in São Paulo.
Erico: Python Cerrado
31 July to 2 August 2025 is the next Python Cerrado conference.
29 Nov 2024 10:25pm GMT
Maurits van Rees: Paul Roeland: The value of longevity
Link to talk information on Plone conference website.
I work for the Clean Clothes Campaign: https://cleanclothes.org/
After three large disasters in factories in 2012 and 2013 with over 1000 deaths, it took three years to get an agreement with clothing manufacturers for 30 million dollars in compensation. It does not bring lives back, but it helps the survivors.
See Open Supply Hub for open data that we collected, for checking which brands are produced in which factories.
Documenting history matters. Stories must be told.
The global clothing industry is worth around 1.8 trillion dollars; as a country, that would put it in 12th place in the world. 75 million workers.
Our strongest weapon: backlinks. We have links from the OECD, the UN, Wikipedia, school curricula, and books. Especially those last two never change, so you should never change URLs.
Plone: enable the sitemap, please - why not by default? Create a good robots.txt. I check Google Search Console weekly, looking for broken links - a great tool. Tag early, tag often, even if you have an AI do it.
Our website: started in 1998 written in Notepad, 2004 Dreamweaver, 2006 Bluefish, 2010 Joomla, 2013 Plone 4, 2020 Castle CMS (an opinionated distribution of Plone, but it does not really exist anymore), 2024 Plone 6 with Volto Light Theme (work in progress). Thank you kitconcept for all the help, especially Jonas.
Migrations are painful. Over the years we used wget to CSV to SQL to CSV, Python scripts, a "Franken-mogrifier", and collective.exportimport.
Lessons learned: stable urls are awesome, migrations are painful. Please don't try to salvage CSS from your old site, just start fresh in your new system. Do not try to migrate composite pages or listings.
What if your website does not provide an export? Use wget; it still works and is better than httrack. sed/awk/regex are your friends. archivebox (WARC).
Document your steps for your own sanity.
To manage json, jq or jello can be used. sq is a Swiss knife for json/sql/csv. emuto is a hybrid between jq and GraphQL.
Normalize import/export. We have `plone.exportimport` in core now.
In the future I would like a plone exporter script that accepts a regex and exports only matching pages. Switch backends: ZODB, relstorage, nick, quantum-db. Sitewide search/replace/sed. Sneakernet is useful in difficult countries where you cannot send data over the internet: so export to a usb stick.
A backup is only a backup if it regularly gets restored so you know that it works.
- Keeping content and URL stability is a superpower.
- Assuming that export/import/backup/restore/migration are rare occurrences is wrong.
- Quick export/import is very useful.
Do small migrations; treat it as maintenance. Don't be too far behind. A large migration every five years will be costly. Do a small migration every year. Do your part. Clients should also do their part, by budgeting for this yearly. That is how budgeting works. Use every iteration to review custom code.
Make your sites live long and prosper.
29 Nov 2024 8:58pm GMT
Maurits van Rees: Fred van Dijk: Run Plone in containers on your own cluster with coolify.io
Link to talk information on Plone conference website.
Sorry, I ran out of time trying to set up https://coolify.io
So let's talk about another problem. Running applications (stacks) in containers is the future. Well: abstraction and isolation are the future, and containers are the current phase.
I am on the Plone A/I team, with Paul, Kim, and Erico. All senior sysadmins, so we kept things running. In 2022 we worked on containerisation. Kubernetes was the kool kid then, but Docker Swarm was easier. Check out Erico's training with the new cookieplone templates.
Doing devops well is hard. You have a high workload, but still need to keep learning new stuff to keep up with what is changing.
I want to plug Coolify, which is a fully open source product. "Self-hosting with superpowers." The main developer, Andras Bacsal, believes in open source and 'hates' pay-by-usage cloud providers with a vengeance.
Coolify is still docker swarm. We also want Kubernetes support. But we still need sysadmins. Someone will still need to install coolify, and keep it updated.
I would like to run an online DevOps course somewhere January-March 2025. 4-6 meetings of 2 hours, maybe Friday afternoon. Talk through devops and sysadmin concepts, show docker swarm, try coolify, etc.
29 Nov 2024 7:58pm GMT