01 Jul 2022

Django community aggregator: Community blog posts

Django News - 2022 Campaign Results - Jul 1st 2022

News

PyCharm & DSF Campaign 2022 Results

The sixth annual JetBrains PyCharm promotion in June netted the Django Software Foundation $25,000 this year.

djangoproject.com

🚨 Django 4.0.6 and 3.2.14 high severity security releases on July 4th

Mariusz Felisiak gave everyone a heads up to look for a Django 4.0.6 and 3.2.14 high severity security update on July 4th.

twitter.com

Sponsored Ad

Django GDPR Cookie Consent

Highly customizable Django app to make your Django website compatible with GDPR Cookie law.

gumroad.com

Articles

Forms in Django 4.0+

Django 4.0 made rendering forms more flexible using the template engine. David Smith shows us how to render forms and highlights what features to expect in Django 4.1.

github.io

How to set up Webpack and TailwindCSS in a Django Project

A one-time guide to setting up Webpack and Tailwind CSS to work with a Django application.

builtwithdjango.com

How to Run a Django Migration "By Hand"

Adam Johnson shows us how to apply database migrations by hand using sqlmigrate and dbshell.

adamj.eu

Google Summer of Code 2022: One Month Progress

The first post by one of this year's Google Summer of Code recipients, Deepak Dinesh, who is focusing on improving benchmarking within Django.

medium.com

Videos

Python Web Conf 2022 Talks + Tutorials

All 90 videos from the 2022 Python Web Conf are now available on YouTube.

youtube.com

Securing Django Applications by Gajendra Deshpande

A talk on performing penetration testing on Django web applications as well as strategies and configuration settings for making the source code and Django applications secure.

youtu.be

Make the Most of Django by Paolo Melchiorre

A talk on how to take full advantage of Django while contributing to its success and that of its community.

youtu.be

Sponsored Link

Fit Django perfectly with Kubernetes

Hurricane is an initiative to fit Django perfectly with Kubernetes. It allows you to make use of many standard Django features to leverage the capabilities of Kubernetes to their fullest extent.

django-hurricane.io

Projects

Bunny Fonts

Bunny Fonts are GDPR friendly and an alternative to services like Google Fonts.

bunny.net

Salaah01/django-form-creator

A Django app that allows users to dynamically create forms.

github.com

strawberry-graphql/strawberry-graphql-django: Strawberry GraphQL Django extension

Strawberry GraphQL Django extension.

github.com

Sponsorship

📰 Sponsor Django News

Want to get your product, service, job, or company in front of over 2,690 Django professionals each week?

We have some summer sponsorship inventory left and would love to feature you. Sponsorship availability, prices, and details are available here.

django-news.com


This RSS feed is published on https://django-news.com/. You can also subscribe via email.

01 Jul 2022 3:00pm GMT

29 Jun 2022

Django community aggregator: Community blog posts

How to Run a Django Migration “By Hand”

Normally your Django project's deploy process runs the migrate command, and that takes care of updating your database as necessary. Especially on smaller databases, Django's migration system can "just do it" for you.

But sometimes it can be necessary to run migrations "by hand" in your database's SQL console. I have found this to be the case with larger, busy databases, and when using tools like pt-online-schema-change to apply schema changes. In this post we'll cover the process for running a migration by hand, and adapting it to reversing migrations.

Run a Migration "By Hand"

Hold onto your butts…

1. Find the SQL to run

We write Django migrations in Python, but they ultimately end up running a series of SQL statements. To run a migration by hand, you need those SQL statements, so you can run them yourself.

You can display the SQL for a migration with Django's sqlmigrate command like:

$ ./manage.py sqlmigrate <app> <prefix>

This outputs each migration operation with a commented header describing what it does and then its actual statements. Replace <app> with the label of the app that the migration lives in. Replace <prefix> with a unique prefix of the migration's name - normally a four digit number like 0003.

For example, to show the SQL for the core app's migration with name starting "0003", you would run:

$ ./manage.py sqlmigrate core 0003
BEGIN;
--
-- Add field page_count to book
--
ALTER TABLE "core_book" ADD COLUMN "page_count" integer NULL;
COMMIT;

You'll only see the bookending with BEGIN and COMMIT on databases that support transactional schema changes (SQLite and PostgreSQL, of Django's built-in backends). These may be disabled per-migration, when necessary.
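
For instance, a migration can opt out of the wrapping transaction by setting atomic = False on its Migration class. A minimal sketch (the dependency name is illustrative):

from django.db import migrations


class Migration(migrations.Migration):
    # Run this migration's operations outside a transaction.
    atomic = False

    dependencies = [("core", "0002_previous")]
    operations = []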

If any operations cannot be run as SQL, they will have the message THIS OPERATION CANNOT BE WRITTEN AS SQL (from Django 4.1; older versions have slightly different wording). This normally means use of the RunPython operation. For such operations, you'll need to figure out how to run them by hand separately from the process I'm covering here. You could also consider splitting them into their own migration.

A small note: it's best to run sqlmigrate against your production database. For certain operations, Django queries the database to find names of objects, particularly index names when migrating older index definitions. Depending on how your various environments' databases were created and migrated over time, these names can be different. Thus, the SQL that Django generates on your staging server may be different to that on production. But on the other hand, using your production settings with an unexecuted migration isn't always easy, so you may just want to beware of this issue and adjust SQL when necessary.

2. Execute the SQL statement-by-statement

Open up your database's SQL shell on the target environment with Django's dbshell command:

$ ./manage.py dbshell

Here you can run the migration's SQL statements from sqlmigrate, one by one. Skip the comments from sqlmigrate (the lines starting --), and make sure you copy whole SQL statements that end with ;.

For example, running the above on PostgreSQL:

$ ./manage.py dbshell
psql (14.4, server 13.5 (Debian 13.5-1.pgdg110+1))
Type "help" for help.

example=# BEGIN;
BEGIN
example=*# ALTER TABLE "core_book" ADD COLUMN "page_count" integer NULL;
ALTER TABLE
example=*# COMMIT;
COMMIT
example=#

Whilst running the migration, you should keep an eye on your database's key metrics with whatever monitoring tools you use. You may also want to use a second dbshell to run some administrative commands, for example in my recent PostgreSQL post I covered finding and stopping queries that block an ALTER TABLE.

One note for databases supporting transactional schema changes (SQLite, PostgreSQL). If the migration has a single schema-changing statement like ALTER TABLE, you can drop BEGIN and COMMIT. This means less SQL to run, and you'll hold schema locks for slightly less time, reducing the risk of the migration affecting your busy production database.

3. Record migration as executed

After you've run your migration's statements by hand, you need to record the migration as executed. If you don't, the migrate command will try to execute that migration again, which could end catastrophically (but will normally just result in an error like "column already exists").

Django's migration system keeps a record of executed migrations in a table called django_migrations. You can record a migration as executed with this query template:

INSERT INTO django_migrations (app, name, applied) VALUES (<app>, <name>, NOW());

Replace <app> with the label of the app that the migration lives in. Replace <name> with the full name of the migration, which is the migration's filename without the .py extension. (Be careful not to use just a prefix like 0003!)

For example, to mark our example migration as complete:

example=# INSERT INTO django_migrations (app, name, applied) VALUES ('core', '0003_book_page_count', NOW());
INSERT 0 1

You can check your entry looks right by visually comparing it with others:

SELECT * FROM django_migrations ORDER BY applied DESC;

…and just like that, you're done applying the migration!

Reverse a Migration "By Hand"

You can use the above process to reverse a migration "by hand" with a couple of changes.

In step one, use the --backwards flag to sqlmigrate to generate the SQL statements to undo the migration. Use this template:

$ ./manage.py sqlmigrate --backwards <app> <prefix>
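
For our example migration, the reversed SQL would look something like this (a sketch of the expected output; Django's PostgreSQL backend drops columns with CASCADE):

$ ./manage.py sqlmigrate --backwards core 0003
BEGIN;
--
-- Add field page_count to book
--
ALTER TABLE "core_book" DROP COLUMN "page_count" CASCADE;
COMMIT;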

And in step three, you'll want to delete from, rather than insert into, the django_migrations table. Use this template:

DELETE FROM django_migrations WHERE app = <app> AND name = <name>;
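
For example, to un-record our example migration:

example=# DELETE FROM django_migrations WHERE app = 'core' AND name = '0003_book_page_count';
DELETE 1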

Hopefully you don't need to roll back migrations often!

Fin

May all your migrations run smoothly,

-Adam

29 Jun 2022 4:00am GMT

24 Jun 2022

Django community aggregator: Community blog posts

Django News - Django 4.1 beta 1 released - Jun 24th 2022

News

Django 4.1 beta 1 released

Django 4.1 beta 1 is now available. It represents the second stage in the 4.1 release cycle and is an opportunity for you to try out the changes coming in Django 4.1.

djangoproject.com

2022 Python Software Foundation Board Elections

Just a reminder that if you are a registered member of the Python Software Foundation, you may vote in this year's election until June 30th, 2022 AoE.

python.org

Get paid to contribute to urllib3

Announcing urllib3's bounty program.

sethmlarson.dev

Sponsored Ad

Affordable Heroku-like experience for Django.

Spin up a production-ready Django application, Postgres DB, and Redis in 5 minutes on AWS, DO, or any other provider.

appliku.com

Articles

Customizable Django Admin themes

Arpit shows us how to quickly use django-admin-interface to customize the Django Admin's color scheme and how to load custom themes.

dev.to

Six things I do every time I start a Django project

Everyone approaches starting a project differently, and this was a fun read to see how Brenton Cleeland does it. This topic might make a fun Django Forum thread too.

brntn.me

How to Find and Stop Running Queries on PostgreSQL

A look at stopping PostgreSQL queries via SQL, techniques for finding problematic queries, and the occasionally useful ability to cancel via operating system tools.

adamj.eu

Don't Mock What You Don't Own in 5 Minutes

One of the most common issues programmers have when they try to test real-world software is how to deal with third-party dependencies.

hynek.me

Why new Macs break your Docker build, and how to fix it

On new Macs, many Python-based Dockerfiles fail in entirely unexpected ways. A look at why this happens and potential fixes in the future.

pythonspeed.com

Podcasts

Django Chat: How to Learn Django (Ep2 Replay)

Carlton and Will discuss how to learn Django whether you are a total beginner or experienced web developer.

djangochat.com

Sponsored Link

Fit Django perfectly with Kubernetes

Hurricane is an initiative to fit Django perfectly with Kubernetes. It allows you to make use of many standard Django features to leverage the capabilities of Kubernetes to their fullest extent.

django-hurricane.io

Projects

samwillis/nodejs-pypi

If you ever wondered why you couldn't pip install nodejs-bin to install Node.js in your projects, well, now you can.

github.com

dozymoe/frozen-django

frozen-django creates static websites from your Django views into HTML files.

github.com


This RSS feed is published on https://django-news.com/. You can also subscribe via email.

24 Jun 2022 3:00pm GMT

Checking References: How to Check References

Part two of my reference check series, covering the nuts and bolts of conducting a reference check. When should you check references? How many? How should you contact references? What questions should you ask?

24 Jun 2022 5:00am GMT

23 Jun 2022

Django community aggregator: Community blog posts

Refactoring and New Features - Building SaaS with Python and Django #137

In this episode, I continued on the teacher checklist for the homeschooling app. We added data to the context and this required some refactoring to be able to reuse data from other parts of the app.

23 Jun 2022 5:00am GMT

How to Patch Requests to Have a Default Timeout

Python's requests package is very popular. Even if you don't use it directly, it's highly likely one of your dependencies does.

One wrinkle in requests' design is that it has no default timeout. This means that requests can hang forever if the remote server doesn't respond, unless the author remembered to add a timeout. Issue #3070 tracks the discussion on adding such a default timeout, but it has been open for several years. httpx learned a lesson from this and has a default timeout of five seconds.
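
In the meantime, every call site needs to remember to pass an explicit timeout. A minimal illustration (the URL is a placeholder):

import requests

# Without timeout=..., this call could hang forever if the server
# accepts the connection but never responds.
response = requests.get("https://example.com", timeout=5.0)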

This "missing default" has caused several production incidents at my client ev.energy. Remote server outages caused background tasks to take 45 minutes instead of 4.5 seconds, waiting for responses that wouldn't come. This caused other important background work to be delayed, with knock-on effects.

Auditing the codebase to add missing timeout parameters only got so far. New third party and first party code "slipped through the net" and was deployed without a timeout.

I came up with a solution to change requests to use a default timeout. I did so with patchy, my package for patching the source code of functions at runtime. Patchy provides an alternative to monkey-patching with a few advantages: it can modify lines in the body of a function, it fails if the target function changes, and references to the function don't need updating.

Below is the patching function. Feel free to copy it into your project!

import patchy
from requests.adapters import HTTPAdapter


def patch_requests_default_timeout() -> None:
    """
    Set a default timeout for all requests made with "requests".

    Upstream is waiting on this longstanding issue:
    https://github.com/psf/requests/issues/3070
    """

    patchy.patch(
        HTTPAdapter.send,
        """\
        @@ -14,6 +14,8 @@
             :param proxies: (optional) The proxies dictionary to apply to the request.
             :rtype: requests.Response
             \"""
        +    if timeout is None:
        +        timeout = 5.0

             try:
                 conn = self.get_connection(request.url, proxies)
        """,
    )

You need to call this once when your project initializes. On a Django project, this can be done in an AppConfig:

from django.apps import AppConfig

from example.core import backports


class ExampleConfig(AppConfig):
    name = "example"

    def ready(self) -> None:
        backports.patch_requests_default_timeout()

The above snippet uses a 5 second timeout, copied from httpx. This may be a bit low for existing projects integrated with several services. You may wish to start higher and iterate down.

At ev.energy, I used a higher value of 30 seconds, since production metrics showed some third party services taking a while to respond. I'm planning on iteratively reducing the default timeout down to 5 seconds, whilst setting explicit longer timeouts for the few known-slow services.

Fin

-Adam

23 Jun 2022 4:00am GMT

22 Jun 2022

Django community aggregator: Community blog posts

How to Learn Django (Ep2 Replay)

Groups

Support the Show

This podcast does not have any ads or sponsors. To support the show, please consider purchasing a book, signing up for Button, or reading the Django News newsletter.

22 Jun 2022 10:00pm GMT

Checking References: Yes, You Should Check References

Reference checking isn't optional: it can save you from making a big mistake. Reference checks are your last line of defense against hiring a jerk.

22 Jun 2022 5:00am GMT

21 Jun 2022

Django community aggregator: Community blog posts

Don’t Mock What You Don’t Own in 5 Minutes

One of the most common issues programmers have when they try to test real-world software is how to deal with third-party dependencies. Let's examine an old but counter-intuitive principle.

21 Jun 2022 2:00pm GMT

How to optimize PostgreSQL queries from Django using pgMustard

Slow queries happen, and when they do, it can be tough to dissect why they're slow. This difficulty is compounded by using Django's ORM, since it generates the SQL for you, so you may have little idea of the actual queries "under the hood".

In this post we'll look at what pgMustard does and how to use it with the Django ORM.

(Disclosure: I know pgMustard co-founder Michael Christofides, but I have not been paid for this post. I just like the product.)

pgMustard… Is This a Mustard?

PostgreSQL has an EXPLAIN statement to obtain a query plan for any given SELECT query. This plan contains structured performance data, and can contain both planned and actual metrics. Query plans are great, but interpreting them requires a broad understanding of PostgreSQL, and knowledge of what to focus on first. I often find understanding query plans overwhelming.

This is where pgMustard enters the picture. It takes a query plan, visualizes it, and provides interpretation of the results. It ranks opportunities for optimization out of five stars (five = most optimizable) and provides detailed information about what you can do next.

For example, here's one of their demo visualizations:

pgMustard example query visualization, showing a tree diagram with explanation on the left. The main hint is that the query planner estimated nearly 200 times too few rows, with a recommendation to try using ANALYZE on the table to correct the statistics.

This query took 55 seconds to execute. pgMustard's visualization shows the query plan's steps in a hierarchical display. The sidebar reveals why it was slow, with a short explanation showing action points that are expanded on in a blog post.

I like how pgMustard gives you informed optimizations for your queries and tables. It's like having a knowledgeable DBA on hand.

It Costs… Money!

pgMustard is a paid product, with a free trial that allows you to explain five query plans (no credit card required). Plans with no optimization suggestions don't count towards that limit.

At the time of writing, it costs €95 (~$105) a year for a single developer, or €500 (~$525) a year for a team (no limit on members). Both plans allow you to explain an unlimited number of queries. I think this works well, as optimization is something you may focus on only a few times a year.

There are free tools for visualizing EXPLAIN output; see the list on the PostgreSQL Wiki. I've used two of them before: explain.dalibo.com (the original PEV version) and explain.depesz.com. They're handy, but they're focused on visualizing, so they don't have many interpretation features. For me, pgMustard's prioritized explanations are the main feature. We'll revisit the free tools at the end of this post.

Django → pgMustard Workflow

Alright, let's look at how to get from a Django QuerySet to a Query Plan to a pgMustard explanation.

1. Use QuerySet.explain() and All the Options

You use the EXPLAIN statement by prefixing it to the target statement, like:

EXPLAIN SELECT * FROM auth_user;

Django's ORM provides the QuerySet.explain() method to do this for any data-fetching query. It returns the query plan as a string (on PostgreSQL). For example:

In [2]: User.objects.all().explain()
Out[2]: 'Seq Scan on auth_user  (cost=0.00..699.32 rows=25632 width=109)'

Note that you can only use .explain() with methods that return QuerySets. Thus if you want to explain a .count(), .aggregate(), .exists(), etc., try explaining the QuerySet without that last method. This slightly changes the query, but it's often accurate enough.
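
For example (a sketch with an illustrative filter), rather than explaining the .count() itself, explain the QuerySet it's built on:

from django.contrib.auth.models import User

qs = User.objects.filter(is_active=True)
print(qs.explain())  # instead of qs.count()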

PostgreSQL has many options for EXPLAIN, of which pgMustard asks you to use at least:

  • format json - output in JSON format for machine consumption, rather than the default text format intended for humans
  • analyze - run the query and gain actual performance data, not just predicted
  • buffers - track disk read metrics
  • verbose - add some miscellaneous details

There are also two newer options that pgMustard can use the data from:

  • settings - PostgreSQL 12+, show settings that affect the query plan
  • wal - PostgreSQL 13+, add data on the Write Ahead Log, an important data structure

You can pass these options as arguments to QuerySet.explain(), each with the value True, except format, which takes "json":

User.objects.all().explain(
    format="json", analyze=True, buffers=True, verbose=True, settings=True, wal=True
)

You then want to copy the output, in order to paste it into pgMustard.

The output from explain() is a string, which is a bit fiddly for copying as you don't want the surrounding quote marks, and it may include escaping. You can print() the string to avoid these problems. Here's a function you can use to run explain with all the options and provide copy-able output:

def show_explain(qs):
    print(
        qs.explain(
            format="json",
            analyze=True,
            buffers=True,
            verbose=True,
            settings=True,
            wal=True,
        )
    )

(Remove wal if you're using PostgreSQL < 13, and remove settings if you're using PostgreSQL < 12.)

Pass the QuerySet you're interested in, and you'll get back the query plan as a big chunk of JSON:

In [39]: show_explain(User.objects.filter(email__endswith="@example.org"))
[{"Plan": {"Node Type": "Gather", "Parallel Aware": false, "Startup Cost": 1000.0, "Total Cost": 191384.91, "Plan Rows": 1031, "Plan Width": 72, "Actual Startup Time": 36.64, "Actual Total Time": 4307.309, "Actual Rows": 100001, "Actual Loops": 1, "Output": ["id", "password", "last_login", "is_superuser", "username", "first_name", "last_name", "email", "is_staff", "is_active", "date_joined"], "Workers Planned": 2, "Workers Launched": 2, "Single Copy": false, "Shared Hit Blocks": 16148, "Shared Read Blocks": 120436, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Plans": [{"Node Type": "Seq Scan", "Parent Relationship": "Outer", "Parallel Aware": true, "Relation Name": "auth_user", "Schema": "public", "Alias": "auth_user", "Startup Cost": 0.0, "Total Cost": 190281.81, "Plan Rows": 430, "Plan Width": 72, "Actual Startup Time": 130.523, "Actual Total Time": 3996.232, "Actual Rows": 33334, "Actual Loops": 3, "Output": ["id", "password", "last_login", "is_superuser", "username", "first_name", "last_name", "email", "is_staff", "is_active", "date_joined"], "Filter": "((auth_user.email)::text ~~ '%@example.org'::text)", "Rows Removed by Filter": 3300013, "Shared Hit Blocks": 16148, "Shared Read Blocks": 120436, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Workers": [{"Worker Number": 0, "Actual Startup Time": 177.384, "Actual Total Time": 3897.37, "Actual Rows": 31986, "Actual Loops": 1, "JIT": {"Functions": 4, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 12.21, "Inlining": 0.0, "Optimization": 20.075, "Emission": 147.532, "Total": 179.817}}, "Shared Hit Blocks": 5394, "Shared Read Blocks": 38399, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}, {"Worker Number": 1, "Actual Startup Time": 179.124, "Actual Total Time": 3900.698, "Actual Rows": 32096, "Actual Loops": 1, "JIT": {"Functions": 4, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 12.141, "Inlining": 0.0, "Optimization": 19.935, "Emission": 149.706, "Total": 181.782}}, "Shared Hit Blocks": 5358, "Shared Read Blocks": 38514, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}]}]}, "Settings": {"search_path": "\"$user\", public, tiger"}, "Planning": {"Shared Hit Blocks": 0, "Shared Read Blocks": 0, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0}, "Planning Time": 2.314, "Triggers": [], "JIT": {"Functions": 12, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 
36.606, "Inlining": 0.0, "Optimization": 42.797, "Emission": 328.02, "Total": 407.423}}, "Execution Time": 4355.994}]

Boom, now you can select and copy the query plan.

Bug with olde Djangos

Note: format="json" only works properly on Django 4.0+, due to a bug reported in Ticket #32226. On older versions of Django use this adapted version of show_explain() that undoes the broken formatting:

import ast
import json


def show_explain(qs):
    plan = qs.explain(
        format="json",
        analyze=True,
        buffers=True,
        verbose=True,
        settings=True,
        wal=True,
    )
    print(json.dumps(ast.literal_eval(plan)))

But also, upgrade 😉

2. Analyze with pgMustard

pgMustard's "New plan" screen has a big text box to paste the query plan into:

pgMustard’s “New Plan” page, with a text box for pasting the plan and a “Review” button.

After doing so and pushing "Review", you will see the visualization and recommendations:

pgMustard’s “Plan Analysis” page, showing a tree breakdown of the query and three recommendations.

For this example query, there are three recommendations, two of which have five stars, meaning a lot of potential for speeding up the query. Expanding the first one, it says:

5.0 ⭐️ Index Potential

100,002 out of 10,000,041 table rows are returned after filtering (1.0%)

Sequential scans are often slow when not all rows are used.

Adding an index may allow the planner to look up the rows required more efficiently. Consider adding an index, or investigating why the query planner considers any existing index(es) unhelpful.

The table being scanned is:

auth_user

The filter being used is:

((auth_user.email)::text ~~ '%@example.org'::text)

So here you may consider adding an index to the email field. Note that a vanilla index cannot help with this end-of-string matching. You could use a functional index to store the split-out email domain, and adjust the query accordingly to use the same function calls. (Specifically it would be Substr("email", StrIndex("email", Value("@"))).)
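
As a sketch of that idea (the model is hypothetical; Django 3.2+ supports expression-based indexes):

from django.db import models
from django.db.models import Value
from django.db.models.functions import StrIndex, Substr


class Account(models.Model):
    email = models.EmailField()

    class Meta:
        indexes = [
            # Index the "@domain" suffix so filters using the same
            # expression can be served by the index instead of a seq scan.
            models.Index(
                Substr("email", StrIndex("email", Value("@"))),
                name="account_email_domain_idx",
            )
        ]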

By default, pgMustard stores your analyzed query plans in your browser's local storage, keeping them private. Query plans can include sensitive data in table names, filters, etc. so it's nice that it defaults to keeping this secret. You also have the option to "publish" a plan, which stores it on their server and gives it a random public URL you can share with colleagues:

pgMustard’s “Publish plan” dialog, warning about the implications and asking for a name before publishing.

Here's the link for this plan: app.pgmustard.com/#/explore/aea44714-89ef-4de6-95b2-1da18d595ebc. Great stuff.

Analyzing Other Types of Queries

QuerySet.explain() is very handy, but its restriction to methods that return QuerySets can limit your analysis. It's possible to EXPLAIN most PostgreSQL statements, such as INSERT (from e.g. Model.save()).

Below is a snippet of code containing a context manager to perform the same EXPLAIN on any query and output the query plan. This uses database instrumentation to modify the SQL and capture the explain result.

Keep in mind that the ANALYZE option actually executes the query. So if you're running something that modifies data, like Model.save(), QuerySet.delete(), etc., the changes are applied. This can also make it tricky to analyze a modification query after it has run, since it may no longer do anything.

Also beware that some Django ORM methods map onto multiple queries. This snippet will only analyze the first one, since it uses an exception to stop execution.

import json
from contextlib import contextmanager

from django.db import connection


class ExplainResult(Exception):
    pass


def explain_hook(execute, sql, params, many, context):
    # Prepend EXPLAIN so the query returns its plan instead of its results.
    sql = "EXPLAIN (ANALYZE, FORMAT JSON, BUFFERS, VERBOSE, SETTINGS, WAL) " + sql
    execute(sql, params, many, context)
    # Grab the JSON plan and halt further execution by raising it.
    result = context["cursor"].fetchone()[0]
    raise ExplainResult(result)


@contextmanager
def explain():
    try:
        with connection.execute_wrapper(explain_hook):
            yield
    except ExplainResult as exc:
        print(json.dumps(exc.args[0]))

Use the explain() context manager to wrap a code block that executes your query, and you may get back a wall of JSON:

In [59]: with explain():
    ...:     User.objects.count()
[{"Plan": {"Node Type": "Aggregate", "Strategy": "Plain", "Partial Mode": "Finalize", "Parallel Aware": false, "Startup Cost": 191282.03, "Total Cost": 191282.04, "Plan Rows": 1, "Plan Width": 8, "Actual Startup Time": 4159.731, "Actual Total Time": 4178.435, "Actual Rows": 1, "Actual Loops": 1, "Output": ["count(*)"], "Shared Hit Blocks": 16145, "Shared Read Blocks": 120439, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Plans": [{"Node Type": "Gather", "Parent Relationship": "Outer", "Parallel Aware": false, "Startup Cost": 191281.81, "Total Cost": 191282.02, "Plan Rows": 2, "Plan Width": 8, "Actual Startup Time": 4156.926, "Actual Total Time": 4178.27, "Actual Rows": 3, "Actual Loops": 1, "Output": ["(PARTIAL count(*))"], "Workers Planned": 2, "Workers Launched": 2, "Single Copy": false, "Shared Hit Blocks": 16145, "Shared Read Blocks": 120439, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Plans": [{"Node Type": "Aggregate", "Strategy": "Plain", "Partial Mode": "Partial", "Parent Relationship": "Outer", "Parallel Aware": false, "Startup Cost": 190281.81, "Total Cost": 190281.82, "Plan Rows": 1, "Plan Width": 8, "Actual Startup Time": 3950.091, "Actual Total Time": 3950.119, "Actual Rows": 1, "Actual Loops": 3, "Output": ["PARTIAL count(*)"], "Shared Hit Blocks": 16145, "Shared Read Blocks": 120439, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Workers": [{"Worker Number": 0, "Actual Startup Time": 3846.571, "Actual Total Time": 3846.613, "Actual Rows": 1, "Actual Loops": 1, "JIT": {"Functions": 3, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 7.396, "Inlining": 0.0, "Optimization": 11.115, "Emission": 116.999, "Total": 135.51}}, "Shared Hit Blocks": 5405, "Shared Read Blocks": 38639, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}, {"Worker Number": 1, "Actual Startup Time": 3847.801, "Actual Total Time": 3847.84, "Actual Rows": 1, "Actual Loops": 1, "JIT": {"Functions": 3, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 7.155, "Inlining": 0.0, "Optimization": 10.949, "Emission": 118.094, "Total": 136.199}}, "Shared Hit Blocks": 5246, "Shared Read Blocks": 38227, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}], "Plans": [{"Node Type": "Seq Scan", "Parent Relationship": "Outer", "Parallel Aware": true, "Relation Name": "auth_user", "Schema": "public", "Alias": "auth_user", "Startup Cost": 0.0, "Total Cost": 179542.25, "Plan 
Rows": 4295825, "Plan Width": 0, "Actual Startup Time": 97.703, "Actual Total Time": 2546.515, "Actual Rows": 3333347, "Actual Loops": 3, "Shared Hit Blocks": 16145, "Shared Read Blocks": 120439, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Workers": [{"Worker Number": 0, "Actual Startup Time": 136.892, "Actual Total Time": 2509.312, "Actual Rows": 3223800, "Actual Loops": 1, "Shared Hit Blocks": 5405, "Shared Read Blocks": 38639, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}, {"Worker Number": 1, "Actual Startup Time": 138.042, "Actual Total Time": 2489.045, "Actual Rows": 3182375, "Actual Loops": 1, "Shared Hit Blocks": 5246, "Shared Read Blocks": 38227, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}]}]}]}]}, "Settings": {"search_path": "\"$user\", public, tiger"}, "Planning": {"Shared Hit Blocks": 0, "Shared Read Blocks": 0, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0}, "Planning Time": 4.296, "Triggers": [], "JIT": {"Functions": 11, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 24.741, "Inlining": 0.0, "Optimization": 23.089, "Emission": 251.59, "Total": 299.42}}, "Execution Time": 4190.409}]

Ta-daa! Ready to be pasted into pgMustard.
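
If you want to explain a modification query without keeping its effects, one option (a sketch building on the snippet above; the query itself is illustrative) is to wrap everything in a transaction and force a rollback:

from django.contrib.auth.models import User
from django.db import transaction


class Rollback(Exception):
    pass


try:
    with transaction.atomic():
        with explain():
            # EXPLAIN ANALYZE executes the UPDATE for real...
            User.objects.filter(last_login=None).update(is_active=False)
        # ...so abort the atomic block to undo the change.
        raise Rollback
except Rollback:
    pass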

Skip the Legwork With auto_explain

auto_explain is a PostgreSQL module that can run EXPLAIN on slow queries automatically (based on a threshold of execution time). It outputs the explain plans in the PostgreSQL log file. This way you can gather query plans from your production environment without having to go through the steps above. It is also great for capturing query plans of queries that are only slow occasionally, e.g. with certain data.

pgMustard have a guide on enabling auto_explain with appropriate options. This can save a human time, at the cost of a small overhead. I would think it's generally worth it.
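
As a rough illustration of the kind of settings involved (the values here are placeholders; follow pgMustard's guide for recommended ones), in postgresql.conf:

shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '500ms'   # only explain queries slower than this
auto_explain.log_analyze = on
auto_explain.log_buffers = on
auto_explain.log_verbose = on
auto_explain.log_format = 'json'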

Using the Free Tools

If you finish the free trial on pgMustard and find it beyond your price range, you can use one of the free tools mentioned earlier. They're listed on the PostgreSQL Wiki page "Using EXPLAIN". All of the above advice on obtaining EXPLAIN output still applies, as they also take query plans in JSON format.

For example, here's that first query plan analyzed by explain.dalibo.com.

explain.dalibo.com showing a query plan visualization.

I made this plan public: explain.dalibo.com/plan/RYV.

explain.dalibo.com offers a little interpretation, in terms of some icons that highlight the scan as slow and expensive. But it doesn't give any advice on what to do to fix this. Still, with some research you can get ideas on what to do next.

Fin

May your queries run ever leaner and meaner,

-Adam

21 Jun 2022 4:00am GMT

17 Jun 2022

Django community aggregator: Community blog posts

Django News - Last Chance for a DjangoCon Europe 2023 - Jun 17th 2022

News

Get PyCharm, Support Django

Until June 20, you can purchase PyCharm Pro at 30% off, with all of the proceeds being donated to the Django Software Foundation.

jetbrains.com

Last Chance for a DjangoCon Europe 2023

There will not be a DjangoCon Europe 2023 if the DSF board does not receive viable proposals for one by August 10, End of Day, AoE. There is not sufficient time after that for an organizing group to plan a DjangoCon to happen in the traditional, calendar-blocked window of April-June.

djangoproject.com

The PSF's 2021 Annual Report

The Python Software Foundation published its 2021 Annual Report, including comments from its new Executive Director Deb Nicholson, outgoing Board Chair Lorena Mesa, financials, grant funding, and much more.

blogspot.com

Sponsored Ad

Affordable Heroku-like experience for Django.

Spin up a production-ready Django application, Postgres DB, and Redis in 5 minutes on AWS, DO, or any other provider.

appliku.com

Articles

Utilising caching in your applications

A detailed look at improving application performance with select_related, prefetch_related, and caching.

django-cms.org

Share editor settings with EditorConfig

Marijke Luttekes shows us how to use EditorConfig to fix common team formatting issues.

marijkeluttekes.dev

The End of Localhost

A prediction of a future of web development that relies on the cloud more than localhost.

dx.tips

Building Wordle with PyScript + Other Examples

In the second part of a two-part series, Robby Boney shows us his Wordle clone and some other examples built using PyScript.

medium.com

Django CMS Fellowship Program

The Django CMS open source community is growing, and after securing enough funding, we are pleased to announce the launch of the Django CMS Fellowship Program.

django-cms.org

Videos

You Don't Need JavaScript

Matt Layman lays down a solid argument and video about why you may not need JavaScript in your project and how to use a library like HTMx effectively.

mattlayman.com

Sponsored Link

Fit Django perfectly with Kubernetes

Hurricane is an initiative to fit Django perfectly with Kubernetes. It allows you to make use of many standard Django features to leverage the capabilities of Kubernetes to their fullest extent.

django-hurricane.io

Projects

akx/django-managerie

Expose Django management commands in the admin.

github.com

israelabraham/django-sotp

Generate a secured base32 one-time password to authenticate your user!

github.com


This RSS feed is published on https://django-news.com/. You can also subscribe via email.

17 Jun 2022 3:00pm GMT

DORA Metrics: the Right Answer to measuring engineering team performance

"What metrics should I use to measure my engineering team's performance?" Believe it not, there is a Right Answer: the so-called DORA metrics.

17 Jun 2022 5:00am GMT

16 Jun 2022

Django community aggregator: Community blog posts

New App In Project - Building SaaS with Python and Django #136

In this episode, I broke ground on a brand new feature. The feature will be a task checklist for teachers. Since the checklist is directly for teachers, I created a new Django app named teachers and showed how to hook that into the rest of my Django project.

16 Jun 2022 5:00am GMT

Making a Compelling Offer — in this economy?

An edited transcript of a talk I delivered at the CTOCraft Hiring MiniConf. How do you make a job offer that'll be accepted when other companies are out there offering candidates over a million dollars?

16 Jun 2022 5:00am GMT

Rotterdam (NL) June 2022 python meetup summaries

(Summaries of the talks at the June 2022 Rotterdam python meetup.)

Leveraging Python for secure medical image data transfer to the cloud - Ronald van 't Klooster (Quantib)

Use case: they want to use python to connect a hospital to the cloud to upload radiology images securely. Imagery of patients, so it has to be really secure. And in their case also certified.

Radiology images are normally in the "DICOM" format. There's a special dicom protocol to transfer images, both for sending and getting images. A computer that implements the protocol is a "dicom node" and can be used by many applications.

The image format is mostly jpeg (actually a set of images: cross-sections of the brain, for instance) plus metadata.

They have an AI program to help diagnose images, with a microservices based architecture using docker+django. It used to run inside the hospital on their network. They wanted to move the AI server to the cloud. But how to get that working while still retaining their certification? This means that the actual certified AI server program cannot be changed.

The solution was to use an open source DICOM node software called Orthanc. With a nice REST interface, so usable over https. They added one instance inside the network and one in the cloud, next to the AI server.

A dicom node can send/receive to/from another dicom node, but how to manage it? This is where they used python. A daemon on both sides monitors the REST API and the local orthanc databases and triggers the necessary "gets" and "sends". The daemon also handles the sending/getting between the orthanc node and the actual system it is placed next to.

Python helped them a lot. Requests, threading, abstract base classes, logging... everything is there. And with https and client certificates, the security is good. The node on the hospital side only does get and send, so nothing needs to be sent to the hospital, as requesting an open port would be very hard :-)

Deployment is with docker in the cloud. In the hospitals, requesting a windows machine is easiest in practice, so they packaged the small orthanc node up as an executable that can be run as a service.

Developing Python apps on Kubernetes - Marcel Koek (Erasmus MC)

They slowly learned how to get a python app to run fine in kubernetes, including getting your local development environment to match the server environment as much as possible. And... making it as comfortable to work with as possible.

He demoed a small app built with fastapi as an example.

When using docker, there's an extra layer between you and your code. If you change code, you need to rebuild your docker image. Or you need to mount the dir with your code into your running docker. How to manage that? And you also want to make it resemble your production environment.

One step closer to production is to use kubernetes locally. He uses k3d to run a local kubernetes cluster: it runs kubernetes in your local docker daemon! Easy to run. It also includes a local docker registry, which you can push your image to.

For actually deploying the app, he uses "helm", kind of a package manager for Kubernetes.

What really helps getting development working nicely with kubernetes: tilt. Smart rebuilds, live updates, automatic docker rebuilding. His demo "tiltfile" even included an automatic pip install -r requirements if the requirements change.

Note: a tiltfile looks like a python file, but it is actually Starlark, a configuration language implemented in Go...

Build and scale containerized applications on AWS - James Meakin (Radboud UMC)

He's working on https://grand-challenge.org/ , an open source project for AI and medical imaging. You can submit imagery ("a challenge") that can then be evaluated with various algorithms. You can test your algorithms this way.

Techs they use: django, celery, vuejs, htmx, pydicom and many more. And it is containers everywhere, with a multi-region deployment on AWS for really low-latency image access from the browser.

The platform is extensible with container images running, for instance, specific algorithms. That's a security challenge, as they're written by lots of people. And you've got to manage the allowable resources.

In AWS there are lots of ways to run your containers. The three most used ones:

  • AWS app runner. Cheap and easy. Good if your apps have more than 25% idle time.
  • Amazon kubernetes, EKS. For his taste, it is way too complex. "Yaml hell".
  • Amazon elastic container service, ECS. It is AWS' opinionated way to run containers at scale. You have much less to understand: it mostly looks and handles like docker-compose.

You can use AWS fargate to run tasks on either EKS or ECS. Their software allows them to use "spot instances", instances that are cheap because they might get killed.

They use quite a few AWS services: simple queue service, eventbridge, elastic file system. So they bought into the AWS ecosystem. But, as the actual software is all open source, they can move to other providers with some effort.

Simulating 6-axis industrial robots with Python - Thijs Damsma (CEAD)

He now works for a firm that builds 3d printing robots. 3d stuff that's up to 40 meters long :-) He's trying to use python to manage/steer them.

His current experiments are with https://pypi.org/project/roboticstoolbox-python/, which you can use comfortably in a jupyter notebook, including a 3d visualisation. He showed a nice demo.

16 Jun 2022 4:00am GMT

13 Jun 2022

Django community aggregator: Community blog posts

Choosing a Django Version

One of the first things you need to do when starting a new Django project is to choose which version of Django you are going to use. At any given time, there could be as many as three supported Django versions available to choose from.

I'm excluding pre-releases, which should only be used for testing.

What are you optimizing for?

The first question to ask yourself is what are you optimizing for. When we are building applications for our clients we are almost always optimizing to reduce costs, both during the initial development process and in support post-launch. For this reason, we are choosing versions which meet the goals of maximizing interoperability with the Django ecosystem and minimizing maintenance (aka toil) down-the-road. Unless you are working on a hobby project and want to play with new features, I'd argue these goals are universal.

Notes on Django Versioning

Django does not strictly follow the semantic versioning standard. Versions are always made up of three numbers (major.minor.patch). Minor version 2 is always an LTS version. After the LTS, the major version is incremented and two minor releases are made on it (x.0 and x.1). For example, the 3.2 LTS was followed by 4.0 and 4.1, with 4.2 as the next LTS. Patch versions are released for bugs found on any supported version. For more info, see DEP-0004.

Check Supported Versions to see the current LTS cycle. As of June 2022, it looks like this:

[Gantt chart of Django supported versions]

Minimizing Maintenance

The best way to minimize future maintenance is to choose an LTS release. The promise of not having to do a major version upgrade for up to three years means fewer upgrade cycles and less maintenance work.

Maximizing Interoperability

It's rare to build a Django application that doesn't depend on other open source Django libraries. Coming across a library that's the perfect fit for your project, only to find it's not compatible with your version of Django, will slow your progress. Being a good open source community member and submitting a patch is the right thing to do, but it takes time and effort that could otherwise be spent on your application.

Open source projects are usually maintained by volunteers who are working in their free time without compensation. As a result, support can lag behind official Django releases. If you are starting development shortly after a new LTS release, it may make more sense to use the previous LTS which should have wide adoption and still have close to 9 months of support remaining. If you take this route, read the new LTS release notes carefully to future-proof your project for the jump to the next LTS. In all cases, make sure your code runs without any deprecation warnings.
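
A handy way to surface those warnings is to run your test suite with Python warnings enabled, which Django's own documentation recommends:

$ python -Wa manage.py test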

Another reason to avoid jumping on a brand new LTS is to avoid any bugs that weren't caught during testing. I generally prefer to wait for the first patch release to adopt a new LTS version (4.2.1 instead of 4.2.0). These usually show up about a month after the initial release and contain bug fixes found in the wild.

Exceptions

As with all coding rules-of-thumb, there are exceptions, but avoid getting sucked into the "it's new-and-shiny" trap. If a newer version of Django has a killer feature that is core to your application, that might be a good reason to jump off an LTS temporarily. An example of this might be the introduction of native JSON support for Postgres in Django 1.9.

If you're using minimal dependencies or know all your dependencies support a new LTS version, that might be a good reason to start using an LTS that was recently released. It's easy to paint yourself into a corner with this approach, however. As your project comes to life, you may find you actually do need a library you didn't expect when you started.

Finally, if you're developing as a hobby or to learn something new, use whatever version you want! Not every development project needs to be dictated by cost or what's best for the business.


TL;DR: Use the latest LTS which has received at least one patch release. The version number will be in the format x.2.z where z > 0. This usually isn't the latest version you can download from PyPI.
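
In requirements terms, that rule of thumb might look like the following pin (illustrative for the 3.2 LTS era):

Django>=3.2.1,<4.0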

13 Jun 2022 11:57pm GMT