16 Dec 2025

Timur Kristóf: How do graphics drivers work?

I'd like to give an overview on how graphics drivers work in general, and then write a little bit about the Linux graphics stack for AMD GPUs. The intention of this post is to clear up a bunch of misunderstandings that people have on the internet about open source graphics drivers.

What is a graphics driver?

A graphics driver is a piece of software written to let programs on your computer access the features of your GPU. Every GPU is different and may have different capabilities or different ways of achieving things, so they need different drivers, or at least different code paths in a driver that may handle multiple GPUs from the same vendor and/or the same hardware generation.

The main motivation for graphics drivers is to allow applications to utilize your hardware efficiently. This enables games to render pretty pixels, scientific apps to calculate stuff, as well as video apps to encode / decode efficiently.

Organization of graphics drivers

Compared to drivers for other hardware, graphics is very complicated because the functionality is very broad and the differences between pieces of hardware can also be vast.

Here is a simplified explanation of how a graphics driver stack usually works. Note that most of the time, these components (or some variation) are bundled together to make them easier to use.

I'll give a brief overview of each component below.

GPU firmware

Most GPUs have additional processors (other than the shader cores) which run firmware responsible for operating the low-level details of the hardware, usually stuff that is too low-level even for the kernel.

The firmware on those processors is responsible for power management, context switching, command processing, display, video encoding/decoding, etc. Among other things, it parses the commands we submit to it, launches shaders, distributes work between the shader cores, and so on.

Some GPU manufacturers are moving more and more functionality to firmware, which means that the GPU can operate more autonomously and less intervention is needed by the CPU. This tendency is generally positive for reducing CPU time spent on programming the GPU (as well as "CPU bubbles"), but at the same time it also means that the way the GPU actually works becomes less transparent.

Kernel driver

You might ask, why not implement all driver functionality in the kernel? Wouldn't it be simpler to "just" have everything in the kernel? The answer is no, mainly because there is a LOT going on which nobody wants in the kernel.

So, usually, the kernel-mode driver (KMD) is only left with some low-level tasks that every user needs:

Userspace driver

Applications interact with userspace drivers instead of the kernel (or the hardware directly). Userspace drivers are compiled as shared libraries and are responsible for implementing one or more specific APIs for graphics, compute or video for a specific family of GPUs. (For example, Vulkan, OpenGL or OpenCL, etc.) Each graphics API has entry points which load the available driver(s) for the GPU(s) in the user's system. The Vulkan loader is an example of this; other APIs have similar components for this purpose.

The main functionality of a userspace driver is to take the commands from the API (for example, draw calls or compute dispatches) and turn them into low-level commands in a binary format that the GPU can understand. In Vulkan, this is analogous to recording a command buffer. Additionally, they utilize a shader compiler to turn a higher level shader language (e.g. GLSL) or bytecode (e.g. SPIR-V) into hardware instructions which the GPU's shader cores can execute.
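
To make this concrete, here is a minimal sketch of what recording a command buffer looks like from the application's side; the userspace driver translates each of these calls into the GPU's own binary command format. The helper below is illustrative only and assumes a device, command pool and compute pipeline created elsewhere:

#include <vulkan/vulkan.h>

/* Sketch: record a tiny compute dispatch into a command buffer. */
VkCommandBuffer record_dispatch(VkDevice device, VkCommandPool pool,
                                VkPipeline compute_pipeline)
{
    VkCommandBufferAllocateInfo alloc_info = {
        .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO,
        .commandPool = pool,
        .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
        .commandBufferCount = 1,
    };
    VkCommandBuffer cmd;
    vkAllocateCommandBuffers(device, &alloc_info, &cmd);

    VkCommandBufferBeginInfo begin_info = {
        .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
    };
    vkBeginCommandBuffer(cmd, &begin_info);

    /* The driver encodes this dispatch as hardware command packets. */
    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, compute_pipeline);
    vkCmdDispatch(cmd, 64, 1, 1);

    vkEndCommandBuffer(cmd);
    return cmd;
}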

Furthermore, userspace drivers also take part in memory management: they basically act as an interface between the memory model of the graphics API and the kernel's memory manager.

The userspace driver calls the aforementioned kernel uAPI to submit the recorded commands to the kernel, which then schedules them and hands them to the firmware to be executed.
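
From the application's point of view, this whole path is triggered by a single queue submission; the call into the kernel uAPI happens inside the driver, hidden behind something like this sketch:

#include <vulkan/vulkan.h>

/* Sketch: hand the recorded commands to the driver. Under the hood, the
 * driver turns this into a call to the kernel uAPI (an ioctl on the DRM
 * device), which schedules the work and passes it to the firmware. */
void submit_and_wait(VkQueue queue, VkCommandBuffer cmd)
{
    VkSubmitInfo submit = {
        .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
        .commandBufferCount = 1,
        .pCommandBuffers = &cmd,
    };
    vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE);
    vkQueueWaitIdle(queue);
}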

Shader compiler

If you've seen a loading screen in your favourite game which told you it was "compiling shaders…" you probably wondered what that's about and why it's necessary.

Unlike CPUs, which have converged to a few common instruction set architectures (ISAs), GPUs are a mess and don't share the same ISA, not even between different GPU models from the same manufacturer. Although most modern GPUs have converged to SIMD-based architectures, the ISA is still very different between manufacturers and it still changes from generation to generation (sometimes different chips of the same generation have slightly different ISAs). GPU makers keep adding new instructions when they identify new ways to implement some features more effectively.

To deal with all that mess, graphics drivers have to do online compilation of shaders (as opposed to offline compilation which usually happens for apps running on your CPU).

This means that shaders have to be recompiled when the userspace graphics driver is updated either because new functionality is available or because bug fixes were added to the driver and/or compiler.
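
As a small illustration, this is roughly where online compilation is triggered in Vulkan: the application hands SPIR-V to the driver, and the driver's shader compiler lowers it to the GPU's ISA, typically when a pipeline using the module is created. The helper name below is made up:

#include <stddef.h>
#include <stdint.h>
#include <vulkan/vulkan.h>

/* Sketch: the application hands SPIR-V bytecode to the driver here. */
VkShaderModule upload_spirv(VkDevice device, const uint32_t *spirv,
                            size_t size_bytes)
{
    VkShaderModuleCreateInfo info = {
        .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
        .codeSize = size_bytes,
        .pCode = spirv,
    };
    VkShaderModule module = VK_NULL_HANDLE;
    vkCreateShaderModule(device, &info, NULL, &module);
    return module;
}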

But I only downloaded one driver!

On some systems (especially proprietary operating systems like Windows), GPU manufacturers intend to make users' lives easier by offering all of the above in a single installer package, which is just called "the driver".

Typically such a package includes:

But I didn't download any drivers!

On some systems (typically on open source systems like Linux distributions), you can usually already find a set of packages to handle most common hardware, so you can use most functionality out of the box without needing to install anything manually.

Neat, isn't it?

However, on open source systems, the graphics stack is more transparent, which means that there are many parts that are scattered across different projects, and in some cases there is more than one driver available for the same HW. To end users, it can be very confusing.

However, this doesn't mean that open source drivers are designed worse. It is just that, due to their community-oriented nature, they are organized differently.

One of the main sources of confusion is that various Linux distributions mix and match different versions of the kernel with different versions of different user-mode drivers (UMDs), which means that users of different distros can get a wildly different user experience based on the choices made for them by the developers of the distro.

Another source of confusion is that we driver developers are really, really bad at naming things, so sometimes different projects end up having the same name, or some projects have nonsensical or outdated names.

The Linux graphics stack

In the next post, I'll continue this story and discuss how the above applies to the open source Linux graphics stack.

16 Dec 2025 12:09am GMT

Hari Rana: Please Fund My Continued Accessibility Work on GNOME!

Hey, I have been under distress lately due to personal circumstances that are outside my control. I cannot find a permanent job that allows me to function; I am not eligible for government benefits; my grant proposals to work on free and open-source projects got rejected; and paid internships are quite difficult to find, especially when many of them prioritize new contributors. Essentially, I have no stable, monthly income that allows me to sustain myself.

Nowadays, I mostly volunteer to improve accessibility throughout GNOME apps, either by enhancing the user experience for people with disabilities, or by enabling them to use the apps at all. I helped make most of GNOME Calendar accessible with a keyboard and screen reader, with additional ongoing effort involving merge requests !564 and !598 to make the month view accessible, all of which is an effort no company has ever contributed to, or would ever contribute to financially. These merge requests require literally thousands of hours for research, development, and testing, enough to sustain me for several years if I were employed.

I would really appreciate any kind of donation, especially ones that happen periodically and increase my monthly income. These donations will allow me to sustain myself while allowing me to work on accessibility throughout GNOME, essentially 'crowdfunding' development without doing it on behalf of the GNOME Foundation or another organization.

Donate on Liberapay

Support on Ko-fi

Sponsor on GitHub

16 Dec 2025 12:00am GMT

13 Dec 2025

Sebastian Wick: Flatpak Pre-Installation Approaches

Together with my then-colleague Kalev Lember, I recently added support for pre-installing Flatpak applications. It sounds fancy, but it is conceptually very simple: Flatpak reads configuration files from several directories to determine which applications should be pre-installed. It then installs any missing applications and removes any that are no longer supposed to be pre-installed (with some small caveats).

For example, the following configuration tells Flatpak to install the devel branch of the app org.test.Foo from remotes which serve the collection org.test.Collection, and the app org.test.Bar from any remote:

[Flatpak Preinstall org.test.Foo]
CollectionID=org.test.Collection
Branch=devel

[Flatpak Preinstall org.test.Bar]

By dropping in another configuration file with a higher priority, pre-installation of the app org.test.Foo can be disabled:

[Flatpak Preinstall org.test.Foo]
Install=false

The installation procedure is the same as it is for the flatpak-install command. It supports installing from remotes and from side-load repositories, which is to say from a repository on a filesystem.

This simplicity also means that system integrators are responsible for assembling all the parts into a functioning system, and that there are a number of choices that need to be made for installation and upgrades.

The simplest way to approach this is to just ship a bunch of config files in /usr/share/flatpak/preinstall.d and config files for the remotes from which the apps are available. In the installation procedure, flatpak-preinstall is called and it will download the Flatpaks from the remotes over the network into /var/lib/flatpak. This works just fine, until someone needs one of those apps but doesn't have a suitable network connection.
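
For illustration, a remote configuration in the .flatpakrepo format could look roughly like this; the remote title and URL are made up, and a real file would normally also carry a GPG key (and a collection ID if one is used):

[Flatpak Repo]
Title=Test Remote
Url=https://example.org/repo/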

The next way one could approach this is exactly the same way, but with a sideload repository on the installation medium which contains the apps that will get pre-installed. The flatpak-preinstall command needs to be pointed at this repository at install time, and the process which creates the installation medium needs to be adjusted to create this repository. The installation process now works without a network connection. System updates are usually downloaded over the network, just as new pre-installed applications will be.

It is also possible to simply skip flatpak-preinstall, and use flatpak-install to create a Flatpak installation containing the pre-installed apps which get shipped on the installation medium. This installation can then be copied over from the installation medium to /var/lib/flatpak in the installation process. It unfortunately also makes the installation process less flexible because it becomes impossible to dynamically build the configuration.

On modern, image-based operating systems, it might be tempting to just ship this Flatpak installation on the image because the flexibility is usually neither required nor wanted. This currently does not work for the simple reason that the default system installation is in /var/lib/flatpak, which is not in /usr which is the mount point of the image. If the default system installation was in the image, then it would be read-only because the image is read-only. This means we could not update or install anything new to the system installation. If we make it possible to have two different system installations - one in the image, and one in /var - then we could update and install new things, but the installation on the image would become useless over time because all the runtimes and apps will be in /var anyway as they get updated.

All of those issues mean that even for image-based operating systems, pre-installation via a sideload repository is not a bad idea for now. It is however also not perfect. The kind of "pure" installation medium which is simply an image now contains a sideload repository. It also means that a factory reset functionality is not possible because the image does not contain the pre-installed apps.

In the future, we will need to revisit these approaches to find a solution that works seamlessly with image-based operating systems and supports factory reset functionality. Until then, we can use the systems mentioned above to start rolling out pre-installed Flatpaks.

13 Dec 2025 5:17pm GMT

24 Nov 2025

Dave Airlie (blogspot): fedora 43: bad mesa update oopsie

F43 picked up the two patches I created to fix a bunch of deadlocks on laptops reported in my previous blog posting. It turns out Vulkan layers have a subtle thing I missed, and I removed a line from the device select layer that only matters if you have another layer, which happens under Steam.

The Fedora update process caught this, but it still got published, which was a mistake; I probably need to give changes like this higher karma thresholds.

I've released a new update https://bodhi.fedoraproject.org/updates/FEDORA-2025-2f4ba7cd17 that hopefully fixes this. I'll keep an eye on the karma.

24 Nov 2025 1:42am GMT

23 Nov 2025

Juan A. Suarez: Major Upgrades to the Raspberry Pi GPU Driver Stack (XDC 2025 Recap)

XDC 2025 happened at the end of September, beginning of October this year, in Kuppelsaal, the historic TU Wien building in Vienna. XDC, The X.Org Developer's Conference, is truly the premier gathering for open-source graphics development. The atmosphere was, as always, highly collaborative and packed with experts across the entire stack.

I was thrilled to present, together with my workmate Ella Stanforth, on the progress we have made in enhancing the Raspberry Pi GPU driver stack. Representing the broader Igalia Graphics Team that works on this GPU, Ella and I detailed the strides we have made in the OpenGL driver, though some of the improvements also affect the Vulkan driver.

The presentation was divided into two parts. In the first one, we talked about the new features that we have implemented, or are still implementing, mainly to make the driver more closely aligned with OpenGL 3.2. Key features explained were 16-bit Normalized Format support, Robust Context support, and the Seamless cubemap implementation.

Beyond these core OpenGL updates, we also highlighted other features, such as NIR printf support, framebuffer fetch or dual source blend, which is important for some game emulators.

The second part focused on specific work done to improve performance. Here, we started with different traces from the popular GFXBench application, and explained the main improvements made throughout the year, with a look at how much each of these changes improved the performance of each benchmark (or on average).

In the end, for some benchmarks we nearly doubled the performance compared to last year. I won't explain each of the changes here, but I encourage the reader to watch the talk, which is already available.

For those that prefer to check the slides instead of the full video, you can view them here:

Outside of the technical track, the venue's location provided some excellent downtime opportunities to have lunch at different nearby places. I need to highlight one here that I really enjoyed: An's Kitchen Karlsplatz. This cozy Vietnamese street food spot quickly became one of my favourite places, and I went there a couple of times.

On the last day, I also had the opportunity to visit some of the most recommendable sightseeing spots in Vienna. Of course, one needs more than half a day for a proper visit, but it was at least enough to spark an interest in coming back for a full visit to the city.

Meanwhile, I would like to thank all the conference organizers, as well as all the attendees, and I look forward to seeing them again.

23 Nov 2025 11:00pm GMT

17 Nov 2025

Lennart Poettering: Mastodon Stories for systemd v258

Already on Sep 17 we released systemd v258 into the wild.

In the weeks leading up to that release I have posted a series of serieses of posts to Mastodon about key new features in this release, under the #systemd258 hash tag. It was my intention to post a link list here on this blog right after completing that series, but I simply forgot! Hence, in case you aren't using Mastodon, but would like to read up, here's a list of all 37 posts:

I intend to do a similar series of serieses of posts for the next systemd release (v259), hence if you haven't left tech Twitter for Mastodon yet, now is the opportunity.

We intend to shorten the release cycle a bit for the future, and in fact managed to tag v259-rc1 already yesterday, just 2 months after v258. Hence, my series for v259 will begin soon, under the #systemd259 hash tag.

In case you are interested, here is the corresponding blog story for systemd v257, and here for v256.

17 Nov 2025 11:00pm GMT

Rodrigo Siqueira: XDC 2025

It has been a long time since I published any update in this space. Since this was a year of colossal changes for me, maybe it is also time for me to make something different with this blog and publish something just for a change - why not start talking about XDC 2025?

This year, I attended XDC 2025 in Vienna as an Igalia developer. I was thrilled to see some faces from people I worked with in the past and people I'm working with now. I had a chance to hang out with some folks I worked with at AMD (Harry, Alex, Leo, Christian, Shashank, and Pierre), many Igalians (Žan, Job, Ricardo, Paulo, Tvrtko, and many others), and finally some developers from Valve. In particular, I met Tímur in person for the first time, even though we have been talking for months about GPU recovery. Speaking of GPU recovery, we held a workshop on this topic together.

The workshop was packed with developers from different companies, which was nice because it brought different angles to the topic. We began our discussion by focusing on job resubmission. Christian started by sharing a brief history of how the AMDGPU driver started handling resubmission and the associated issues. After learning from past experience, amdgpu ended up adopting the following approach (sketched in code after the list):

  1. When a job causes a hang, call the driver-specific handler.
  2. Stop the scheduler.
  3. Copy all jobs from the ring buffer, minus the job that caused the issue, to a temporary ring.
  4. Reset the ring buffer.
  5. Copy back the other jobs to the ring buffer.
  6. Resume the scheduler.
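
Here is a schematic C sketch of that flow; every type and helper name in it is a made-up placeholder, not the actual amdgpu or DRM scheduler API:

#define MAX_RING_JOBS 64

/* Schematic only: all names below are hypothetical placeholders. */
void recover_ring(struct gpu_ring *ring, struct gpu_job *bad_job)
{
        struct gpu_job *saved[MAX_RING_JOBS];
        struct gpu_job *job;
        int i, n = 0;

        gpu_driver_hang_handler(ring, bad_job);         /* step 1 */
        gpu_scheduler_stop(ring->sched);                /* step 2 */

        /* step 3: keep every pending job except the offender */
        for_each_pending_job(job, ring)
                if (job != bad_job)
                        saved[n++] = job;

        gpu_ring_reset(ring);                           /* step 4 */

        for (i = 0; i < n; i++)                         /* step 5 */
                gpu_ring_emit_job(ring, saved[i]);

        gpu_scheduler_start(ring->sched);               /* step 6 */
}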

Below, you can see one crucial series associated with the amdgpu recovery implementation:

The next topic was a discussion around the replacement of drm_sched_resubmit_jobs() since this function became deprecated. Just a few drivers still use this function, and they need a replacement for that. Some ideas were floating around to extract part of the specific implementation from some drivers into a generic function. The next day, Philipp Stanner continued to discuss this topic in his workshop, DRM GPU Scheduler.

Another crucial topic discussed was improving GPU reset debuggability to narrow down which operations cause the hang (keep in mind that GPU recovery is a medicine, not the cure to the problem). Intel developers shared their strategy for dealing with this by obtaining hints from userspace, which helped them provide a better set of information to append to the devcoredump. AMD could adopt this alongside dumping the IB data into the devcoredump (I am already investigating this).

Finally, we discussed strategies to avoid regressions of hang issues. In summary, we have two lines of defense:

Lightning talk

This year, as always, XDC was super cool, packed with many engaging presentations which I highly recommend everyone check out. If you are interested, check the schedule and the presentation recordings available on the X.Org Foundation YouTube page. Anyway, I hope this blog post marks the inauguration of a new era for this site, where I will start posting more content ranging from updates to tutorials. See you soon.

17 Nov 2025 12:00am GMT

15 Nov 2025

Simon Ser: Status update, November 2025

Hi!

This month a lot of new features have been added to the Goguma mobile IRC client. Hubert Hirtz has implemented drafts so that unsent text gets saved and network disconnections don't disrupt users typing a message. He also enabled replying to one's own messages, changed the appearance of short messages containing only emoji, upgraded our emoji library to Unicode version 16, fixed some linkifier bugs and added unit tests.

Markus Cisler has added a new option in the message menu to show a user's profile. I've added an on-disk cache for images (with our own implementation, because the widely used cached_network_image package is heavyweight). I've been working on displaying network icons and blocking users, but that work is not finished yet. I've also contributed some maintenance fixes for our webcrypto.dart dependency (toolkit upgrades and CI fixes).

The soju IRC bouncer has also got some love this month. delthas has contributed support for labeled-response for soju clients, allowing more reliable matching of server replies with client commands. I've introduced a new icon directive to configure an image representing the bouncer. soju v0.10.0 has been released, followed by soju v0.10.1 including bug fixes from Karel Balej and Taavi Väänänen.

In Wayland news, wlroots v0.19.2 and v0.18.3 have been released thanks to Simon Zeni. I've added support for the color-representation protocol for the Vulkan renderer, allowing clients to configure the color encoding and range for YCbCr content. Félix Poisot has been hard at work with more color management patches: screen default color primaries are now extracted from the EDID and exposed to compositors, the cursor is now correctly converted to the output's primaries and transfer function, and some work-in-progress patches switch the renderer API from a descriptive model to a prescriptive model.

go-webdav v0.7.0 has been released with a patch from prasad83 to play well with Thunderbird. I've updated clients to make multi-status errors non-fatal, returning partial data alongside the error.

I've released drm_info v2.9.0 with improvements mentioned in the previous status update plus support for the TILE connector property.

See you next month!

15 Nov 2025 10:00pm GMT

10 Nov 2025

Dave Airlie (blogspot): a tale of vulkan/nouveau/nvk/zink/mutter + deadlocks

I had a bug appear in my email recently which led me down a rabbit hole, and I'm going to share it for future people wondering why we can't have nice things.

Bug:

1. Get an intel/nvidia (newer than Turing) laptop.

2. Log in to GNOME on Fedora 42/43

3. Hotplug an HDMI port that is connected to the NVIDIA GPU.

4. Desktop stops working.

My initial reproduction got me a hung mutter process with a nice backtrace which pointed at the Vulkan Mesa device selection layer, trying to talk to the wayland compositor to ask it what the default device is. The problem was that the process was the wayland compositor itself, so how was this ever supposed to work? The Vulkan device selection was called because zink called EnumeratePhysicalDevices, and zink was being loaded because we recently switched to it as the OpenGL driver for newer NVIDIA GPUs.

I looked into zink and the device select layer code, and lo and behold, someone had already hacked around this badly, and probably wrongly; I've no idea what the code does, because I think there is at least one logic bug in it. Nice things can't be had because hacks were done instead of just solving the problem.

The hacks in place ensured that, under certain circumstances involving zink/xwayland, the device select code that probes the window system was disabled, due to deadlocks that had been seen. I'd no idea if more hacks were going to help, so I decided to step back and try to work out something better.

The first question I had was why WAYLAND_DISPLAY is set inside the compositor process; it is, and if it wasn't I would never hit this. It's pretty likely that on the initial compositor start this env var isn't set, so the problem only becomes apparent when the compositor gets a hotplugged GPU output and goes to load the OpenGL driver, zink, which enumerates and hits device select with the env var set, and deadlocks.

I wasn't going to figure out a way around WAYLAND_DISPLAY being set at this point, so I leave the above question as an exercise for mutter devs.

How do I fix it?

Attempt 1:

At the point where zink is loading in mesa for this case, we have the file descriptor of the GPU device that we want to load a driver for. We don't actually need to enumerate all the physical devices; we could just find the ones for that fd. There is no API for this in Vulkan. I wrote an initial proof-of-concept instance extension called VK_MESA_enumerate_devices_fd. I wrote initial loader code to play with it, and wrote zink code to use it. Because this is a new instance API, device-select will also ignore it. However, this ran into a big problem in the Vulkan loader. The loader is designed around internal assumptions that PhysicalDevices will enumerate in similar ways, and it has to trampoline PhysicalDevice handles to underlying driver pointers so that if an app enumerates once, and enumerates again later, the PhysicalDevice handles remain consistent for the first user. There is a lot of code, and I've no idea how hotplug GPUs might fail in such situations. I couldn't find a decent path forward without knowing a lot more about the Vulkan loader. I believe this is the proper solution, as we know the fd, so we should be able to get things without doing a full enumeration and then picking the answer using the fd info. I've asked the Vulkan WG to take a look at this, but I still need to fix the bug.

Attempt 2:

Maybe I can just turn off device selection, like the current hacks do, but in a better manner. Enter VK_EXT_layer_settings. This extension allows layers to expose a layer setting at instance creation. I can have the device select layer expose a setting which says don't touch this instance. Then in the zink code where we have a file descriptor being passed in and create an instance, we set the layer setting to avoid device selection. This seems to work, but it has some caveats I need to consider; I think it should be fine.
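
As a rough illustration of that approach, this is what opting out via VK_EXT_layer_settings can look like at instance creation. The setting name and value below are placeholders, not necessarily what the device select layer ends up exposing:

#include <vulkan/vulkan.h>

/* Setting name/value are placeholders; check the actual layer. */
static const VkBool32 skip_device_select = VK_TRUE;
static const VkLayerSettingEXT settings[] = {
    {
        .pLayerName = "VK_LAYER_MESA_device_select",
        .pSettingName = "disable_device_select",
        .type = VK_LAYER_SETTING_TYPE_BOOL32_EXT,
        .valueCount = 1,
        .pValues = &skip_device_select,
    },
};

VkResult create_instance_without_device_select(VkInstance *out)
{
    const VkLayerSettingsCreateInfoEXT layer_settings = {
        .sType = VK_STRUCTURE_TYPE_LAYER_SETTINGS_CREATE_INFO_EXT,
        .settingCount = 1,
        .pSettings = settings,
    };
    const VkInstanceCreateInfo info = {
        .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
        .pNext = &layer_settings,
    };
    return vkCreateInstance(&info, NULL, out);
}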

zink uses a single VkInstance for its device screen. This is shared between all pipe_screens. Now I think this is fine inside a compositor, since we shouldn't ever be loading zink via the non-fd path, and I hope for most use cases it will work fine, better than the current hacks and better than some other ideas we threw around. The code for this is in [1].

What else might be affected:

If you have a vulkan compositor, it might be worth setting the layer setting if the mesa device select layer is loaded, especially if you set DISPLAY/WAYLAND_DISPLAY and do any sort of hotplug later. You might be safe if you EnumeratePhysicalDevices early enough; the reason it's a big problem in mutter is that it doesn't use Vulkan, it uses OpenGL, and we only enumerate Vulkan physical devices at runtime through zink, never at startup.

I think AMD and NVIDIA have proprietary device selection layers; these might also deadlock in similar ways. I think we've seen some weird deadlocks in NVIDIA driver enumerations as well that might be a similar problem.

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38252

10 Nov 2025 3:16am GMT

04 Nov 2025

Sebastian Wick: Flatpak Happenings

Yesterday I released Flatpak 1.17.0. It is the first version of the unstable 1.17 series and the first release in 6 months. There are a few things which didn't make it for this release, which is why I'm planning to do another unstable release rather soon, and then a stable release still this year.

Back at LAS this year I talked about the Future of Flatpak and I started with the grim situation the project found itself in: Flatpak was stagnant, the maintainers left the project and PRs didn't get reviewed.

Some good news: things are a bit better now. I have taken over maintenance, Alex Larsson and Owen Taylor managed to set aside enough time to make this happen, and Boudhayan Bhattacharya (bbhtt) and Adrian Vovk also got more involved. The backlog has been reduced considerably and new PRs get reviewed in a reasonable time frame.

I also listed a number of improvements that we had planned, and we made progress on most of them:

Besides the changes directly in Flatpak, there are a lot of other things happening around the wider ecosystem:

What I have also talked about at my LAS talk is the idea of a Flatpak-Next project. People got excited about this, but I feel like I have to make something very clear:

If we redid Flatpak now, it would not be significantly better than the current Flatpak! You could still not do nested sandboxing, you would still need a D-Bus proxy, you would still have a complex permission system, and so on.

Those problems require work outside of Flatpak, but have to integrate with Flatpak and Flatpak-Next in the future. Some of the things we will be doing include:

So if you're excited about Flatpak-Next, help us to improve the Flatpak ecosystem and make Flatpak-Next more feasible!

04 Nov 2025 8:28pm GMT

03 Nov 2025

Melissa Wen: Kworkflow at Kernel Recipes 2025

Frank's drawing of Melissa Wen with Kernel Recipes mascots around

This was the first year I attended Kernel Recipes, and I have nothing to say but how much I enjoyed it and how grateful I am for the opportunity to talk more about kworkflow to very experienced kernel developers. What I like most about Kernel Recipes is its intimate format, with only one track and many moments to get closer to experts and people you usually only talk to online during the year.

At the beginning of this year, I gave the talk Don't let your motivation go, save time with kworkflow at FOSDEM, introducing kworkflow to a more diverse audience with different levels of involvement in Linux kernel development.

At this year's Kernel Recipes I presented the second talk of the first day: Kworkflow - mix & match kernel recipes end-to-end.

The Kernel Recipes audience is a bit different from FOSDEM's, consisting mostly of long-term kernel developers, so I decided to go straight to the point. I showed kworkflow as part of the daily life of a typical kernel developer, from the local setup for installing a custom kernel on different target machines to sending and applying patches to/from the mailing list. In short, I showed how to mix and match kernel workflow recipes end-to-end.

As I was a bit fast when showing some features during my presentation, in this blog post I explain each slide from my speaker notes. You can see a summary of this presentation in the Kernel Recipe Live Blog Day 1: morning.


Introduction

First slide: Kworkflow by Melissa Wen

Hi, I'm Melissa Wen from Igalia. As we already started sharing kernel recipes and even more is coming in the next three days, in this presentation I'll talk about kworkflow: a cookbook to mix & match kernel recipes end-to-end.

Second slide: About Melissa Wen, the speaker of this talk

This is my first time attending Kernel Recipes, so lemme introduce myself briefly.

Slide 3: and what's this cookbook called Kwokflow? - with kworkflow logo and KR penguin

And what's this cookbook called kworkflow?

Kworkflow (kw)

Slide 4: text below

Kworkflow is a tool created by Rodrigo Siqueira, my colleague at Igalia. It's a single platform that combines software and tools to:

Slide 5: kworkflow is mostly a voluntary work

It's mostly done by volunteers, kernel developers using their spare time. Its features cover real use cases according to kernel developer needs.

Slide 6: Mix & Match the daily life of a kernel developer

Basically it's mixing and matching the daily life of a typical kernel developer with kernel workflow recipes with some secret sauces.

First recipe: A good GPU driver for my AMD laptop

Slide 7: Let's prepare our first recipe

So, it's time to start the first recipe: A good GPU driver for my AMD laptop.

Slide 8: Ingredients and Tools

Before starting any recipe we need to check the necessary ingredients and tools. So, let's check what you have at home.

With kworkflow, you can use:

Slide 9: kw device and kw remote

Slide 11: kw config

Slide 13: kw kernel-config-manager

Slide 15: Preparation

Now, with all ingredients and tools selected and well portioned, follow the right steps to prepare your custom kernel!

First step: Mix ingredients with kw build or just kw b

Slide 16: kw build

Second step: Bake it with kw deploy or just kw d

Slide 18: kw deploy

After compiling the custom kernel, we want to install it on the target machine. Check the name of the custom kernel we built (6.17.0-rc6), then use kw s to SSH into the target machine and see that it's still running the kernel from the Debian distribution (6.16.7+deb14-amd64).

As with build settings, you can also pre-configure some deployment settings, such as the compression type, the path to device tree binaries, the target machine (remote, local, vm), whether you want to reboot the target machine just after deploying your custom kernel, and whether you want to boot into the custom kernel when restarting the system after deployment.

If you didn't pre-configure some options, you can still customize them as command options. For example, kw d --reboot will reboot the system after deployment, even if I didn't set this in my preferences.

By just running kw d --reboot I have installed the kernel on a given target machine and rebooted it. So when accessing the system again I can see it was booted into my custom kernel.

Third step: Time to taste with kw debug

Slide 20: kw debug

Cooking Problems

Slide 22: kw patch-hub

Oh no! That custom kernel isn't tasting good. Don't worry: as in many recipe preparations, we can search the internet to find suggestions on how to make it tasty, alternative ingredients and other flavours according to your taste.

With kw patch-hub you can search the lore kernel mailing lists for possible patches that can fix your kernel issue. You can navigate the mailing lists, check series, bookmark those you find relevant and apply them in your local kernel tree, creating a different branch for tasting… oops, for testing. In this example, I'm opening the amd-gfx mailing list where I can find contributions related to the AMD GPU driver, bookmark and/or just apply the series to my work tree, and with kw bd I can compile & install the custom kernel with this possible bug fix in one shot.

As I changed my kw config to reboot after deployment, I just need to wait for the system to boot to try unloading the amdgpu driver again with kw debug --dmesg --cm=modprobe -r amdgpu. From the dmesg output retrieved by kw for this command, the driver was unloaded, the problem is fixed by this series, and the kernel tastes good now.

If I'm satisfied with the solution, I can even use kw patch-hub to access the bookmarked series and mark the checkbox that will reply to the patch thread with a Reviewed-by tag for me.

Second Recipe: Raspberry Pi 4 with Upstream Kernel

Slide 25: Second Recipe RPi 4 with upstream kernel

As in all recipes, we need ingredients and tools, but with kworkflow you can get everything set up as quickly as a scene change in a TV show. We can use kw env to switch to a different environment with all the kw and kernel configuration set, and with the latest compiled kernel cached.

I was preparing the first recipe for an x86 AMD laptop, and with kw env --use RPI_64 I use the same worktree but move to a different kernel workflow, now for the 64-bit Raspberry Pi 4. The previously compiled kernel 6.17.0-rc6-mainline+ is there with 1266 modules, not the 6.17.0-rc6 kernel with 285 modules that I just built & deployed. kw build settings are also different: now I'm targeting the arm64 architecture with a cross-compiled kernel using the aarch64-linux-gnu- cross-compilation toolchain, and my kernel image is now called kernel8.

Slide 27: kw env

If you didn't plan for this recipe in advance, don't worry. You can create a new environment with kw env --create RPI_64_V2 and run kw init --template to start preparing your kernel recipe with the mirepoix ready.

I mean, with the basic ingredients already cut…

I mean, with the kw configuration set from a template.

And you can use kw remote to set the IP address of your target machine and kw kernel-config-manager to fetch/retrieve the .config file from your target machine. So just run kw bd to compile and install an upstream kernel for Raspberry Pi 4.

Third Recipe: The Mainline Kernel Ringing on my Steam Deck (Live Demo)

Slide 30: Third Recipe - The Mainline Kernel Ringing on my Steam Deck

Let's show you how easy it is to build, install and test a custom kernel for the Steam Deck with Kworkflow. It's a live demo, but I also recorded it because I know the risks I'm exposed to and something can go very wrong just because of reasons :)

Report: how was the live demo

For this live demo, I took my OLED Steam Deck to the stage. I explained that, if I boot the mainline kernel on this device, there is no audio. So I turned it on and booted the mainline kernel I had installed beforehand. It was clear that there was no typical Steam Deck startup audio when the system was loaded.

Frank's drawing of Melissa Wen doing a demo of kworkflow with the Steam Deck

As I started the demo in the kw environment for Raspberry Pi 4, I first moved to another environment previously used for Steam Deck. In this STEAMDECK environment, the mainline kernel was already compiled and cached, and all settings for accessing the target machine, compiling and installing a custom kernel were retrieved automatically.

My live demo followed these steps:

  1. With kw env --use STEAMDECK, switch to a kworkflow environment for Steam Deck kernel development.

  2. With kw b -i, show that kw will compile and install a kernel with 285 modules, named 6.17.0-rc6-mainline-for-deck.

  3. Run kw config to show that, in this environment, the kw configuration changes to the x86 architecture, without cross-compilation.

  4. Run kw device to display information about the Steam Deck device, i.e. the target machine. It also proves that the remote access - user and IP - for this Steam Deck was already configured when using the STEAMDECK environment, as expected.

  5. Using git am, as usual, apply a hot fix on top of the mainline kernel. This hot fix makes the audio play again on Steam Deck.

  6. With kw b, build the kernel with the audio change. It will be fast because we are only compiling the affected files, since everything was previously done and cached. The compiled kernel, kw configuration and kernel configuration are retrieved by just moving to the "STEAMDECK" environment.

  7. Run kw d --force --reboot to deploy the new custom kernel to the target machine. The --force option enables us to install the mainline kernel even if mkinitcpio complains about missing support for downstream packages when generating initramfs. The --reboot option makes the device reboot the Steam Deck automatically, just after the deployment completion.

  8. After finishing deployment, the Steam Deck will reboot into the new custom kernel version and make a clear resonant or vibrating sound. [Hopefully]

Finally, I showed the audience that, if I wanted to send this patch upstream, I just needed to run kw send-patch, and kw would automatically add subsystem maintainers, reviewers and mailing lists for the affected files as recipients, and send the patch for upstream community assessment. As I didn't want to create unnecessary noise, I just did a dry run with kw send-patch -s --simulate to explain how it looks.

What else can kworkflow already mix & match?

In this presentation, I showed that kworkflow supports different kernel development workflows, i.e., multiple distributions, different bootloaders and architectures, different target machines and different debugging tools, and that it automates best practices in your kernel development routine, from development environment setup and verifying a custom kernel on bare metal to sending contributions upstream following the contributions-by-email structure. I exemplified it with three different target machines: my ordinary x86 AMD laptop with Debian, a Raspberry Pi 4 with arm64 Raspbian (cross-compilation) and the Steam Deck with SteamOS (an x86 Arch-based OS). Besides those distributions, Kworkflow also supports Ubuntu, Fedora and PopOS.

Now it's your turn: Do you have any secret recipes to share? Please share with us via kworkflow.


Useful links

03 Nov 2025 9:30pm GMT

31 Oct 2025

Mike Blumenkrantz: Hibernate On

Take A Break

We've reached Q4 of another year, and after the mad scramble that has been crunch-time over the past few weeks, it's time for SGC to once again retire into a deep, restful sleep.

2025 saw a lot of ground covered:

It's been a real roller coaster ride of a year as always, but I can say authoritatively that fans of the blog, you need to take care of yourselves. You need to use this break time wisely. Rest. Recover. Train your bodies. Travel and broaden your horizons. Invest in night classes to expand your minds.

You are not prepared for the insanity that will be this blog in 2026.

31 Oct 2025 12:00am GMT

27 Oct 2025

Mike Blumenkrantz: Apitrace Goes Vroom

First Time

Today marks the first post of a type that I've wanted to have for a long while: a guest post. There are lots of graphics developers who work on cool stuff and don't want to waste time setting up blogs, but with enough cajoling they will write a single blog post. If you're out there thinking you just did some awesome work and you want the world to know the grimy, gory details, let me know.

The first victim - er, recipient - of this honor is an individual famous for small and extremely sane endeavors such as descriptor buffers in Lavapipe, ray tracing in Lavapipe, and sparse support in Lavapipe. Also wrangling ray tracing for RADV.

Below is the debut blog post by none other than Konstantin Seurer.

What is apitrace?

Apitrace is a powerful tool for capturing and replaying traces of GL and DX applications. The problem is that it is not really suitable for performance testing. This blog post is about implementing a faster method for replaying traces.

About six weeks ago, Mike asked me if I wanted to work on this.

[6:58:58 pm] <zmike> on the topic of traces
[6:59:08 pm] <zmike> I have a longer-term project that could use your expertise
[6:59:19 pm] <zmike> it's low work but high complexity
[7:00:12 pm] <zmike> specifically I would like apitrace to be able to competently output C code from traces and to have this functionality merged upstream

low work

Sure. clueless.png

The state of glretrace

The first obvious step was measuring how glretrace currently performs. Mike kindly provided a couple of traces from his personal collection, and I immediately timed a trace of the only relevant OpenGL game:

$ time ./glretrace -b minecraft-perf.trace
/Users/Cortex/Downloads/graalvm-jdk-23.0.1+11.1/bin/java
Rendered 1261 frames in 10.4269 secs, average of 120.937 fps

real    0m10.554s
user    0m12.938s
sys     0m2.712s

This looks fine, but I have no idea how fast the application is supposed to run. Running the same trace with perf reveals that there is room for improvement.

trace_parse_time.png

2/3 of frametime is spent parsing the trace.

Implementation

An apitrace trace stores API call information in an object-oriented style. This makes basic codegen really easy because the objects map directly to the generated C/C++ code. However, not all API calls are made equal, and the countless special cases that I needed to handle are what made this project take so long.

glretrace has custom implementations for WSI API calls, and it would be a shame not to use them. The easiest way of doing that is generating a shared library instead of an executable and having glretrace load it. The shared library can then provide a bunch of callbacks for the call sequences we can do codegen for and Call objects for everything else.

Besides WSI, there are also arguments and return values that need special treatment. OpenGL allows the application to create all kinds of objects that are represented using IDs. Those IDs are assigned by the driver, and they can be different during replay. glretrace remaps them using std::maps, which have non-trivial overhead. I initially did that as well for the codegen to get things up and running, but it is actually possible to emit global variables and have most of the remapping logic run during codegen.
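
To give an idea of the shape of the output, generated replay code with the ID remapping baked in might look something like this (a made-up fragment, not actual apitrace output):

#include <stddef.h>
#include <GL/gl.h>

/* Hypothetical generated replay code: the texture ID recorded in the
 * trace (42) becomes a global holding whatever ID the replaying driver
 * hands back, so no std::map lookup happens at runtime. */
static GLuint _texture_42;

static void run_call_1037(void)
{
    glGenTextures(1, &_texture_42);
    glBindTexture(GL_TEXTURE_2D, _texture_42);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 256, 256, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);
}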

Data streaming

With the main replay overhead taken care of, a large share of replay time is now spent loading texture and buffer data. Large traces can also contain >10 GiB of data, so loading everything upfront is not an option. I decided to create one thread for reading the data file and nproc decompression threads. The read thread will wait if enough data has been loaded, to limit memory usage. Decompression threads are needed because decompression is slower than reading the compressed data.
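
A minimal sketch of that back-pressure idea (heavily simplified, not the actual apitrace code):

#include <pthread.h>
#include <stddef.h>

#define MAX_BUFFERED (256u << 20)   /* cap in-flight data at 256 MiB */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  can_read = PTHREAD_COND_INITIALIZER;
static size_t buffered;             /* bytes read but not yet replayed */

/* Read thread: block while too much undecoded data sits in memory. */
void reserve_bytes(size_t n)
{
    pthread_mutex_lock(&lock);
    while (buffered + n > MAX_BUFFERED)
        pthread_cond_wait(&can_read, &lock);
    buffered += n;
    pthread_mutex_unlock(&lock);
}

/* Replay side: release data once it has been consumed. */
void release_bytes(size_t n)
{
    pthread_mutex_lock(&lock);
    buffered -= n;
    pthread_cond_broadcast(&can_read);
    pthread_mutex_unlock(&lock);
}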

codegen in action

The results speak for themselves:

$ ./glretrace --generate-c minecraft-perf minecraft-perf.trace
/Users/Cortex/Downloads/graalvm-jdk-23.0.1+11.1/bin/java
Rendered 0 frames in 79.4072 secs, average of 0 fps
$ cd minecraft-perf
$ echo "Invoke the superior build tool"
$ meson build --buildtype release
$ ninja -Cbuild
$ time ../glretrace build/minecraft-perf.so
info: Opening 'minecraft-perf.so'... (0.00668795 secs)
warning: Waited 0.0461142 secs for data (sequence = 19)
Rendered 1261 frames in 5.19587 secs, average of 242.693 fps

real    0m5.415s
user    0m5.429s
sys     0m4.983s

Nice.

Looking at perf, most CPU time is now spent in driver code or in streaming binary data for stuff like textures on a separate thread.

result_perf.png

If you are interested in trying this out yourself, feel free to build the upstream PR and report on bugs - I mean, unintended features. It would also be nice to have DX support in the future, but that will be something for the dxvk developers unless I need something to procrastinate on instead of doing RT work.

- Konstantin

27 Oct 2025 12:00am GMT

15 Oct 2025

Simon Ser: Status update, October 2025

Hi!

I skipped last month's status update because I hadn't collected a lot of interesting updates and I've dedicated my time to writing an announcement for the first vali release.

Earlier this month, I've taken the train to Vienna to attend XDC 2025. The conference was great, I really enjoyed discussing face-to-face with open-source graphics folks I usually only interact with online, and meeting new awesome people! Since I'm part of the X.Org Foundation board, it was also nice to see the fruit of our efforts. Many thanks to all organizers!

XDC 2025 main room

We've discussed many interesting topics: a new API for 2D acceleration hardware, adapting the Wayland linux-dmabuf protocol to better support multiple GPUs, some ways to address current Wayback limitations, ideas to improve libliftoff, Vulkan use in Wayland compositors, and a lot more.

On the wlroots side, I've worked on a patch to fallback to the renderer to apply gamma LUTs when the KMS driver doesn't support them (this also paves the way for applying color transforms in KMS). Félix Poisot has updated wlroots to support the gamma 2.2 transfer function and use it by default. llyyr has added support for the BT.1886 transfer function and fixed direct scanout for client using the gamma 2.2 transfer function.

I've sent a patch to add support for DisplayID v2 CTA-861 data blocks, required for handling some HDR screens. I've reviewed and merged a bunch of gamescope patches to avoid protocol errors with the color management protocol, fix nested mode under a Vulkan compositor, fix a crash on VT switch and modernize dependencies.

I've worked a bit on drm_info too. I've added a JSON schema to describe the shape of the JSON objects, made it so EDIDs are included in the JSON output as base64-encoded strings, and added the EDID make/model/serial + bus info to the pretty-printed output.

delthas has added soju support for user metadata, introduced a new work-in-progress metadata key to block users, and made it so soju cancels Web Push notifications if a client marks a message as read (to avoid opening notifications for a very short time when actively chatting with another user). Markus Cisler has revamped Goguma's message bubbles: they look much better now!

See you next month!

15 Oct 2025 10:00pm GMT

10 Oct 2025

Sebastian Wick: SO_PEERPIDFD Gets More Useful

A while ago I wrote about the limited usefulness of SO_PEERPIDFD. for authenticating sandboxed applications. The core problem was simple: while pidfds gave us a race-free way to identify a process, we still had no standardized way to figure out what that process actually was - which sandbox it ran in, what application it represented, or what permissions it should have.

The situation has improved considerably since then.

cgroup xattrs

Cgroups now support user extended attributes. This feature allows arbitrary metadata to be attached to cgroup inodes using standard xattr calls.

We can change flatpak (or snap, or any other container engine) to create a cgroup for application instances it launches, and attach metadata to it using xattrs. This metadata can include the sandboxing engine, application ID, instance ID, and any other information the compositor or D-Bus service might need.
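
As a sketch of what the launcher side could do - the cgroup path layout and the xattr key names below are invented for the example, not an agreed-upon convention:

#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/xattr.h>

/* Hypothetical launcher-side helper: create a per-instance cgroup and
 * tag it with metadata. Path layout and key names are made up. */
int tag_instance_cgroup(const char *app_id, const char *instance_id)
{
    char path[256];

    snprintf(path, sizeof(path),
             "/sys/fs/cgroup/user.slice/app-%s-%s.scope",
             app_id, instance_id);

    /* mkdir on cgroupfs creates a new cgroup. */
    if (mkdir(path, 0755) < 0)
        return -1;

    /* User xattrs on the cgroup directory carry the sandbox metadata. */
    setxattr(path, "user.flatpak.engine", "flatpak", strlen("flatpak"), 0);
    setxattr(path, "user.flatpak.app-id", app_id, strlen(app_id), 0);
    setxattr(path, "user.flatpak.instance-id", instance_id,
             strlen(instance_id), 0);
    return 0;
}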

Every process belongs to a cgroup, and you can query which cgroup a process belongs to through its pidfd - completely race-free.

Standardized Authentication

Remember the complexity from the original post? Services had to implement different lookup mechanisms for different sandbox technologies:

All of this goes away. Now there's a single path:

  1. Accept a connection on a socket
  2. Use SO_PEERPIDFD to get a pidfd for the client
  3. Query the client's cgroup using the pidfd
  4. Read the cgroup's user xattrs to get the sandbox metadata
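
Put together, the service side could look roughly like this; error handling, re-validating the pidfd against PID reuse, and the exact xattr key are simplified or invented here:

#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical service-side lookup for a connecting client's app ID. */
int client_app_id(int client_fd, char *out, size_t out_len)
{
    int pidfd = -1, pid = -1;
    socklen_t optlen = sizeof(pidfd);
    char path[512], cgroup[512] = "", line[256];
    ssize_t n;
    FILE *f;

    /* 1./2. Race-free handle on the connecting process. */
    if (getsockopt(client_fd, SOL_SOCKET, SO_PEERPIDFD, &pidfd, &optlen) < 0)
        return -1;

    /* 3. Which cgroup does it belong to? The pidfd's fdinfo exposes the
     *    PID, and /proc/<pid>/cgroup the unified cgroup path. */
    snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
    f = fopen(path, "r");
    while (f && fgets(line, sizeof(line), f))
        sscanf(line, "Pid: %d", &pid);
    if (f)
        fclose(f);

    snprintf(path, sizeof(path), "/proc/%d/cgroup", pid);
    f = fopen(path, "r");
    if (f && fgets(line, sizeof(line), f))
        sscanf(line, "0::%511s", cgroup);
    if (f)
        fclose(f);

    /* 4. Read the metadata the container engine attached to the cgroup. */
    snprintf(path, sizeof(path), "/sys/fs/cgroup%s", cgroup);
    n = getxattr(path, "user.flatpak.app-id", out, out_len - 1);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return 0;
}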

This works the same way regardless of which sandbox engine launched the application.

A Kernel Feature, Not a systemd One

It's worth emphasizing: cgroups are a Linux kernel feature. They have no dependency on systemd or any other userspace component. Any process can manage cgroups and attach xattrs to them. The process only needs appropriate permissions and is restricted to a subtree determined by the cgroup namespace it is in. This makes the approach universally applicable across different init systems and distributions.

To support non-Linux systems, we might even be able to abstract away the cgroup details, by providing a varlink service to register and query running applications. On Linux, this service would use cgroups and xattrs internally.

Replacing Socket-Per-App

The old approach - creating dedicated wayland, D-Bus, etc. sockets for each app instance and attaching metadata to the service which gets mapped to connections on that socket - can now be retired. The pidfd + cgroup xattr approach is simpler: one standardized lookup path instead of mounting special sockets. It works everywhere: any service can authenticate any client without special socket setup. And it's more flexible: metadata can be updated after process creation if needed.

For compositor and D-Bus service developers, this means you can finally implement proper sandboxed client authentication without needing to understand the internals of every container engine. For sandbox developers, it means you have a standardized way to communicate application identity without implementing custom socket mounting schemes.

10 Oct 2025 3:04pm GMT

09 Oct 2025

Mike Blumenkrantz: Mesh Shaders In The Current Year

It Happened.

Just a quick post to confirm that the OpenGL/ES Working Group has signed off on the release of GL_EXT_mesh_shader.

Credits

This is a monumental release, the largest extension shipped for GL this decade, and the culmination of many, many months of work by AMD. In particular we all need to thank Qiang Yu (AMD), who spearheaded this initiative and did the vast majority of the work both in writing the specification and doing the core mesa implementation. Shihao Wang (AMD) took on the difficult task of writing actual CTS cases (not mandatory for EXT extensions in GL, so this is a huge benefit to the ecosystem).

Big thanks to both of you, and everyone else behind the scenes at AMD, for making this happen.

Also we have to thank the nvidium project and its author, Cortex, for single-handedly pushing the industry forward through the power of Minecraft modding. Stay sane out there.

Support

Minecraft mod support is already underway, so expect that to happen "soon".

The bones of this extension have already been merged into mesa over the past couple of months. I opened an MR to enable zink support this morning since I have already merged the implementation.

Currently, I'm planning to merge the zink MR either just before the branch point next week or once RadeonSI merges its support. This is out of respect: Qiang Yu did a huge lift for everyone here, and ideally AMD's driver should be the first to be able to advertise that extension, to reflect that. But the branch point is coming up in a week, and SGC will be going into hibernation at the end of the month until 2026, so this offer does have an expiration date.

In any case, we're done here.

09 Oct 2025 12:00am GMT