Christian Schaller: Artificial Intelligence and the Linux Community

I have wanted to write this blog post for quite some time, but have been unsure about the exact angle of it. I think I found that angle now, where I can root the post in a very tangible, concrete example.

So the reason I wanted to write this was because I do feel there is a palpable skepticism and negativity towards AI in the Linux community, and I understand that there are societal implications that worry us all, like how deep fakes have the potential to upend a lot of things, from news dissemination to court proceedings. Or how malign forces can use AI to drive narratives in social media, as if social media wasn't toxic enough as it is. But for open source developers like us in the Linux community there are also, I think, deep concerns about tooling that cuts into something so close to the heart of our community: writing code and being skilled at writing code. I hear and share all those concerns, but at the same time, having spent the last weeks using Claude.ai, I do feel it is not something we can afford not to engage with. I know people have probably used a lot of different AI tools in the last year, some being more cute than useful, others being somewhat useful, and others being interesting improvements to your Google search, for instance. I shared a lot of those impressions, but using Claude this last week has opened my eyes to what AI engines are going to be capable of going forward.

So my initial test was writing a Python application for internal use at Red Hat, basically connecting to a variety of sources, pulling data, and putting together reports: typical management fare. How simple it was impressed me though. I think most of us who have had to pull data from a new source know how painful it can be, with issues ranging from missing to outdated to hard-to-parse API documentation. A lot of us then spend a lot of time experimenting to figure out the right API calls to make in order to pull the data we need. Well, Claude was able to give me Python scripts that pulled that data right away. I still had to spend some time with it to fine-tune the data being pulled and ensure we pulled the right data, but I did it in a fraction of the time I would have spent figuring that stuff out on my own. The one data source Claude struggled with was Fedora's Bodhi; once I pointed it to the URL with the latest documentation, it figured out that it would be better to use the Bodhi client library to pull data, and once it had that figured out it was clear sailing.
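To give a flavor of what this kind of data pulling looks like, here is a minimal sketch against Bodhi's public REST API; the endpoint and parameters are my illustrative assumptions, not the script Claude actually produced:

import requests

def fetch_fedora_updates(release="F42", limit=20):
    """Fetch recent updates for a Fedora release from Bodhi."""
    resp = requests.get(
        "https://bodhi.fedoraproject.org/updates/",
        params={"releases": release, "rows_per_page": limit},
        headers={"Accept": "application/json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("updates", [])

for update in fetch_fedora_updates():
    print(update.get("title"), "-", update.get("status"))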

So, coming off pretty impressed by that experience, I wanted to understand if Claude would be able to put together something programmatically more complex, like a GTK+ application using Vulkan. [Note: I should have checked the code better, but thanks to the people who pointed this out. I told the AI to use Vulkan, which it did, but not in the way I expected: I expected it to render the globe using Vulkan, but it instead decided to ensure GTK used its Vulkan backend. An important lesson in both prompt engineering and checking the code afterwards.] So I thought about what would be a good example of such an application, and figured it would be fun to find something really old and ask Claude to help me bring it into the current age. I suddenly remembered xtraceroute, an old application originally written with GTK1 and OpenGL that shows your traceroute on a 3D globe.

Screenshot of the original Xtraceroute application

I went looking for it and found that, while it had been updated to GTK2 since I last looked at it, it had not been touched in 20 years. So I thought: this is a great test case. I grabbed the code and fed it into Claude, asking Claude to give me a modern GTK4 version of this application using Vulkan. Ok, so how did it go? Well, it ended up being an iterative effort, with a lot of back and forth between myself and Claude. One nice feature of Claude is that you can upload screenshots of your application and Claude will use them to help you debug. Thanks to that, I have a long list of screenshots showing how this application evolved over the course of the day I spent on it.

First output of Claude

This screenshot shows Claude's first attempt at transforming the 20-year-old xtraceroute application into a modern one using GTK4 and Vulkan, also adding a Meson build system. My prompt to create this was feeding in the old code and asking Claude to come up with a GTK4 and Vulkan equivalent. As you can see, the GTK4 UI is very simple, but ok as it is. The rendered globe leaves something to be desired though. I assume the old code had some 2D fallback code, so Claude latched onto that and focused on trying to use the Cairo API to recreate this application, despite me telling it I wanted a Vulkan application. What we ended up with was a 2D circle that I could spin around like a wheel of fortune. The code did have some Vulkan stuff, but defaulted to the Cairo code.

Second attempt at updating this application

Anyway, I fed the screenshot of my first version back into Claude and said that the image was not a globe, that it was missing the texture, and that the interaction model was more like a wheel of fortune. As you can see, the second attempt did not fare any better; in fact, we went from circle to square. This was also the point where I realized that I hadn't uploaded the textures into Claude, so I had to tell it to load earth.png from the local file repository.

Third attempt from Claude

Ok, so I fed my second screenshot back into Claude and pointed out that it was not a globe, that in fact it wasn't even a circle, and that the texture was still missing. With me pointing out that it needed to load the earth.png file from disk, it came back with the texture loading. Well, I really wanted it to be a globe, so I said thank you for loading the texture, now do it on a globe.

This is the output of the fourth attempt. As you can see, it did bring back a circle, but the texture was gone again. At this point I also decided I didn't want Claude to waste any more time on the Cairo code; this was meant to be a proper 3D application. So I told Claude to drop all the Cairo code and instead focus on making a Vulkan application.

Fifth attempt

So now we finally had something that started looking like something, although it was still a circle, not a globe, and it had that weird division-into-quarters thing on the globe. Anyway, I could see it was using Vulkan now and loading the texture, so I felt like we were making some decent forward progress. I wrote a longer prompt describing the globe I wanted and how I wanted to interact with it, and this time Claude did come back with Vulkan code that rendered this as a globe, so I didn't end up screenshotting it, unfortunately.

So with the working globe now in place, I wanted to bring in the day/night cycle from the original application. So I asked Claude to load the night texture and use it as an overlay to get that day/night effect. I also asked it to calculate the position of the sun relative to the earth at the current time, so that it could place the overlay in the right location. As you can see, Claude did a decent job of it, although the colors were broken.
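For the curious, the sun-position part boils down to computing the subsolar point. A minimal sketch, assuming the standard low-precision approximation (not necessarily the formula Claude chose):

import math
from datetime import datetime, timezone

def subsolar_point(now=None):
    """Return (lat, lon) in degrees where the sun is directly overhead."""
    now = now or datetime.now(timezone.utc)
    day = now.timetuple().tm_yday
    hours = now.hour + now.minute / 60 + now.second / 3600
    # Approximate solar declination in degrees, good to about a degree.
    lat = -23.44 * math.cos(math.radians(360 / 365 * (day + 10)))
    # The sun sits over the meridian of mean solar noon: 15 degrees/hour.
    lon = (180 - hours * 15 + 180) % 360 - 180
    return lat, lon

print(subsolar_point())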

7th attempt

So I kept fighting with the color for a bit. Claude could see it was rendering brown, but could not initially figure out why. I could tell the code was doing things mostly right, so I also asked it to look at some other things; for instance, I realized that when I tried to spin the globe it just twisted the texture. We got that fixed, and I also got Claude to create some test scripts that helped us figure out that the color issue was an RGB vs BGR issue; as soon as we understood that, Claude was able to fix the code to render colors correctly. I also went through a few iterations to get the scaling and mouse interaction behaving correctly.
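For the record, this class of bug is easy to demonstrate once you suspect it. A tiny illustrative sketch (not the actual test script Claude wrote) of what swapped red and blue channels look like, and the one-line fix:

import numpy as np

# A 2x2 texture of pure blue, stored in RGBA channel order.
rgba = np.zeros((2, 2, 4), dtype=np.uint8)
rgba[..., 2] = 255  # blue channel
rgba[..., 3] = 255  # alpha

# Reading those bytes as BGRA turns blue into red; the fix is a swizzle.
bgra = rgba[..., [2, 1, 0, 3]]

print(rgba[0, 0], "->", bgra[0, 0])  # [  0   0 255 255] -> [255   0   0 255]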

10th attempt

So at this point I had probably worked on this for 4-5 hours; the globe was rendering nicely and I could interact with it using the mouse. The next step was adding the traceroute lines back. By default, Claude had just put in code to render small dots on the hop points, not draw the lines. Also, the old method for getting the geocoordinates no longer worked, so I asked Claude to help me find some current services, which it did, and once I picked one it gave me, on the first try, code that was able to request the geolocation of the IP addresses it got back. To polish it up, I also asked Claude to make sure we drew the lines following the globe's curvature instead of just drawing straight lines.
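As a rough sketch of those two pieces, with ip-api.com standing in for whichever geolocation service I actually ended up picking, and the textbook spherical interpolation for the curved lines:

import math
import requests

def geolocate(ip):
    """Return (lat, lon) for an IP address, or None on failure."""
    data = requests.get(f"http://ip-api.com/json/{ip}", timeout=10).json()
    if data.get("status") == "success":
        return data["lat"], data["lon"]
    return None

def great_circle_points(a, b, steps=32):
    """Yield (lat, lon) points along the great circle from a to b."""
    def to_vec(lat, lon):
        lat, lon = math.radians(lat), math.radians(lon)
        return (math.cos(lat) * math.cos(lon),
                math.cos(lat) * math.sin(lon),
                math.sin(lat))
    va, vb = to_vec(*a), to_vec(*b)
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(va, vb))))
    omega = math.acos(dot)
    for i in range(steps + 1):
        t = i / steps
        if omega == 0:
            x, y, z = va
        else:
            f = math.sin((1 - t) * omega) / math.sin(omega)
            g = math.sin(t * omega) / math.sin(omega)
            x, y, z = (f * p + g * q for p, q in zip(va, vb))
        yield math.degrees(math.asin(z)), math.degrees(math.atan2(y, x))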

Final version of the updated Xtraceroute application

It mostly works now, but I did realize why I always thought this was a fun idea that is less interesting in practice: you often don't get very good traceroutes back, probably due to websites being cached or hosted globally. But I felt I had proven that, with a day's work, Claude was able to help me bring this old GTK application into the modern world.

Conclusions

So I am not going to argue that Xtraceroute is an important application that deserved to be saved. In fact, while I feel the current version works and proves my point, I also lost the motivation to polish it up, due to the limitations of tracerouting, but the code is available for anyone who finds it worthwhile.

But this wasn't really about Xtraceroute. What I wanted to show here is how someone lacking C and Vulkan development skills can actually use a tool like Claude to put together a working application, even one using more advanced stuff like Vulkan, which I know many besides me find daunting. I also found Claude really good at producing documentation and architecture documents for your application. It was also able to give me a working Meson build system and create all the desktop integration files for me, like the .desktop file, the metainfo file and so on. For the icons I ended up using Gemini, as Claude does not do image generation at this point, although it was able to take a PNG file and create an SVG version of it (although not a perfect likeness of the original PNG).

Another thing I want to say is that, the way I think about this, it is not that it makes coding skills less valuable. AIs can do amazing things, but you need to keep a close eye on them to ensure the code they create actually does what you want, and that it does it in a sensible manner. For instance, in my reporting application I wanted to embed a PDF file, and Claude's initial thought was to bring in WebKit to do the rendering. That would have worked, but it would have added a very big and complex dependency to my application, so I had to tell it that it could just use libpoppler instead, something Claude agreed was a much better solution. The bigger the codebase, the harder it also becomes for the AI to deal with it, but I think in those circumstances what you can do is use the AI to give you sample code for the functionality you want, in the programming language you want, and then work on incorporating that into your big application.

The other part here, of course, in terms of open source, is how should contributors and projects deal with this? I know there are projects where AI-generated CVE reports or patches are drowning them, and that helps nobody. But I think if we see AI as a developer's tool, and accept that the developer using the tool is responsible for the code generated, then that mindset can help us navigate this. So if you used an AI tool to create a patch for your favourite project, it is your responsibility to verify that patch before sending it in, and by that I don't mean just verifying the functionality it provides, but that the code is clean and readable and follows the coding standards of said upstream project. Maintainers, on the other hand, can use AI to help them review and evaluate patches quicker, so this can be helpful on both sides of the equation. I also found Claude and other AI tools like Gemini pretty good at generating test cases for the code they make, so this is another area where open source patch contributions can improve, by improving test coverage for the code.

I do also believe there are many areas where projects can greatly benefit from AI. For instance, in the GNOME project, a constant challenge for extension developers has been keeping their extensions up-to-date; I do believe a tool like Claude or Gemini should be able to update GNOME Shell extensions quite easily. So maybe having a service which tries to provide a patch each time there is a GNOME Shell update might be a great help there. At the same time, having an AI take a look at updated extensions and give a first review of the update might help reduce the load on the people doing code reviews on extensions, and help flag problematic extensions.

I know that for a lot of cases and situations, uploading your code to a web service like Claude, Gemini or Copilot is not something you want to or can do. I know privacy is a big concern for many people in the community. My team at Red Hat has been working on a code assistant tool using the IBM Granite model, called Granite.code. What makes Granite different is that it relies on having the model run locally on your own system, so you don't send your code or data off somewhere else. This of course has great advantages in terms of privacy and security, but it has challenges too. The top-end AI models out there at the moment, of which Claude is probably the best at the time of writing this blog post, run on hardware with vast resources in terms of computing power and memory. Most of us do not have those kinds of capabilities available at home, so model size and performance will be significantly lower. So at the moment, if you are looking for a great open source tool to use with VS Code for things like code completion, I recommend giving Granite.code a look. If, on the other hand, you want to do something like I have described here, you need to use something like Claude, Gemini or ChatGPT. I do recommend Claude, not just because I believe them to be the best at it at the moment, but because they are also a company trying to hold themselves to high ethical standards.

Over time we hope to work with IBM and others in the community to improve local models, and I am also sure local hardware will keep improving, so the gap between what you can get with a local model on your laptop and the big cloud-hosted models should at least shrink compared to today. There is also the middle-of-the-road option that will become increasingly viable, where you have a powerful server in your home or at your workplace that can host at least a mid-size model, and you connect to that over your LAN. I know IBM is looking at that model for the next iteration of Granite models, where you can choose from a wide variety of sizes: some small enough to run on a laptop, others of a size where a strong workstation or small server can run them, and of course the biggest models for people able to invest in top-of-the-line hardware to run their AI.

Also, the AI space is moving blazingly fast; if you are reading this six months from now, I am sure the capabilities of online and local models will have changed drastically already.

So to all my friends in the Linux community, I ask you to take a look at AI and what it can do, and then let's work together on improving it, not just in terms of capabilities, but also on figuring out things like the societal challenges around it and the sustainability concerns I know a lot of us share.

What's next for this code

As I mentioned, while I felt I got it to a point where I proved to myself it worked, I am not planning on working any more on it. But I did make a cute little application for internal use that shows a spinning globe with all the global Red Hat offices showing up as little red lights, and that pulls Red Hat news at the bottom. Not super useful either, but I was able to use Claude to refactor the globe rendering code from xtraceroute into this in just a few hours.

Red Hat Offices Globe and news

29 Jul 2025 4:24pm GMT

Tomeu Vizoso: Rockchip NPU update 6: We are in mainline!

The kernel portion of the Linux driver for the Rockchip NPUs has been merged into the maintainer tree, and will be sent in the next pull request to Linus. The userspace portion of the driver has just been merged as well, in the main Mesa repository.


This means that in the next few weeks the two components of the Rocket driver will be in official releases of the Linux and Mesa projects, and Linux distributions will start to pick them up and package them. Once that happens, we will have seamless accelerated inference on one more category of hardware.

It has been a bit over a year since I started working on the driver, though the actual feature implementation took just over two months of that. The rest of the time was spent waiting for reviews and reacting to excellent feedback from many contributors to the Linux kernel. The driver is now much better because of that frank feedback.

What I see in the near future for this driver is support for other Rockchip SoCs and some performance work, to match that of the proprietary driver. But of course, with it being open source, contributors can just start hacking on it and sending patches over for review and merging.

I'm now working on further improvements to the Etnaviv driver for the Vivante NPUs, and have started work with Arm engineers on a new driver for their Ethos line of NPUs.

So stay tuned for more news on accelerated inference on the edge in mainline Linux!


28 Jul 2025 7:02am GMT

Bastien Nocera: Digitising CDs (aka using your phone as an image scanner)

I recently found, in the rain next to a book swap box, a pile of 90's "software magazines", which I spent my evenings cleaning, drying, and sorting in the days afterwards.

Magazine cover CDs with nary a magazine

Those magazines are a peculiar thing in France, using the mechanism of "Commission paritaire des publications et des agences de presse", or "Commission paritaire" for short. This structure exists to assess whether a magazine can benefit from state subsidies for the written press (whether on paper at the time, or on the internet nowadays), which include a reduced VAT charge (2.1% instead of 20%), reduced postal rates, and tax exemptions.

In the 90s, this was used by Diamond Editions[1] (a publisher related to tech shop Pearl, which French and German computer enthusiasts probably know) to publish magazines with just enough original text to qualify for those subsidies, bundled with the really interesting part, a piece of software on CD.

If you were to visit a French newsagent nowadays, you would be able to find other examples of this: magazines bundled with music CDs, DVDs or Blu-rays, or even toys or collectibles. Some publishers (including the infamous and now shuttered Éditions Atlas) will even get you a cheap kickstart to a new collection, with the first few issues (and collectibles) available at very interesting prices of a couple of euros, before making that "magazine" subscription-only, with each issue being increasingly more expensive (article from a consumer protection association).


Other publishers have followed suit.

I guess you can only imagine how much your scale model would end up costing with that business model (50 eurocent for the first part, 4.99€ for the second), although I would expect them to have given up the idea of being categorised as "written press".

To go back to Diamond Editions, this meant the eventual birth of 3 magazines: Presqu'Offert, BestSellerGames and StratéJ. I remember me or my dad buying a few of those: an older but legit and complete version of ClarisWorks, CorelDraw, or a talkie version of a LucasArts point'n'click was certainly a more interesting proposition than a cut-down warez version full of viruses when the budget was tight.

3 of the magazines I managed to rescue from the rain

You might also be interested in the UK "covertape wars".

Don't stress the technique

This brings us back to today: while the magazines are still waiting to be scanned, I tried to get a wee bit organised and started digitising the CDs.

Some of them have printing that covers the whole of the CD, but a fair few use the foil/aluminium backing of the CD as a blank surface, which will give you pretty bad results when scanning them with a flatbed scanner: the light source keeps moving with the sensor, and what you'll be scanning is the sensor's reflection on the CD.

My workaround for this is to use a digital camera (my phone's 24MP camera), with a white foam board behind it, so the blank parts appear more light grey. Of course, this means that you need to take the picture from an angle, and that the CD will appear as an oval instead of perfectly circular.

I tried for a while to use GIMP perspective tools, and "Multimedia" Mike Melanson's MobyCAIRO rotation and cropping tool. In the end, I settled on Darktable, which allowed me to do 4-point perspective deskewing, I just had to have those reference points.

So I came up with a simple "deskew" template, which you can print yourself, although you could probably achieve similar results with grid paper.

My janky setup
The resulting picture

After opening your photo with Darktable and selecting the "darkroom" tab, go to the "rotate and perspective" tool, select the "manually defined rectangle" structure, and adjust the rectangle to match the centres of the 4 deskewing targets. Then click on "horizontal/vertical fit". This will give you a squished CD (don't worry), then select the "specific" lens model and voilà.
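If you would rather script this step, the same 4-point deskew can be done with OpenCV. A minimal sketch, assuming you have noted the pixel coordinates of the four target centres yourself (the numbers here are placeholders):

import cv2
import numpy as np

img = cv2.imread("cd-photo.jpg")

# The four target centres as photographed (TL, TR, BR, BL)...
src = np.float32([[412, 305], [2876, 389], [2811, 2703], [367, 2590]])
# ...and where they should end up: the corners of a perfect square.
size = 2400
dst = np.float32([[0, 0], [size, 0], [size, size], [0, size]])

matrix = cv2.getPerspectiveTransform(src, dst)
cv2.imwrite("cd-flat.png", cv2.warpPerspective(img, matrix, (size, size)))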

Tools at the ready

Targets acquired


Straightened but squished

You can now export the processed image (I usually use PNG to avoid data loss at each step), open things up in GIMP and use the ellipse selection tool to remove the background (don't forget the center hole), the rotate tool to make the writing straight, and the crop tool to crop it to size.

And we're done!


The result of this example is available on Archive.org, with the rest of my uploads being made available on Archive.org and Abandonware-Magazines for those 90s magazines and their accompanying CDs.

[1]: Full disclosure, I wrote a couple of articles for Linux Pratique and Linux Magazine France in the early 2000s, that were edited by that same company.

27 Jul 2025 7:39pm GMT

Hari Rana: GNOME Calendar: A New Era of Accessibility Achieved in 90 Days

Note

Please consider supporting my effort in making GNOME apps accessible for everybody. Thanks!

Introduction

There is no calendaring app that I love more than GNOME Calendar. The design is slick, it works extremely well, it is touchpad friendly, and best of all, the community around it is just full of wonderful developers, designers, and contributors worth collaborating with, especially with the recent community growth and engagement over the past few years. Georges Stavracas and Jeff Fortin Tam are some of the best maintainers I have ever worked with. I cannot express how thankful I am for Jeff's underappreciated superhuman capability to voluntarily coordinate huge initiatives and issue trackers.

One of Jeff's many initiatives is gnome-calendar#1036: the accessibility initiative, which is a big and detailed list of issues related to accessibility. In my opinion, GNOME Calendar's biggest problem was the lack of accessibility support, which made the app completely unusable for people exclusively using a keyboard, or people relying on assistive technologies.

This article will explain in detail the fundamental issues that held back accessibility in GNOME Calendar since the very beginning of its existence (12 years at a minimum), the progress we have made with accessibility and our thought process in achieving it, and the present and future of accessibility in GNOME Calendar.

Calendaring Complications

On a desktop or tablet form factor, GNOME Calendar has a month view and a week view, both of which are a grid composed of cells representing a time frame. In the month view, each row is a week, and each cell is a day. In the week view, the time frame within cells varies with the zoom level.

There are mainly two reasons GNOME Calendar was inaccessible: firstly, the accessibility tree cannot cover the logically and structurally complicated workflow and design of a typical calendaring app; and secondly, the significant negative implications for accessibility of reducing as much overhead as possible.

Accessibility Trees Are Insufficient for Calendaring Apps

The accessibility tree is rendered insufficient for calendaring apps, mainly because events are extremely versatile. Tailoring the entire interface and experience around that versatility pushes us to explore alternate and custom structures.

Events are highly flexible, because they are time-based. An event can last a couple of minutes, but it can as well last for hours, days, weeks, or even months. It can start in the middle of a day and end on the upcoming day; it can start by the end of a week and end at the beginning of the upcoming week. Essentially, events are limitless.

Since events can last more than a day, cell widgets cannot hold any event widget, because otherwise event widgets would not be capable of spanning across cells. As such, event widgets are overlaid on top of cell widgets and positioned based on the coordinates, width, and height of each widget. However, because cell widgets cannot hold a meaningful link with event widgets, there is no way to easily ensure there is a link between an event widget and a cell widget.
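To make the geometry concrete, here is a hypothetical sketch (GNOME Calendar's actual code is C and far more involved) of how a multi-day event's rectangles can be computed from the grid geometry alone, with no parent cell in sight:

def event_rects(start_day, end_day, cell_w, cell_h):
    """Yield one (x, y, w, h) rectangle per week-row segment of an event.

    Days are 0-based indices into the month grid, 7 columns per row.
    """
    day = start_day
    while day <= end_day:
        row, col = divmod(day, 7)
        # The segment runs to the end of this week or of the event,
        # whichever comes first.
        span = min(6 - col, end_day - day) + 1
        yield (col * cell_w, row * cell_h, span * cell_w, cell_h)
        day += span

# A five-day event starting on the fifth cell splits into two segments:
print(list(event_rects(4, 8, cell_w=100, cell_h=80)))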

As a consequence, the visual representation of GNOME Calendar is fundamentally incompatible with accessibility trees. GNOME Calendar's month and week views are visually 2.5-dimensional: a grid layout by itself is structurally two-dimensional, but overlaying event widgets that are capable of spanning cells adds an additional layer. Conversely, accessibility trees are fundamentally two-dimensional, so GNOME Calendar's visual representation cannot be adequately adapted into a two-dimensional logical tree.

In summary, accessibility trees are insufficient for calendaring apps, because the versatility and high requirements of events prevent us from linking cell widgets with event widgets, so event widgets are instead overlaid on top, making the visual representation 2.5-dimensional; the additional layer makes it fundamentally impossible to adapt to a two-dimensional accessibility tree.

Negative Implications of Accessibility due to Maximizing Performance

Unlike the majority of apps, GNOME Calendar's layout and widgetry consist of custom widgets and complex calculations that depend on several factors.

Due to these complex calculations, along with the fact that it is also possible to have tens, hundreds, or even thousands of events in a calendar app, calendar apps always rely on maximizing performance as much as possible, while being at the mercy of the framework or toolkit.

One way to minimize that problem is by creating custom widgets that are minimal and only fulfill the purpose we absolutely need. However, this comes at the cost of needing to reimplement most functionality, including most, if not all accessibility features and semantics, such as keyboard focus, which severely impacted accessibility in GNOME Calendar.

While GTK's widgets are great for general purpose use-cases and do not have any performance impact with limited instances of them, performance starts to deteriorate on weaker systems when there are hundreds, if not thousands of instances in the view, because they contain a lot of functionality that event widgets may not need.

In the case of the GtkButton widget, it has a custom multiplexer, it applies different styles for different child types, it implements the GtkActionable interface for custom actions, and more technical characteristics. Other functionality-based widgets will have more capabilities that might impact performance with hundreds of instances.

To summarize, GNOME Calendar reduces overhead by creating minimal custom widgets that fulfill a specific purpose. This unfortunately severely impacted accessibility throughout the app and made it unusable with a keyboard, as some core functionalities, accessibility features and semantics were never (re)implemented.

Improving the Existing Experience

Despite being inaccessible as an app altogether, not every aspect was inaccessible in GNOME Calendar. Most areas throughout the app worked with a keyboard and/or assistive technologies, but they needed some changes to improve the experience. For this reason, this section is reserved specifically for mentioning the aspects that underwent a lot of improvements.

Improving Focus Rings

The first major step was to improve the focus ring situation throughout GNOME Calendar. Since the majority of widgets are custom widgets, many of them need focus rings to be applied manually. !563 addresses that by declaring custom CSS properties to use as a base for focus rings. !399 tweaks the style of the reminders popover in the event editor dialog, with the addition of a focus ring.

We changed the behavior of the event notes box under the "Notes" section in the event editor dialog. Every time the user focuses on the event notes box, the focus ring appears and outlines the entire box until the user leaves focus. This was accomplished by subclassing AdwPreferencesRow to inherit its style, then applying the .focused class whenever the user focuses on the notes.

Improving the Calendar Grid

The calendar grid on the sidebar suffered from several issues when it came to keyboard navigation.

While the calendar grid could be interacted with using a keyboard, the keyboard experience was far from desirable. !608 addresses these issues by overriding the Gtk.Widget.focus () virtual method. Pressing Tab or Shift+Tab now skips the entire grid, and the grid wraps around to allow focusing between the first and last columns with the Left and Right arrow keys, while notifying the user when going out of bounds.
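The following is a rough PyGObject sketch of the idea behind !608; the real implementation is in C and also handles the column wrapping and out-of-bounds notifications:

import gi
gi.require_version("Gtk", "4.0")
from gi.repository import Gtk

class CalendarGrid(Gtk.Grid):
    def do_focus(self, direction):
        # Let Tab and Shift+Tab leave the grid as a whole: if focus is
        # already inside, refuse to handle it so focus moves past us.
        if direction in (Gtk.DirectionType.TAB_FORWARD,
                         Gtk.DirectionType.TAB_BACKWARD):
            if self.get_focus_child() is not None:
                return False
        # Everything else falls back to the default focus handling.
        return Gtk.Grid.do_focus(self, direction)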

Improving the Calendar List Box

Note

The calendar list box holds a list of available calendars, all of which can be displayed or hidden from the week view and month view. Each row is a GtkListBoxRow that holds a GtkCheckButton.

The calendar list box had several problems in regards to keyboard navigation and the information each row provided to assistive technologies.

The user was required to press Tab a second time to get to the next row in the list. To elaborate: pressing Tab once focused the row; pressing it another time moved focus to the check button within the row (bad); and finally, pressing it a third time focused the next row.

Row widgets had no actual purpose besides toggling the check button upon activation. Similarly, the only use for a check button widget inside each row was to display the "check mark" icon if the calendar was displayed. This meant that the check button widget held all the desired semantics, such as the "checkbox" role and the "checked" state; but worst of all, it was getting focus. Essentially, the check button widget was handling responsibilities that should have been handled by the row.

Both inconveniences were addressed by !588. The check button widget was replaced with a check mark icon using GtkImage, a widget that does not grab focus. The accessible role of the row widget was changed to "checkbox", and the code was adapted to handle the "checked" state.
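A sketch of the !588 approach, written in PyGObject for brevity (GNOME Calendar itself is in C, and the exact binding for updating accessible state may differ):

import gi
gi.require_version("Gtk", "4.0")
from gi.repository import GObject, Gtk

def make_calendar_row(name, checked=True):
    # The row itself takes the "checkbox" role...
    row = Gtk.ListBoxRow(accessible_role=Gtk.AccessibleRole.CHECKBOX)
    box = Gtk.Box(spacing=6)
    # ...and the check mark is a plain GtkImage, which never grabs focus.
    box.append(Gtk.Image.new_from_icon_name("object-select-symbolic"))
    box.append(Gtk.Label(label=name))
    row.set_child(box)
    # Report the "checked" state to assistive technologies.
    row.update_state([Gtk.AccessibleState.CHECKED],
                     [GObject.Value(GObject.TYPE_BOOLEAN, checked)])
    return row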

Implementing Accessibility Functionality

Accessibility is often absolute: there is no 'in-between' state; either the user can access functionality, or they cannot, which can potentially make the app completely unusable. This section goes in depth with the widgets that were not only entirely inaccessible but also rendered GNOME Calendar completely unusable with a keyboard and assistive technology.

Making the Event Widget Accessible

Note

GcalEventWidget, the name of the event widget within GNOME Calendar, is a colored rectangular toggle button containing the summary of an event.

Activating it displays a popover that displays additional detail for that event.

GcalEventWidget subclasses GtkWidget.

The biggest problem in GNOME Calendar, which also made it completely impossible to use the app with a keyboard, was the lack of a way to focus and activate event widgets with a keyboard. Essentially, one would be able to create events, but there would be no way to access them in GNOME Calendar.

Quite literally, this entire saga began all thanks to a dream I had, which was to make GcalEventWidget subclass GtkButton instead of GtkWidget directly. The thought process was: GtkButton already implements focus and activation with a keyboard, so inheriting it should therefore inherit focus and activation behavior.

In merge request !559, the initial implementation indeed subclassed GtkButton. However, that implementation did not go through, due to the reason outlined in § Negative Implications of Accessibility due to Maximizing Performance.

Despite that, the initial implementation significantly helped us figure out exactly what was missing from GcalEventWidget: specifically, setting the Gtk.Widget:receives-default and Gtk.Widget:focusable properties to "True". Gtk.Widget:receives-default makes it so the widget can be activated however desired, and Gtk.Widget:focusable allows it to gain keyboard focus. So, instead of subclassing GtkButton, we reimplemented GtkButton's functionality in order to maintain performance.

While preliminary support for keyboard navigation was added into GcalEventWidget, accessible semantics for assistive technologies like screen readers were severely lacking. This was addressed by !587, which sets the role to "toggle-button", to convey that GcalEventWidget is a toggle button. The merge request also indicates that the widget has a popup for the event popover, and has the means to update the "pressed" state of the widget.
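Condensed into a PyGObject sketch (again, GNOME Calendar does this in C), the crucial pieces look roughly like this:

import gi
gi.require_version("Gtk", "4.0")
from gi.repository import Gtk

class EventWidget(Gtk.Widget):
    """Minimal stand-in for GcalEventWidget; drawing is omitted."""
    def __init__(self, summary):
        super().__init__(
            focusable=True,         # reachable with the keyboard
            receives_default=True,  # can be activated however desired
            accessible_role=Gtk.AccessibleRole.TOGGLE_BUTTON,
        )
        self.summary = summary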

In summary, we first made GcalEventWidget accessible with a keyboard by reimplementing some of GtkButton's functionality. Then, we later added the means to appropriately convey information to assistive technologies. This was the worst offender, and was the primary reason why GNOME Calendar was unusable with a keyboard, but we finally managed to solve it!

Making the Month and Year Spin Buttons Accessible

Note

GcalMultiChoice is the name of the custom spin button widget used for displaying and cycling through months and/or years.

It comprises a "decrement" button at the start, a flat toggle button in the middle that contains a label displaying the value, and an "increment" button at the end. Only the button in the middle can gain keyboard focus throughout GcalMultiChoice.

In some circumstances, GcalMultiChoice can display a popover for increased granularity.

GcalMultiChoice was not interactable with a keyboard, because:

  1. it did not react to the Up and Down arrow keys; and
  2. the "decrement" and "increment" buttons were not focusable.

For a spin button widget, the "decrement" and "increment" buttons should generally remain unfocusable, because the Up and Down arrow keys already accomplish that behavior. Furthermore, GtkSpinButton's "increment" (+) and "decrement" (-) buttons are not focusable either, and the Date Picker Spin Button Example by the ARIA Authoring Practices Guide (APG) avoids that functionality as well.

However, since GcalMultiChoice did not react to the Up and Down arrow keys, having the "decrement" and "increment" buttons be focusable would have been a somewhat acceptable workaround. Unfortunately, since those buttons were not focusable, and the Up and Down keys were not supported, it was impossible to increment or decrement values in GcalMultiChoice with a keyboard without resorting to workarounds.

Additionally, GcalMultiChoice lacked the semantics to communicate with assistive technologies. So, for example, a screen reader would never say anything meaningful.

All of the above problems remained problems until merge request !603. For starters, it implements GtkAccessible and GtkAccessibleRange, and then implements keyboard navigation.

Implementing GtkAccessible and GtkAccessibleRange

The merge request implements the GtkAccessible interface to retrieve information from the flat toggle button.

Fundamentally, since the toggle button was the only widget capable of gaining keyboard focus throughout GcalMultiChoice, this caused two distinct problems.

The first issue was that assistive technologies only retrieved semantic information from the flat toggle button, such as the type of widget (accessible role), its label, and its description. However, the toggle button was semantically just a toggle button; since it contained semantics and provided information to assistive technologies, the information it provided was actually misleading, because it only provided information as a toggle button, not a spin button!

So, the solution to this is to strip the semantics from the flat toggle button. Setting its accessible role to "none" makes assistive technologies ignore its information. Then, setting the accessible role of the top-level (GcalMultiChoice) to "spin-button" gives semantic meaning to assistive technologies, which allows the widget to appropriately convey these information, when focused.

This led to the second issue: Assistive technologies only retrieved information from the flat toggle button, not from the top-level. Generally, assistive technologies retrieve information from the focused widget. Since the toggle button was the only widget capable of gaining focus, it was also the only widget providing information to them; however, since its semantics were stripped, it had no information to share, and thus assistive technologies would retrieve absolutely nothing.

The solution to this is to override the Gtk.Accessible.get_platform_state () virtual method, which allows us to bridge communication between the states of child widgets and the top-level widget. In this case, both GcalMultiChoice and the flat toggle button share the state: if the flat toggle button is focused, then GcalMultiChoice is considered focused; and since GcalMultiChoice is focused, assistive technologies can then retrieve its information and state.

The last issue that needed to be addressed was that GcalMultiChoice was still not providing any of the values to assistive technologies. The solution to this is straightforward: implementing the GtkAccessibleRange interface, which makes it necessary to set values for the following accessible properties: "value-max", "value-min", "value-now", and "value-text".

After all this effort, GcalMultiChoice now provides correct semantics to assistive technologies. It appropriately reports its role, the current textual value, and whether it contains a popover.

To summarize: we stripped the misleading semantics from the flat toggle button, exposed the "spin-button" role and the focus state at the top-level, and implemented GtkAccessibleRange so that the current values are reported to assistive technologies.

Providing Top-Level Semantics to a Child Widget As Opposed to the Top-Level Widget Is Discouraged

As you read through the previous section, you may have asked yourself: "Why go through all of those obstacles and complications when you could have just re-assigned the flat toggle button as "spin-button" and not worry about the top-level's role and focus state?"

Semantics should be provided by the top-level, because they are represented by the top-level. What makes GcalMultiChoice a spin button is not just the flat toggle button, but it is the combination of all the child widgets/objects, event handlers (touch, key presses, and other inputs), accessibility attributes (role, states, relationships), widget properties, signals, and other characteristics. As such, we want to maintain that consistency for practically everything, including the state. The only exception to this is widgets whose sole purpose is to contain one or more elements, such as GtkBox.

This is especially important for when we want it to communicate with other widgets and APIs, such as the Gtk.Widget::state-flags-changed signal, the Gtk.Widget.is_focus () method, and other APIs where it is necessary to have the top-level represent data accurately and behave predictably. In the case of GcalMultiChoice, we set accessible labels at the top-level. If we were to re-assign the flat toggle button's role as "spin-button", and set the accessible label to the top-level, assistive technologies would only retrieve information from the toggle button while ignoring the labels defined at the top-level.

For the record, GtkSpinButton also overrides Gtk.Accessible.get_platform_state ():


static gboolean
gtk_spin_button_accessible_get_platform_state (GtkAccessible              *self,
                                               GtkAccessiblePlatformState  state)
{
  return gtk_editable_delegate_get_accessible_platform_state (GTK_EDITABLE (self), state);
}

static void
gtk_spin_button_accessible_init (GtkAccessibleInterface *iface)
{
  /* … */

  iface->get_platform_state = gtk_spin_button_accessible_get_platform_state;
}

To be fair, assigning the "spin-button" role to the flat toggle button is unlikely to cause major issues, especially for an app. Re-assigning the flat toggle button was my first instinct, and the initial implementation did just that. I was completely unaware of the Gtk.Accessible.get_platform_state () virtual method before finalizing the merge request, so I initially thought that was the correct way to do it. Even if the toggle button had the "spin-button" role instead of the top-level, it would not have stopped us from implementing workarounds, such as a getter method that retrieves the flat toggle button so that we can then manipulate it.

In summary, we want to provide semantics at the top-level, because they are structurally part of it. This comes with the benefit of making the widget easier to work with, because APIs can directly communicate with it, instead of resorting to workarounds.

The Now and Future of Accessibility in GNOME Calendar

All these accessibility improvements will be available in GNOME 49, but you can download and install the pre-release from the "Nightly GNOME Apps" DLC Flatpak remote on nightly.gnome.org.

In the foreseeable future, I want to continue working on !564, to make the month view itself accessible with a keyboard, as seen in the following:

A screen recording demoing keyboard navigation within the month view. Focus rings appear and disappear as the user moves focus between cells. Going out of bounds in the vertical axis scrolls the view to the direction, and going out of bounds in the horizontal axis moves focus to the logical sibling.

However, it is already adding 640 lines of code, and I can only see that number increasing over time. We also want to make cells in the week view accessible, but this will be another monstrous merge request, just like the one above.

Most importantly, we want (and need) to collaborate and connect with people who rely on assistive technologies to use their computer, especially since nobody working on GNOME Calendar relies on assistive technologies themselves.

Conclusion

I am overwhelmingly satisfied with the progress we have made with accessibility in GNOME Calendar in six months. Just a year ago, if I had been asked what needed to be done to incorporate accessibility features in GNOME Calendar, I would have shamefully said "dude, I don't know where to even begin"; but as of today, we have somehow managed to turn GNOME Calendar into an actual, usable calendaring app for people who rely on assistive technologies and/or a keyboard.

Since this is still Disability Pride Month, and GNOME 49 is not out yet, I encourage you to get the alpha release of GNOME Calendar on the "Nightly GNOME Apps" Flatpak remote at nightly.gnome.org. The alpha release is in a state where the gays with disabilities can organize and do crimes using GNOME Calendar 😎 /j

25 Jul 2025 12:00am GMT

Dave Airlie (blogspot): ramalama/mesa : benchmarks on my hardware and open source vs proprietary

One of my pet peeves around running local LLMs and inferencing is the sheer mountain of shit^W^W^W complexity of compute stacks needed to run any of this stuff in an mostly optimal way on a piece of hardware.

CUDA, ROCm, and Intel oneAPI all to my mind scream over-engineering on a massive scale at least for a single task like inferencing. The combination of closed source, over the wall open source, and open source that is insurmountable for anyone to support or fix outside the vendor, screams that there has to be a simpler way. Combine that with the pytorch ecosystem and insanity of deploying python and I get a bit unstuck.

What can be done about it?

llama.cpp to me seems like the best answer to the problem at present, (a rust version would be a personal preference, but can't have everything). I like how ramalama wraps llama.cpp to provide a sane container interface, but I'd like to eventually get to the point where container complexity for a GPU compute stack isn't really needed except for exceptional cases.

On the compute stack side, Vulkan exposes most features of GPU hardware in a possibly suboptimal way, but with extensions all can be forgiven. Jeff Bolz from NVIDIA's talk at Vulkanised 2025 started to give me hope that maybe the dream was possible.

The main issue I have is Jeff is writing driver code for the NVIDIA proprietary vulkan driver which reduces complexity but doesn't solve my open source problem.

Enter NVK, the open source driver for NVIDIA GPUs. Karol Herbst and myself are taking a look at closing the feature gap with the proprietary one. For mesa 25.2 the initial support for VK_KHR_cooperative_matrix was landed, along with some optimisations, but there is a bunch of work to get VK_NV_cooperative_matrix2 and a truckload of compiler optimisations to catch up with NVIDIA.

But since mesa 25.2 was coming soon I wanted to try and get some baseline figures out.

I benchmarked on two systems (because my AMD 7900XT wouldn't fit in the case), both with Ryzen CPUs. In the first system I put an RTX 5080, then an RTX 6000 Ada, and then the Intel A770. The second I used for the RX 7900XT. The Intel SYCL stack unfortunately failed to launch inside ramalama, and I hacked llama.cpp to use the A770 MMA accelerators.

ramalama bench hf://unsloth/Qwen3-8B-GGUF:UD-Q4_K_XL

I picked this model at random, and I've no idea if it was a good idea.


Some analysis:

The token generation workload is a lot less matmul-heavy than prompt processing, and it also does a lot more synchronising. Jeff has stated CUDA wins here mostly due to CUDA graphs, and most of the work needed is operation fusion on the llama.cpp side. Prompt processing is a lot more matmul-heavy; extensions like NV_coopmat2 will help with that (the NVIDIA Vulkan driver already uses it in the above), but there may be further work needed to close the CUDA gap. On AMD, radv (open source) Vulkan is already better at TG than ROCm, but behind in prompt processing. Again, coopmat2-like extensions should help close the gap there.

NVK is starting from a fair way behind, we just pushed support for the most basic coopmat extension and we know there is a long way to go, but I think most of it is achievable as we move forward and I hope to update with new scores on a semi regular basis. We also know we can definitely close the gap on the NVIDIA proprietary Vulkan driver if we apply enough elbow grease and register allocation :-)

I think it might also be worth putting some effort into radv coopmat2 support; I think if radv could overtake ROCm for both of these, it would remove a large piece of complexity from the basic user's stack.

As for Intel, I've no real idea. I hope to get their SYCL implementation up and running, and maybe I should try to get my hands on a B580 card as a better baseline. When I had SYCL running once before, I kinda remember it being 2-4x the Vulkan driver, but there's been development on both sides since.

(The graphs were generated by Gemini.)

24 Jul 2025 10:19pm GMT

Simon Ser: Status update, July 2025

Hi!

Sway's patch to add HDR support has finally been merged! It can be enabled via output <name> hdr on, and requires the Vulkan renderer (which can be selected via WLR_RENDERER=vulkan). Still, lots remains to be done to improve tone mapping and compositing. Help is welcome if you have ideas!
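For example (DP-1 being a placeholder output name):

# in ~/.config/sway/config, with sway started as: WLR_RENDERER=vulkan sway
output DP-1 hdr on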

I've also added support for toplevel tags to Sway. Toplevel tags provide a stable key to select a particular window of a multi-window application: for instance, the favorites window of a Web browser might carry the tag "favorites". Once support is added to various clients, this should be a more robust way to target windows than title regular expressions.

David Turner has contributed support for color-representation-v1 to wlroots. Thanks to this protocol, clients can describe how channels of a buffer should be interpreted, in particular pre-multiplied vs. straight alpha, YUV matrix coefficients and full vs. limited range. The renderer and backends bits haven't been merged yet, but work-in-progress patches have been posted.

Wayland 1.24 has been released, with a new global to free up wl_registry objects, a new "repeated" state for keyboard keys, and other new utility functions. grim 1.5 adds support for ext-image-copy-capture-v1, the new screen capture protocol. grim can now capture individual toplevel windows.

In IRC news, accessibility for the gamja Web client has been improved with ARIA attributes. A pending patch for Goguma adds support for user and channel avatars (via the metadata-2 extension). I've sent a draft for a new user query extension to synchronize opened direct message conversations across clients.

Last, qugu2427 has contributed go-smtp support for DELIVERBY (to ask the server to deliver a message before a timestamp) and MT-PRIORITY (to indicate a priority level for a message).

See you next month!

18 Jul 2025 10:00pm GMT

Sebastian Wick: Blender HDR and the reference white issue

The latest alpha of the upcoming Blender 5.0 release comes with High Dynamic Range (HDR) support for Linux on Wayland which will, if everything works out, make it into the final Blender 5.0 release on October 1, 2025. The post on the developer forum comes with instructions on how to enable the experimental support and how to test it.

If you are using Fedora Workstation 42, which ships GNOME version 48, everything is already included to run Blender with HDR. All that is required is an HDR compatible display and graphics driver, and turning on HDR in the Display Settings.

It's been a lot of personal blood, sweat and tears, paid for by Red Hat, across the Linux graphics stack over the last few years to enable applications like Blender to add HDR support: from kernel work, like helping to get the HDR mode working on Intel laptops and improving the Colorspace and HDR_OUTPUT_METADATA KMS properties, to creating a new library for EDID and DisplayID parsing, and helping with wiring things up in Vulkan.

I designed the active color management paradigm for Wayland compositors, figured out how to properly support HDR, created two wayland protocols to let clients and compositors communicate the necessary information for active color management, and created documentation around all things color in FOSS graphics. This would have also been impossible without Pekka Paalanen from Collabora and all the other people I can't possibly list exhaustively.

For GNOME I implemented the new API design in mutter (the GNOME Shell compositor), and helped my colleagues to support HDR in GTK.

Now that everything is shipping, applications are starting to make use of the new functionality. To see why Blender targeted Linux on Wayland, we will dig a bit into some of the details of HDR!

HDR, Vulkan and the reference white level

Blender's HDR implementation relies on Vulkan's VkColorSpaceKHR, which allows applications to specify the color space of their swap chain, enabling proper HDR rendering pipeline integration. The key color space in question is VK_COLOR_SPACE_HDR10_ST2084_EXT, which corresponds to the HDR10 standard using the ST.2084 (PQ) transfer function.

However, there's a critical challenge with this Vulkan color space definition: it has an undefined reference white level.

Reference white indicates the luminance or signal level at which a diffuse white object (such as a sheet of paper, or the white parts of a UI) appears in an image. If images with different reference white levels end up at different signal levels in a composited image, the result is that "white" in one of the images is still perceived as white, while the "white" from the other image is now perceived as gray. If you have ever scrolled through Instagram on an iPhone or played an HDR game on Windows, you will probably have noticed this effect.

The solution to this issue is called anchoring. The reference white level of all images needs to be normalized in order for "white" ending up on the same signal level in the composited image.

Another issue with the reference white level, specific to PQ, is the prevalent myth that the absolute luminance of a PQ signal must be replicated on the actual display the user is viewing the content on. PQ is a bit of a weird transfer characteristic because any given signal level corresponds to an absolute luminance with the unit cd/m² (also known as nit). However, the absolute luminance is only meaningful for the reference viewing environment! If an image is viewed in the reference viewing environment of ITU-R BT.2100 (essentially a dark room), and an image signal of 203 nits is shown at 203 nits on the display, the image appears as the artist intended. The same is not true when the same image is viewed on a phone with the summer sun blasting on the screen from behind.

PQ is no different from other transfer characteristics in that the reference white level needs to be anchored, and that the anchoring point does not have to correspond to the luminance values that the image encodes.

Coming back to the Vulkan color space definition, VK_COLOR_SPACE_HDR10_ST2084_EXT is described as: "HDR10 (BT2020) color space, encoded according to SMPTE ST2084 Perceptual Quantizer (PQ) specification". Neither ITU-R BT.2020 (primary chromaticities), nor ST.2084 (transfer characteristics), nor the closely related ITU-R BT.2100 define the reference white level. In practice, the reference level of 203 cd/m² from ITU-R BT.2408 ("Suggested guidance for operational practices in high dynamic range television production") is used. Notably, this is not specified in the Vulkan definition of VK_COLOR_SPACE_HDR10_ST2084_EXT.
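To make the anchoring concrete, here is a small numeric sketch using the ST.2084 constants: it rescales a PQ-encoded sample so that the 203 cd/m² reference white lands at whatever luminance suits the actual viewing environment. A real compositor does this per pixel, in linear light, alongside gamut mapping; this is only the arithmetic:

M1, M2 = 1305 / 8192, 2523 / 32          # SMPTE ST.2084 exponents
C1, C2, C3 = 107 / 128, 2413 / 128, 2392 / 128

def pq_to_nits(e):
    p = e ** (1 / M2)
    return 10000 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)

def nits_to_pq(y):
    p = (y / 10000) ** M1
    return ((C1 + C2 * p) / (1 + C3 * p)) ** M2

def anchor(e, display_ref_white, signal_ref_white=203.0):
    """Re-encode a PQ sample, scaling in linear light around reference white."""
    return nits_to_pq(pq_to_nits(e) * display_ref_white / signal_ref_white)

# Signal reference white now comes out at 400 nits on this display:
print(pq_to_nits(anchor(nits_to_pq(203.0), display_ref_white=400.0)))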

The consequences of this? On almost all platforms, VK_COLOR_SPACE_HDR10_ST2084_EXT implicitly means that the image the application submits to the presentation engine (what we call the compositor in the Linux world) is assumed to have a reference white level of 203 cd/m², and the presentation engine adjusts the signal in such a way that the reference white level of the composited image ends up at a signal value that is appropriate for the actual viewing environment of the user. On GNOME, the way to control this currently is the "HDR Brightness" slider in the Display Settings, but it will become the regular screen brightness slider in the Quick Settings menu.

On Windows, the misunderstanding that a PQ signal value must be replicated one-to-one on the actual display has been immortalized in the APIs. It was only when support for HDR was added to laptops that this decision was revisited, but changing the existing APIs was already impossible at that point. Their solution was to expose the reference white level in the Win32 API and task applications with continuously querying the level and adjusting the image to match. Few applications actually do this, with most games providing a built-in slider instead.

The reference white level of VK_COLOR_SPACE_HDR10_ST2084_EXT on Windows is essentially a continuously changing value that needs to be queried from Windows APIs outside of Vulkan.

This has two implications: applications have to track a changing reference white level via platform APIs outside of Vulkan, and the same Vulkan color space ends up meaning different things on different platforms.

While the cross-platform issue is solvable, and something we're working on, the way Windows works also means that the cross-platform API might become harder to use because we cannot change the underlying Windows mechanisms.

No Windows Support

The result is that Blender currently does not support HDR on Windows.

Jeroen-Bakker explaining the lack of Windows support

The design of the Wayland color-management protocol, and the resulting active color-management paradigm of Wayland compositors, was a good choice, making it easy for developers to do the right thing while also giving them more control if they so choose.

Looking forward

We have managed to transition the compositor model from a dumb blitter to a component which takes an active part in color management; we have image viewers and video players with HDR support; and now we have tools for producing HDR content! While it is extremely exciting for me that we have managed to do this properly, we also have a lot of work ahead of us, some of which I will hopefully tell you about in a future blog post!

13 Jul 2025 2:26pm GMT

04 Jul 2025

feedplanet.freedesktop.org

Hans de Goede: Recovering a FP2 which gives "flash write failure" errors

This blog post describes my successful OS re-install on a Fairphone 2 which was giving "flash write failure" errors when flashing it with fastboot via the flash_FP2_factory.sh script. I'm writing down my recovery steps in case they are useful for anyone else.

I believe that this is caused by the bootloader code which implements fastboot not having the ability to retry recoverable eMMC errors. It is still possible to write the eMMC from Linux, which can retry these errors.

So we can recover by directly fastboot-ing a recovery.img and then flashing things over adb.



04 Jul 2025 4:19pm GMT

01 Jul 2025

feedplanet.freedesktop.org

Dave Airlie (blogspot): nvk: blackwell support

Blog posts are like buses sometimes...

I've spent time over the last month enabling Blackwell support in NVK, the Mesa Vulkan driver for NVIDIA GPUs. Faith from Collabora, the NVK maintainer, has cleaned up and merged all the major pieces of this work and landed them in Mesa this week. Mesa 25.2 should ship with a functioning NVK on Blackwell. The code currently in Mesa main passes all tests in the Vulkan CTS.

Quick summary of the major fun points:

Ben @ NVIDIA had done the initial kernel bringup to the r570 firmware in the nouveau driver. I worked with Ben on solidifying that work and ironing out a bunch of memory leaks and regressions that had snuck in.

Once the kernel was stable, there were a number of differences between Ada and Blackwell that needed to be resolved. Thanks to Faith, Mel and Mohamed for their help, and NVIDIA for providing headers and other info.

I did most of the work on a GB203 laptop and a desktop 5080.

1. Instruction encoding: a bunch of instructions changed how they were encoded. Mel helped sort out most of those early on.

2. Compute/QMD: the QMD, which is used to launch compute shaders, has a new encoding. NVIDIA released the official QMD headers, which made this easier in the end.

3. Texture headers: texture headers are encoded differently from Hopper onwards, so we had to use new NVIDIA headers to encode them properly.

4. Depth/Stencil: NVIDIA added support for separate depth/stencil planes, which also has some knock-on effects on surface layouts.

5. Surface layout changes: NVIDIA attaches a memory kind to memory allocations; due to changes in Blackwell, they now use a generic kind for all allocations. You no longer know the internal bpp-dependent layout of the surfaces, which means changes to the dma-copy engine to provide that info. It also means we have some modifier changes to cook with NVIDIA over the next few weeks, at least for 8/16 bpp surfaces. Mohamed helped get this work and host image copy support done.

6. One thing we haven't merged is bound texture support. Currently Blackwell is using bindless textures, which might be a little slower. Due to changes in the texture instruction encoding, you have to load texture handles into intermediate uniform registers before using them as bound handles. This causes a lot of fun with flow control and with when you can spill uniform registers. I've written a few attempts at using bound textures, so we understand how to use them; we just have some compiler issues to maybe get it across the line.

7. Proper instruction scheduling isn't landed yet. I have a spreadsheet with all the figures and have started typing, so I will try to get that into an MR before I take some holidays.

01 Jul 2025 10:20am GMT

Dave Airlie (blogspot): radv: VK_KHR_video_encode_av1 support

I should have mentioned this here a week ago. The Vulkan AV1 encode extension has been out for a while, and I'd done the initial work on enabling it with radv on AMD GPUs. I then left it in a branch, which Benjamin from AMD picked up, fixing a bunch of bugs, and then we both got distracted. I realised when doing VP9 that it hadn't landed, so I did a bit of cleanup. Then David from AMD picked it up, carried it over the last mile, and it got merged last week.

So radv on supported hardware now supports all Vulkan decode/encode formats currently available.

01 Jul 2025 9:27am GMT

Mike Blumenkrantz: Behind Schedule

Timelines

It's hot out. I know this because Big Triangle allowed me a peek through my three-sided window for good behavior, and all the pixels were red. Sure am glad I'm inside.

Today's a new day in a new month, which means it's time to talk about new GL stuff. I'm allowed to do that once in a while, even though GL stuff is never actually new. In this post we're going to be looking at GL_NV_timeline_semaphore, an extension everyone has definitely heard of.

Mesa has supported GL_EXT_external_objects for a long while, and it's no exaggeration to say that this is the reference implementation: there are no proprietary drivers of which I'm aware that can pass the super-strict piglit tests we've accumulated over the years. Yes, that includes Green Triangle. Also Red Triangle, but we knew that already; it's in the name.

This MR adds support for importing Vulkan timeline semaphores into GL and using them, which further improves interop-reliant workflows by eliminating binary semaphore requirements. Zink supports it anywhere that additionally supports VK_KHR_timeline_semaphore, which is to say that any platform capable of supporting the base external objects spec will also support this.
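
For a rough idea of what the application-facing interop looks like, here is a hedged sketch: the Vulkan side creates a timeline semaphore and exports it as an opaque fd (plain VK_KHR_timeline_semaphore plus VK_KHR_external_semaphore_fd), and the GL side imports it with GL_EXT_semaphore_fd. How the timeline type and wait/signal values are expressed on the GL side is specific to GL_NV_timeline_semaphore and elided here:

    /* Vulkan side: create an exportable timeline semaphore. */
    VkSemaphoreTypeCreateInfo type_info = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_TYPE_CREATE_INFO,
        .semaphoreType = VK_SEMAPHORE_TYPE_TIMELINE,
        .initialValue = 0,
    };
    VkExportSemaphoreCreateInfo export_info = {
        .sType = VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO,
        .pNext = &type_info,
        .handleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT,
    };
    VkSemaphoreCreateInfo create_info = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO,
        .pNext = &export_info,
    };
    VkSemaphore sem;
    vkCreateSemaphore(device, &create_info, NULL, &sem);

    VkSemaphoreGetFdInfoKHR fd_info = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_GET_FD_INFO_KHR,
        .semaphore = sem,
        .handleType = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT,
    };
    int fd;
    vkGetSemaphoreFdKHR(device, &fd_info, &fd);

    /* GL side (GL_EXT_semaphore / GL_EXT_semaphore_fd): import the fd. */
    GLuint gl_sem;
    glGenSemaphoresEXT(1, &gl_sem);
    glImportSemaphoreFdEXT(gl_sem, GL_HANDLE_TYPE_OPAQUE_FD_EXT, fd);

The win over binary semaphores is that one semaphore plus monotonically increasing values replaces a pool of single-shot semaphores.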

For testing, we get to have even more fun with the industry-standard ping-pong test originally contributed by @gfxstrand. This verifies that timeline operations function as expected on every side of the API divide.

Next up: more optimizations. How fast is too fast?

01 Jul 2025 12:00am GMT

25 Jun 2025

feedplanet.freedesktop.org

Tollef Fog Heen: Pronoun support in userdir-ldap

Debian uses LDAP for storing information about users, hosts and other objects. The wrapping around this is called userdir-ldap, or ud-ldap for short. It provides a mail gateway, web UI and a couple of schemas for different object types.

Back in late 2018 and early 2019, we (DSA) removed support for ISO 5218 in userdir-ldap, and removed the corresponding data. This made some people upset, since they were using that information, as imprecise as it was, to infer people's pronouns. ISO 5218 has four values for sex: unknown, male, female and N/A. This might have been acceptable when the standard was new (in 1976), but it wasn't acceptable any longer in 2018.

A couple of days ago, I finally got around to adding support to userdir-ldap to let people specify their pronouns. As it should be, it's a free-form text field. (We don't have localised fields in LDAP, so it probably makes sense for people to put the English version of their pronouns there, but the software does not try to control that.)

So far, it's only exposed through the LDAP gateway, not in the web UI.

If you're a Debian developer, you can set your pronouns using

echo "pronouns: he/him" | gpg --clearsign | mail changes@db.debian.org

I see that four people have already done so in the time I've taken to write this post.

25 Jun 2025 8:00pm GMT

23 Jun 2025

feedplanet.freedesktop.org

Hans de Goede: Is Copilot useful for kernel patch review?

Patch review is an important and useful part of the kernel development process, but it is also a time-consuming part. To see if I could save some human reviewer time, I've been pushing kernel patch series to a branch on GitHub, creating a pull request for the branch and then assigning it to Copilot for review. The idea being that I would fix any issues Copilot catches before posting the series upstream, saving a human reviewer from having to catch them.

I've done this for 5 patch series: one, two, three, four, five, totalling 53 patches. Click the numbers to see the pull requests and Copilot's reviews.

Unfortunately the results are not great. On 53 patches, Copilot had 4 low-confidence comments, which were not useful, and 3 normal comments. 2 of the normal comments were on the power-supply fwnode series; one was about spelling degrees Celcius as degrees Celsius instead, which is the single valid remark. The other remark was about re-assigning a variable without freeing it first, but Copilot missed that the re-assignment was to another variable, since this happened in a different scope. The third normal comment (here) was about as useless as they can come.

To be fair these were all patch-series written by me and then already self-reviewed and deemed ready for upstream posting before I asked Copilot to review them.

As another experiment I did one final pull request with a couple of WIP patches to add USBIO support from Intel. Copilot generated 3 normal comments here, all 3 of which are valid, and one of them catches a real bug. Still, given the WIP state of this case and the fact that my own review has found a whole lot more than just this, including the need for a bunch of refactoring, the results of this Copilot review are also disappointing IMHO.

Copilot also automatically generates summaries of the changes in the pull requests. At first glance these look useful, e.g. for a cover letter for a patch set, but they are often full of half-truths, so at a minimum they need some very careful editing / correcting before they can be used.

My personal conclusion is that running patch-sets through Copilot before posting them on the list is not worth the effort.


23 Jun 2025 1:46pm GMT

Tvrtko Ursulin: Fair(er) DRM GPU scheduler

Introduction #

The DRM GPU scheduler is a shared Direct Rendering Manager (DRM) Linux Kernel level component used by a number of GPU drivers for managing job submissions from multiple rendering contexts to the hardware. Some of the basic functions it can provide are dependency resolving, timeout detection, and most importantly for this article, scheduling algorithms whose essential purpose is picking the next queued unit of work to execute once there is capacity on the GPU.

Different kernel drivers use the scheduler in slightly different ways. Some only need the dependency resolving and timeout detection parts, with the actual scheduling happening in proprietary firmware, while others rely on the scheduler's algorithms for choosing what to run next. The latter group is what the work described here aims to improve.

More details about the other functionality provided by the scheduler, including some low level implementation details, are available in the generated kernel documentation repository[1].

Basic concepts and terminology #

Three DRM scheduler data structures (or objects) are relevant for this topic: the scheduler, scheduling entities and jobs.

First we have the scheduler itself, which usually corresponds to some hardware unit which can execute certain types of work. For example, the render engine is often a single hardware instance in a GPU and needs arbitration for multiple clients to be able to use it simultaneously.

Then there are scheduling entities, or entities in short, which broadly speaking correspond to userspace rendering contexts. Typically, when a userspace client opens a render node, one such rendering context is created. Some drivers also allow userspace to create multiple contexts per open file.

Finally there are jobs, which represent units of work submitted from userspace into the kernel. These are typically created as a result of userspace doing ioctl(2) operations, which are specific to the driver in question.

Jobs are usually associated with entities, and entities are then executed by schedulers. Each scheduler instance will have a list of runnable entities (entities with at least one queued job), and when the GPU is available to execute something it will need to pick one of them.

Typically every userspace client will submit at least one such job per rendered frame and the desktop compositor may issue one or more to render the final screen image. Hence, on a busy graphical desktop, we can find dozens of active entities submitting multiple GPU jobs, sixty or more times per second.
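
For a sense of how a driver wires these three objects together, a rough sketch follows; the exact function signatures have varied across kernel versions, so treat this as the general shape rather than drop-in code (the create_job() helper is hypothetical):

    /* One scheduler per hardware engine, one entity per rendering context,
     * one job per submitted unit of work. Signatures are approximate. */
    struct drm_gpu_scheduler sched;                   /* set up at driver init */
    struct drm_gpu_scheduler *sched_list[] = { &sched };

    struct drm_sched_entity entity;                   /* e.g. at render node open */
    drm_sched_entity_init(&entity, DRM_SCHED_PRIORITY_NORMAL,
                          sched_list, 1, NULL);

    struct my_job *job = create_job();                /* hypothetical helper */
    drm_sched_job_init(&job->base, &entity, 1, NULL); /* args vary by version */
    drm_sched_job_arm(&job->base);
    drm_sched_entity_push_job(&job->base);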

The current scheduling algorithm #

In order to select the next entity to run, the scheduler defaults to the First In First Out (FIFO) mode of operation, where the selection criterion is job submit time.

The FIFO algorithm in general has some well-known disadvantages around fairness and latency. Also, because the selection criterion is based on job submit time, it couples the selection with the CPU scheduler, which is not desirable because it creates an artificial coupling between different schedulers, different sets of tasks (CPU processes and GPU tasks), and different hardware blocks.

This is further amplified by the lack of guarantee that clients are submitting jobs with equal pacing (not all clients may be synchronised to the display refresh rate, or not all may be able to maintain it), the fact that their per-frame submissions may consist of an unequal number of jobs, and last but not least the lack of preemption support. The latter is true both for the DRM scheduler itself and for many GPUs in their hardware capabilities.

Apart from uneven GPU time distribution, the end result of the FIFO algorithm picking a sub-optimal entity can be dropped frames and choppy rendering.

Round-robin backup algorithm #

Apart from the default FIFO scheduling algorithm, the scheduler also implements the round-robin (RR) strategy, which can be selected as an alternative at kernel boot time via a kernel argument. Round-robin, however, suffers from its own set of problems.

Whereas round-robin is typically considered a fair algorithm when used in systems with preemption support and the ability to assign fixed execution quanta, in the context of GPU scheduling this fairness property does not hold. Here, quanta are defined by userspace job submissions and, as mentioned before, the number of submitted jobs per rendered frame can differ between clients.

The final result can again be unfair distribution of GPU time and missed deadlines.

In fact, round-robin was the initial and only algorithm until FIFO was added to resolve some of these issues. More can be read in the relevant kernel commit.[2]

Priority starvation issues #

Another issue in the current scheduler design is the priority queues and the strict priority order execution.

Priority queues serve the purpose of implementing support for entity priority, which usually maps to userspace constructs such as VK_EXT_global_priority and similar. If we look at the wording for this specific Vulkan extension, it is described like this[3]:

The driver implementation *will attempt* to skew hardware resource allocation in favour of the higher-priority task. Therefore, higher-priority work *may retain similar* latency and throughput characteristics even if the system is congested with lower priority work.

As emphasised, the wording gives implementations leeway to not be entirely strict, while the current scheduler implementation only executes lower priorities when the higher-priority queues are all empty. This over-strictness can lead to complete starvation of the lower priorities.

Fair(er) algorithm #

To solve both the issue of the weak scheduling algorithm and the issue of priority starvation we tried an algorithm inspired by the Linux kernel's original Completely Fair Scheduler (CFS)[4].

With this algorithm, the next entity to run will be the one with the least virtual GPU time spent so far, where virtual GPU time is calculated from the real GPU time, scaled by a factor based on the entity's priority.

Since the scheduler already manages an rbtree[5] of entities, sorted by the job submit timestamp, we were able to simply replace that timestamp with the calculated virtual GPU time.

When an entity has nothing more to run, it gets removed from the tree and we store the delta between its virtual GPU time and the top of the queue. When the entity re-enters the tree with a fresh submission, this delta is used to give it a new relative position with respect to the current head of the queue.
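
In pseudo-C, the core of the bookkeeping could look roughly like this (an illustrative sketch with made-up names such as BASE_WEIGHT and queue_min_vruntime(); it is not the code from the actual patch series):

    /* Scale real GPU time into virtual GPU time; a higher priority weight
     * makes virtual time grow more slowly, yielding a larger GPU share. */
    static u64 to_virtual_time(u64 gpu_time_ns, u32 weight)
    {
        return div_u64(gpu_time_ns * BASE_WEIGHT, weight);
    }

    /* On job completion: charge the entity for the GPU time it used. */
    entity->vruntime += to_virtual_time(job_runtime_ns, entity->weight);

    /* On leaving the tree: remember the lag relative to the queue head. */
    entity->vlag = entity->vruntime - queue_min_vruntime(rq);

    /* On re-entering with a fresh submission: reposition relative to the
     * current head, so the entity is neither punished nor rewarded for
     * having been idle. */
    entity->vruntime = queue_min_vruntime(rq) + entity->vlag;

Entity selection is then simply the tree minimum: the entity with the smallest virtual GPU time runs next.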

Because the scheduler does not currently track GPU time spent per entity, this is something that we needed to add to make this possible. It did not pose a significant challenge, however, apart from a slight weakness where the tracked utilisation can lag slightly behind the actual numbers due to some DRM scheduler internal design choices. But that is a different and wider topic which is out of the intended scope for this write-up.

The virtual GPU time selection criterion largely decouples the scheduling decisions from job submission times, and to an extent from submission patterns too, and allows for more fair GPU time distribution. With the caveat that it is still not entirely fair because, as mentioned before, neither the DRM scheduler nor many GPUs support preemption, which would be required for more fairness.

Solving the priority starvation #

Because priority is now consolidated into a single entity selection criterion, we were also able to remove the per-priority queues and eliminate priority-based starvation. All entities are now in a single run queue, sorted by virtual GPU time, and the relative distribution of GPU time between entities of different priorities is controlled by the scaling factor which converts real GPU time into virtual GPU time.

Code base simplification #

Another benefit of being able to remove the per-priority run queues is a simpler code base. Going further, if we can establish that the fair scheduling algorithm has no regressions compared to FIFO and RR, we can also remove those two, which consolidates the scheduler further. So far no regressions have been identified.

Real world examples #

As a first example, we set up three demanding graphical clients, one of which was set to run with low priority (VK_QUEUE_GLOBAL_PRIORITY_LOW_EXT).

One client is the Unigine Heaven benchmark[6], which simulates a game, while the other two are instances of the deferredmultisampling Vulkan demo from Sascha Willems[7], modified to support running with a user-specified global priority. Those two simulate very heavy GPU load running simultaneously with the game.

All tests are run on a Valve Steam Deck OLED with an AMD integrated GPU.

First we try the current FIFO based scheduler and we monitor the GPU utilisation using the gputop[8] tool. We can observe two things:

  1. That the distribution of GPU time between the normal priority clients is not equal.
  2. That the low priority client is not getting any GPU time.

FIFO scheduling uneven GPU distribution and low priority starvation

Switching to the CFS inspired (fair) scheduler the situation changes drastically:

  1. GPU time distribution between normal priority clients is much closer together.
  2. Low priority client is not starved, but receiving a small share of the GPU.

New scheduler even GPU distribution and no low priority starvation

Note that the absolute numbers are not static but represent a trend.

This proves that the new algorithm can make the low priority level useful for running heavy GPU tasks in the background, similar to what can be done on the CPU side of things using nice(1) process priorities.

Synthetic tests #

Apart from experimenting with real-world workloads, another piece of functionality we implemented in the scope of this work is a collection of simulated workloads, implemented as kernel unit tests based on the recently merged DRM scheduler mock scheduler unit test framework[9][10]. The idea behind these is to make it easy for developers to check for scheduling regressions when modifying the code, without the need to set up sometimes complicated testing environments.

Let us look at a few examples of how the new scheduler compares with FIFO when using those simulated workloads.

First an easy, albeit exaggerated, illustration of priority starvation improvements.

Solved low priority starvation

Here we have a normal priority client and a low priority client submitting many jobs asynchronously (only waiting for the submission to finish after having submitted the last job). We look at the number of outstanding jobs (queue depth, or qd) on the Y axis and the passage of time on the X axis. With the FIFO scheduler (blue) we see that the low priority client makes no progress whatsoever until all submissions of the normal client have completed. Switching to the CFS-inspired scheduler (red), this improves dramatically and we can see the low priority client making slow but steady progress from the start.

Second example is about fairness where two clients are of equal priority:

Fair GPU time distribution

Here the interesting observation is that the new scheduler's graphed lines are much straighter. This means that the GPU time distribution is more equal, or fair, because the selection criterion is decoupled from job submission time and based instead on each client's GPU time utilisation.

For the final set of test workloads we will look at the rate of progress (aka frames per second, or fps) between different clients.

In both cases we have one client representing a heavy graphical load, and one representing an interactive, lightweight client. They run in parallel, but we will only look at the interactive client in the graphs, because the goal is to see what frame rate the interactive client can achieve when competing for the GPU. In other words, we use that as a proxy for assessing the user experience of using the desktop while there is simultaneous heavy GPU usage from another client.

The interactive client is set up to spend 1ms of GPU time in every 10ms period, resulting in an effective GPU load of 10%.

First test is with a heavy client wanting to utilise 75% of the GPU by submitting three 2.5ms jobs back to back, repeating that cycle every 10ms.

Interactive client vs heavy load

We can see that the average frame rate the interactive client achieves with the new scheduler is much higher than under the current FIFO algorithm.

For the second test we made the heavy GPU load client even more demanding by making it try to completely monopolise the GPU. It now submits four 50ms jobs back to back, backing off for only 1us before repeating the loop.

Interactive client vs very heavy load

Again the new scheduler is able to give significantly more GPU time to the interactive client compared to what FIFO is able to do.

Conclusions #

From all the above it appears that the experiment was successful. We were able to simplify the code base, solve the priority starvation and improve scheduling fairness and GPU time allocation for interactive clients. No scheduling regressions have been identified to date.

The complete patch series implementing these changes is available at[11].

Potential for further refinements #

Because this work has simplified the scheduler code base and introduced entity GPU time tracking, it also opens up possibilities for future experiments with other modern algorithms. One example could be an EEVDF[12]-inspired scheduler, given that that algorithm has recently improved upon the kernel's CPU scheduler and looks potentially promising, as it combines fairness and latency in one algorithm.

Connection with the DRM scheduling cgroup controller proposal #

Another interesting angle is that, as this work implements scheduling based on virtual GPU time, which as a reminder is calculated by scaling the real time by a factor based on entity priority, it can be tied really elegantly to the previously proposed DRM scheduling cgroup controller.

There we already had group weights, which can now be used when scaling the virtual time, leading to a simple but effective cgroup controller. This has already been prototyped[13], but more on that in a following blog post.

References #


  1. https://docs.kernel.org/gpu/drm-mm.html#gpu-scheduler ↩︎

  2. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.16-rc2&id=08fb97de03aa2205c6791301bd83a095abc1949c ↩︎

  3. https://registry.khronos.org/vulkan/specs/latest/man/html/VK_EXT_global_priority.html ↩︎

  4. https://en.wikipedia.org/wiki/Completely_Fair_Scheduler ↩︎

  5. https://en.wikipedia.org/wiki/Red-black_tree ↩︎

  6. https://benchmark.unigine.com/heaven ↩︎

  7. https://github.com/SaschaWillems/Vulkan ↩︎

  8. https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/blob/master/tools/gputop.c?ref_type=heads ↩︎

  9. https://gitlab.freedesktop.org/tursulin/kernel/-/commit/486bdcac6121cfc5356ab75641fc702e41324e27 ↩︎

  10. https://gitlab.freedesktop.org/tursulin/kernel/-/commit/50898d37f652b1f26e9dac225ecd86b3215a4558 ↩︎

  11. https://gitlab.freedesktop.org/tursulin/kernel/-/tree/drm-sched-cfs?ref_type=heads ↩︎

  12. https://lwn.net/Articles/925371/ ↩︎

  13. https://lore.kernel.org/dri-devel/20250502123256.50540-1-tvrtko.ursulin@igalia.com/ ↩︎

23 Jun 2025 12:00am GMT

20 Jun 2025

feedplanet.freedesktop.org

Simon Ser: Status update, June 2025

Hi all!

This month, two large patch series have been merged into wlroots! The first one is toplevel capture, which will allow tools such as grim and xdg-desktop-portal-wlr to capture the contents of a specific window. The wlroots side is super simple because wlroots just sends an event when a client requests to capture a toplevel. Producing frames for a particular toplevel from scratch would be pretty cumbersome for a compositor to implement, so wlroots also exposes a helper to create a capture source from an arbitrary scene-graph node. The compositor can pass the toplevel's scene-graph node to this helper to implement toplevel capture. This is pretty flexible and leaves a lot of freedom to the compositor, making it easy to customize the capture result and to add support for other kinds of capture targets! This also handles popups well (which need a composition step) and off-screen toplevels (which would otherwise stop rendering). The grim and xdg-desktop-portal-wlr sides are not ready yet, but hopefully they shouldn't be too much work. Still missing are cursors and a fully transparent background color (right now the background is black).

The other large patch series is color management support (part 1 was merged a while back, part 2, part 3 and part 4 just got merged). This was very challenging because one needs to learn a lot about colors before even understanding how color management should be implemented from a high-level architectural point-of-view. Sway isn't quite ready yet, we're missing one last piece of the puzzle to finish up the scene-graph integration. Thanks a lot to Kenny Levinsen, M. Stoeckl and Félix Poisot for going through this big pile of patches and spotting all of the bugs!

Sway 1.11 finally got released, with all of the wlroots 0.19 niceties. I've also started the Wayland 1.24 release cycle, hopefully the final release can go out very soon. Speaking of releases, I've cut libdrm 2.4.125 with updated kernel headers, an upgraded CI and a GitLab repository rename ("drm" was very confusing and got mixed up with the kernel side). Last, drm_info 2.8.0 adds Apple and MediaTek format modifiers and support for the IN_FORMATS_ASYNC property (contributed by Intel).

David Turner has contributed three optimization patches for libliftoff, with more in the pipeline. Leandro Ribeiro and Sebastian Wick have upstreamed libdisplay-info support for HDR10+ and Dolby Vision vendor-specific video blocks, with HDMI, HDMI Forum and HDMI Forum Sink Capability on the way (yes, these are all separate blocks!).

I've migrated the wayland-devel mailing list to a new server powered by Mailman 3 and public-inbox. The old Mailman 2 setup started showing its age more than a decade ago, and it was about time we started a migration. I've started making plans for migrating other mailing lists; hopefully we'll be able to decommission Mailman 2 in the coming months. Next we'll need to migrate the postfix server over too, but one step at a time.

delthas has plumbed replies and reactions in Goguma's compact mode. I've taken some time to clean up soju's docs: the Getting started page has been revamped, the contrib/ directory has an index page, and man pages are linkified on the website. Let me know if you have ideas to improve our docs further!

As part of $dayjob I took part in Hackdays 2025, a hackathon organized by DINUM to work on La Suite (open-source productivity software). With my team, we worked on adding support for importing Notion documents into Docs. It was great meeting a lot of European open-source enthusiasts, and I hope our work can eventually be merged!

Phew, this status update ended up being larger than I expected! Perhaps thanks to getting the wlroots and Sway releases out of the way, and spending less time on triaging issues and investigating bugs. Perhaps thanks to a lot of stuff getting merged, after slowly accumulating and growing patches for months. Either way, see you next month for another status update!

20 Jun 2025 10:00pm GMT

19 Jun 2025

feedplanet.freedesktop.org

Peter Hutterer: libinput and tablet tool eraser buttons

This is, to some degree, a followup to this 2014 post. The TLDR of that is that, many a moon ago, the corporate overlords at Microsoft that decide all PC hardware behaviour decreed that the best way to handle an eraser emulation on a stylus is by having a button that is hardcoded in the firmware to, upon press, send a proximity out event for the pen followed by a proximity in event for the eraser tool. Upon release, they dogma'd, said eraser button shall virtually move the eraser out of proximity followed by the pen coming back into proximity. Or, in other words, the pen simulates being inverted to use the eraser, at the push of a button. Truly the future, back in the happy times of the mid 20-teens.

In a world where you don't want to update your software for a new hardware feature, this of course makes perfect sense. In a world where you write software to handle such hardware features, significantly less so.

Anyway, it is now 11 years later, the happy 2010s are over, and Benjamin and I have fixed this very issue in a few udev-hid-bpf programs but I wanted something that's a) more generic and b) configurable by the user. Somehow I am still convinced that disabling the eraser button at the udev-hid-bpf level will make users that use said button angry and, dear $deity, we can't have angry users, can we? So many angry people out there anyway, let's not add to that.

To get there, libinput's guts had to be changed. Previously libinput would read the kernel events, update the tablet state struct and then generate events based on various state changes. This of course works great when you e.g. get a button toggle; it doesn't work quite as well when your state change was one or two event frames ago (because prox-out of one tool and prox-in of another tool are at least 2 events). Extracting that older state change was like swapping the type of meatballs in an ikea meal after it's been served: doable in theory, but very messy.

Long story short, libinput now has an internal plugin system that can modify the evdev event stream as it comes in. It works like a pipeline: the events are passed from the kernel to the first plugin, modified, passed to the next plugin, etc. Eventually the last plugin is our actual tablet backend, which will update tablet state, generate libinput events, and generally be grateful about having fewer quirks to worry about. With this architecture we can hold back the proximity events and filter them (if the eraser comes into proximity) or replay them (if the eraser does not come into proximity). The tablet backend is none the wiser; it either sees proximity events when those are valid, or it sees a button event (depending on configuration[1]).
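
In spirit, the hold-and-replay logic of such a plugin looks something like the following; this is a purely illustrative sketch with made-up names, since libinput's internal plugin API is not public and need not look like this:

    /* Illustrative only: a pipeline stage that intercepts the firmware's
     * fake prox-out/prox-in sequence and turns it into a button event. */
    static void eraser_button_plugin(struct plugin *self,
                                     struct evdev_frame *frame,
                                     void (*next)(struct evdev_frame *))
    {
        if (is_pen_prox_out(frame)) {
            hold_frame(self, frame);            /* don't forward yet */
            return;
        }
        if (is_eraser_prox_in(frame) && has_held_frame(self)) {
            discard_held_frame(self);           /* it was the eraser button */
            next(button_frame(BTN_STYLUS3, 1)); /* synthesize a press */
            return;
        }
        replay_held_frames(self, next);         /* unrelated event: replay */
        next(frame);
    }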

This architecture approach is so successful that I have now switched a bunch of other internal features over to using that internal infrastructure (proximity timers, button debouncing, etc.). And of course it laid the groundwork for the (presumably highly) anticipated Lua plugin support. Either way, happy times. For a bit. Because for those not needing the eraser feature, we've just increased your available tool button count by 100%[2]; now there's a headline for tech journalists that just blindly copy claims from blog posts.

[1] Since this is a bit wordy, the libinput API call is just libinput_tablet_tool_config_eraser_button_set_button()
[2] A very small number of styli have two buttons and an eraser button so those only get what, 50% increase? Anyway, that would make for a less clickbaity headline so let's handwave those away.

19 Jun 2025 1:44am GMT