14 Sep 2024
planet.freedesktop.org
Hans de Goede: Fedora plymouth boot splash not showing on systems with AMD GPUs
Recently there have been a number of reports (bug 2183743, bug 2276698, bug 2283839, bug 2312355) about the plymouth boot splash not showing properly on PCs using AMD GPUs.
The problem with plymouth and AMD GPUs is that the amdgpu driver is a really big driver, which can easily take up to 10 seconds to load on older PCs. This delay may cause plymouth to time out while waiting for the GPU to be initialized, causing it to fall back to the 3-dot text-mode boot splash.
There are 2 workarounds for this, depending on the PC's configuration:
1. With older AMD GPUs the radeon driver is actually used to drive the GPU, but even though it is unused the amdgpu driver still loads, slowing things down.
To check if this is the case for your PC, start a terminal in a graphical login session and run "lsmod | grep -E '^radeon|^amdgpu'". This will output something like this:
amdgpu 17829888 0
radeon 2371584 37
The second number after each driver is the usage count. As you can see in this example the amdgpu driver is not used. In this case you can disable loading of the amdgpu driver by adding "modprobe.blacklist=amdgpu" to your kernel commandline:
sudo grubby --update-kernel=ALL --args="modprobe.blacklist=amdgpu"
2. If the amdgpu driver is actually used on your PC, then plymouth not showing can be worked around by telling plymouth to use the simpledrm drm/kms device created from the EFI framebuffer early on boot, rather than waiting for the real GPU driver to load. Note this depends on your PC booting in EFI mode (if the directory /sys/firmware/efi exists, it does). To do this run:
sudo grubby --update-kernel=ALL --args="plymouth.use-simpledrm"
After applying one of these workarounds plymouth should show normally again on boot (and booting should be a bit faster).
14 Sep 2024 1:38pm GMT
13 Sep 2024
Planet GNOME
Juan Pablo Ugarte: Introducing Casilda – A Wayland compositor widget!
I am pleased to introduce the first stable release of Casilda!
A simple Wayland compositor widget for Gtk 4, which can be used to embed other processes' windows in your Gtk 4 application.
It was originally created for Cambalache's workspace using wlroots, a modular library to create Wayland compositors.
Following Wayland tradition, this library is named after my hometown in Santa Fe, Argentina.
License
Casilda is distributed under the GNU Lesser General Public License version 2.1 only.
Where to get it?
Source code lives on GNOME gitlab here
git clone https://gitlab.gnome.org/jpu/casilda.git
Manual installation
This is a regular meson package and can be installed the usual way.
# Configure project in _build directory
meson setup --wipe --prefix=~/.local _build .
# Build and install in ~/.local
ninja -C _build install
How to use it
To add a Wayland compositor to your application, all you have to do is create a CasildaCompositor widget. You can specify which UNIX socket the compositor will listen on for client connections, or let it choose one automatically.
compositor = casilda_compositor_new ("/tmp/casilda-example.sock");
gtk_window_set_child (GTK_WINDOW (window), GTK_WIDGET (compositor));
Once the compositor is running you can connect to it by specifying the socket in the WAYLAND_DISPLAY environment variable.
export GDK_BACKEND=wayland
export WAYLAND_DISPLAY=/tmp/casilda-example.sock
gtk4-demo
API
The API is pretty simple: CasildaCompositor has two properties, socket and bg-color (see the example after this list).
- socket: The UNIX socket file to connect to this compositor (string)
- bg-color: Compositor background color (GdkRGBA)
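For instance, setting the background color programmatically might look like this (a sketch; it assumes bg-color is a regular GObject property holding a GdkRGBA, as described above):

GdkRGBA bg;

/* Parse a color and set it on the compositor created earlier. */
gdk_rgba_parse (&bg, "#241f31");
g_object_set (compositor, "bg-color", &bg, NULL);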
Matrix channel
Have any questions? Come chat with us at #cambalache:gnome.org
Mastodon
Follow me on Mastodon @xjuan to get news related to Casilda and Cambalache development.
Happy coding!
13 Sep 2024 5:36pm GMT
Alice Mikhaylenko: Libadwaita 1.6
Well, it's time for another release.
Last cycle wasn't particularly exciting, only featuring the new dialogs and a few smaller changes, but this one should be more interesting. So let's look at what's new.
Bottom sheet
Last cycle libadwaita got new dialogs, which can be presented as bottom sheets on mobile, and I mentioned that they will also be available as a standalone widget in future - so AdwBottomSheet exists and is public now.
As a standalone widget, bottom sheets work a bit differently from dialogs - they are persistent instead of being destroyed upon closing, more like the sidebar of AdwOverlaySplitView.
They also have a few new features, such as a drag handle or a bottom bar presentation. The latter is useful for apps like music players.
AdwHeaderBar also integrates with bottom sheets - it hides the title when used in a bottom sheet with a drag handle.
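To illustrate, a music-player-style setup could look roughly like this (a sketch: the setter names follow the features described above, and main_view, player_page and mini_player are assumed, pre-built widgets):

GtkWidget *sheet = adw_bottom_sheet_new ();

/* Hypothetical widgets: main content, expanded player, collapsed bar. */
adw_bottom_sheet_set_content (ADW_BOTTOM_SHEET (sheet), main_view);
adw_bottom_sheet_set_sheet (ADW_BOTTOM_SHEET (sheet), player_page);
adw_bottom_sheet_set_bottom_bar (ADW_BOTTOM_SHEET (sheet), mini_player);
adw_bottom_sheet_set_show_drag_handle (ADW_BOTTOM_SHEET (sheet), TRUE);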
Spinner
Libadwaita also has a new spinner widget - AdwSpinner. It both refreshes visuals and addresses various problems with GtkSpinner.
GtkSpinner is a really simple widget. Both the spinner itself and the animation are set in CSS. The spinner is just a symbolic icon, and the animation is a CSS animation. This approach has a few problems, however.
First, the old spinner has a gradient. Symbolic icons don't actually support gradients, so it has to resort to dithering, as Jakub Steiner explained in his blog a few years ago. This works well if the spinner is small enough (16×16 to 32×32), but becomes very noticeable at larger sizes. This means the spinner didn't work well for loading screens or status pages.
Meanwhile, CSS animations are entirely disabled when system animations are off. Usually that makes sense, except here it means the spinner freezes, defeating the entire point of having it (indicating that the app isn't frozen during long operations).
And, while CSS animations are pretty sophisticated, you can only do so much with a single element - so it's literally a spinning icon. elementary OS does a more interesting thing - it spins it in steps, while the icon consists of 12 dashes, so it looks like they change color instead. Even then, more complex animations are impossible.
AdwSpinner avoids all of these issues. Since it's in libadwaita and not in GTK, it can be more opinionated with regard to styling, so instead of using an icon and CSS, it's just custom drawing. And since it's not using CSS animations, it can keep spinning with animations off, and can animate in a more involved way than a simple spinning icon.
It still has a size limit - 64×64 pixels. While it could scale further, we don't really need larger sizes, and capping the size makes it easier to use - to make a loading screen using GtkSpinner, you have to set the :halign and :valign properties to CENTER, as well as the :width-request and :height-request properties to 32. If you miss these steps, the spinner will be either too large or too small, respectively.
Meanwhile, if you just put an AdwSpinner into a large bin, it will look right by default.
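For comparison, a loading screen then reduces to something like this (a sketch assuming the adw_spinner_new() constructor; window is an existing GtkWindow):

/* The spinner centers itself and caps its own size at 64×64,
 * so no halign/valign or size requests are needed. */
GtkWidget *spinner = adw_spinner_new ();
gtk_window_set_child (GTK_WINDOW (window), spinner);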
Oh, and GtkSpinner is invisible by default and you have to set the :spinning property to true as well. This made sense back in the age of foot and dinosaur spinners, where the spinner would stay in place when not animating, but that's not really a thing anymore. (Though Nautilus wasn't actually using GtkSpinner.)
It also didn't help that until this cycle, GtkSpinner would continue to consume CPU cycles even when not visible if the :spinning property was left enabled, so you had to start the spinner in the ::map signal and stop it in ::unmap. That is fixed now, but it was a major source of lag in, say, Epiphany in the past (which had a spinner in every tab, another spinner in every mobile tab switcher row, and another one in the floating bar that shows URLs on hover, copied from Nautilus).
Spinner paintable
In addition to AdwSpinner, there's also AdwSpinnerPaintable. It can be used with GtkImage, any other place that accepts paintables (such as status pages), or just drawn manually. It is a bit more awkward to use than the widget, as it needs to reference another widget in order to animate (since paintables cannot access the frame clock), but it allows using spinners in contexts that wouldn't be possible otherwise.
AdwStatusPage even has a special style for spinner paintables - similar to the .compact style, but applied automatically.
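A status page with a spinner might then be set up like this (a sketch; it assumes adw_spinner_paintable_new() takes the widget whose frame clock drives the animation):

GtkWidget *page = adw_status_page_new ();
/* The page itself doubles as the frame-clock source here. */
AdwSpinnerPaintable *paintable = adw_spinner_paintable_new (page);

adw_status_page_set_paintable (ADW_STATUS_PAGE (page), GDK_PAINTABLE (paintable));
adw_status_page_set_title (ADW_STATUS_PAGE (page), "Loading…");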
Button row
Another widget we have now is AdwButtonRow - a list row that looks more or less like a button. It has a label, optionally icons on either side, and can use the destructive and suggested style classes.
This pattern isn't new - it has been used in mockups for a while (at least as early as 2021) - but it varied quite a bit between different mockups and implementations, so having a standard widget for it wasn't viable. This cycle Jamie Gravendeel and kramo took time to standardize the existing designs into a tangible proposal - so it exists as a standard widget now.
Most of the time these rows aren't meant to be linked together, so AdwPreferencesGroup has a new property: :separate-rows. When enabled, the rows within will appear separately. This is mostly useful for button rows, but also e.g. entry rows. When not using AdwPreferencesGroup, the same effect can be achieved by using the .boxed-list-separate style class instead of .boxed-list.
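Put together, a destructive row in a group with separated rows might look like this (a sketch based on the API described above; group is an existing AdwPreferencesGroup):

GtkWidget *row = adw_button_row_new ();

adw_preferences_row_set_title (ADW_PREFERENCES_ROW (row), "Delete Account");
adw_button_row_set_start_icon_name (ADW_BUTTON_ROW (row), "user-trash-symbolic");
gtk_widget_add_css_class (row, "destructive-action");

/* Show rows separately instead of linked into one boxed list. */
adw_preferences_group_set_separate_rows (group, TRUE);
adw_preferences_group_add (group, row);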
Multi-layout view
Libadwaita 1.4 introduced AdwBreakpoint, which made it easy to set properties on window size changes. However, a lot of apps need layout changes that can't be expressed via simple properties - say, switching between a sidebar and a bottom sheet. While it is possible to do this programmatically anyway, it's fairly involved and not a lot of apps went to those lengths.
Back then I also mentioned a future widget for automatically reparenting children between different layouts, and now it's finished and available for use as AdwMultiLayoutView.
It has changed somewhat since the prototype, e.g. it doesn't dynamically create or destroy layouts anymore, just parents/unparents them, but the gist is still the same:
- Put multiple AdwLayouts into a multi-layout view
- Put one or more AdwLayoutSlots into each layout, give them IDs
- Define children matching those IDs
Then those children will be placed into the slots for the current layout. When you switch the layout, they will be reparented into slots from that layout instead.
So now it's possible to define completely different layouts for desktop and mobile entirely via UI files.
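In code, the same idea might look roughly like this (a sketch; the constructor and setter names are assumptions based on the steps above, and build_desktop_layout() and content_widget are hypothetical):

GtkWidget *view = adw_multi_layout_view_new ();

/* build_desktop_layout () returns a widget tree containing an
 * AdwLayoutSlot with the id "content". */
AdwLayout *desktop = adw_layout_new (build_desktop_layout ());
adw_layout_set_name (desktop, "desktop");
adw_multi_layout_view_add_layout (ADW_MULTI_LAYOUT_VIEW (view), desktop);
/* ... likewise for a "mobile" layout ... */

/* The child is defined once and reparented into whichever layout is active. */
adw_multi_layout_view_set_child (ADW_MULTI_LAYOUT_VIEW (view), "content", content_widget);

/* Typically flipped from an AdwBreakpoint setter: */
adw_multi_layout_view_set_layout_name (ADW_MULTI_LAYOUT_VIEW (view), "mobile");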
CSS variables and colors
I've already talked about this in a lot of detail in my last blog post, but GTK has a lot of new CSS goodies, and libadwaita 1.6 makes full use of them.
To recap: GTK now supports CSS variables and color-mix(), relative colors, and new color spaces, most importantly Oklab and Oklch.
Libadwaita now provides CSS variables for all of its old named colors, with a docs page to go with them, as well as new variables: --dim-opacity, --disabled-opacity, --border-opacity and --window-radius.
This also made it possible to have a matching focus ring color on .destructive-action buttons, as well as matching accent colors for the .error, .warning and .success style classes. And because overriding the accent color for a specific widget is now possible, the .opaque button style class has been deprecated in favor of overriding accent colors on .suggested-action. Meanwhile, the white accent color of .osd is now more reliable and automatically works for custom widgets, instead of trying (and often failing) to manually override it for every standard widget.
I mentioned that it might be possible to generate standalone accent/error/etc. colors from their respective background colors. However, the question was how to make that automatic, so at the time we didn't actually integrate that. Now it is integrated, though it's not completely automatic - only for :root.
Specifically, there's a new variable: --standalone-color-oklab, corresponding to the correct color transformation for the current style.
So, when overriding accent color for a specific widget, there is a bit of boilerplate to copy:
my-widget {
  --accent-bg-color: var(--accent-purple);
  --accent-color: oklab(from var(--accent-bg-color) var(--standalone-color-oklab));
}
It's still an improvement over calculating the color manually, both for light and dark styles (which a lot of apps didn't do at all, resulting in poor contrast), so still worth it. Maybe one day we'll be able to make it completely automatic - e.g. by ensuring that using variables with wildcards doesn't regress performance.
Meanwhile, adw_rgba_to_standalone() allows doing the same thing programmatically.
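For example (a sketch assuming the function takes the background color, a dark-style flag and an out parameter):

GdkRGBA bg, standalone;

/* Derive a standalone (foreground-suitable) color from a background color. */
gdk_rgba_parse (&bg, "#9141ac"); /* e.g. the purple accent */
adw_rgba_to_standalone (&bg, TRUE /* dark style */, &standalone);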
Accent colors
Another big feature is system accent color support. While it's not a strictly libadwaita change, this is the developer-facing part, so it makes sense to talk about it here.
Behind the scenes it's using the settings portal, which provides a standardized key for the system accent color. Many other environments support it as well, so libadwaita apps will follow their accent color preferences too, while non-GNOME apps that follow the preference will follow it on GNOME too. Note that while the portal exposes arbitrary sRGB colors, libadwaita will pick the closest color from a list of nine predefined colors. This is done in the Oklch color space, mostly based on hue, so it should work even for really dull colors.
Accent colors are also supported when running on Windows and macOS, and like with the color scheme and high contrast, the libadwaita page in GTK inspector allows to toggle the system accent color now.
Apps are still free to set their own accent color. CSS always takes priority over the system accent.
A lot of people helped push this over the finish line, with particular thanks to Jamie Murphy, kramo and Jamie Gravendeel.
API
AdwStyleManager provides new properties for fetching the system color - :accent-color and :accent-color-rgba - as well as :system-supports-accent-colors for querying whether the system has an accent color preference, same as for the color scheme.
The :accent-color property returns a color from the AdwAccentColor enum, so that individual colors can be special-cased (say, when using bitmap assets). This color can be converted both to a background color RGBA (using adw_accent_color_to_rgba()) and to a standalone color (adw_accent_color_to_standalone_rgba()).
All of these colors use white foreground color, so there's no API for fetching it, at least for now.
Note that :accent-color-rgba will still return the system color even if the app overrides its accent color using CSS. It only exists for convenience and is equivalent to calling adw_accent_color_to_rgba() on the :accent-color value.
While we still don't have a general replacement for the deprecated gtk_style_context_lookup_color(), the new accent color API can replace at least some of its uses.
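Querying the system accent color could look like this (a sketch; the dark-style flag on the standalone conversion is an assumption):

AdwStyleManager *manager = adw_style_manager_get_default ();

if (adw_style_manager_get_system_supports_accent_colors (manager)) {
  AdwAccentColor accent = adw_style_manager_get_accent_color (manager);
  GdkRGBA bg, standalone;

  /* Background color plus a matching standalone color. */
  adw_accent_color_to_rgba (accent, &bg);
  adw_accent_color_to_standalone_rgba (accent, TRUE /* dark */, &standalone);
}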
On the CSS side, there are new variables corresponding to each accent color: --accent-blue for blue and so on. Additionally, every system color, along with its standalone colors for both light and dark, is documented and can be used as a reference.
Destructive buttons
Having an accent color that's not always blue means having to rethink other style choices. In particular, .destructive-action buttons were just a red version of .suggested-action, same as in GTK3. This was already questionable from an accessibility perspective, but breaks entirely with accent colors, since suggested buttons would look exactly the same as destructive ones with a red accent. And so .destructive-action has a distinct style now, less prominent than suggested.
Alert dialogs
Another area that needed updates was AdwAlertDialog - it was also using color for differentiating suggested and destructive buttons.
Coincidentally, the alert dialog style went almost unchanged from GTK3 days, and looked rather out of place with the rest of the platform. So kramo came up with an updated design.
AdwMessageDialog and GtkAlertDialog received the same style, or at least an approximation - it's not possible to replicate it entirely in GTK dialogs. Even though neither is recommended for use (when using libadwaita, anyway - nothing wrong with using GtkAlertDialog in plain GTK), regressing apps that aren't fully up to date with the platform wouldn't be very good.
Adapting apps
Accent colors are supported automatically, and in most cases apps don't need any changes to make use of them. However, here's a checklist to ensure it works well:
- Make use of the accent color variables in custom CSS, like --accent-bg-color. Using the old named colors like @accent_bg_color works as well. Don't assume the accent color will be blue.
- Conversely, don't use the accent color when you mean blue. We have variables like --blue-3 for that - or even --accent-blue.
- When using the accent color in custom drawing (say, drawing a graph), make sure to redraw it when the AdwStyleManager:accent-color value changes - same as for color scheme and high contrast (see the sketch after this list).
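A minimal sketch of that last point, with drawing_area standing in for your custom-drawn widget:

/* Queue a redraw whenever the accent color changes; G_CONNECT_SWAPPED
 * passes the widget as the first parameter to the callback. */
g_signal_connect_object (adw_style_manager_get_default (),
                         "notify::accent-color",
                         G_CALLBACK (gtk_widget_queue_draw),
                         drawing_area,
                         G_CONNECT_SWAPPED);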
Deprecations
Last cycle we introduced new dialog widgets that are based on AdwDialog rather than GtkWindow. However, that happened right at the end of the cycle, without giving apps a lot of time to port their existing dialogs. Because of that, the old widgets (AdwMessageDialog, AdwPreferencesWindow, AdwAboutWindow) weren't deprecated, and I mentioned that they would be deprecated in future instead. So, they are now.
If you haven't migrated to the new dialogs yet, see the migration guide for how to do so.
Other changes
As always, there are smaller changes that don't warrant their own sections, so let's look at those:
- AdwWindow and AdwApplicationWindow now have a default minimum size (360×200 px), meaning you don't have to set it manually to use breakpoints or dialogs anymore. Apps can still override it if they need a different size, but it works out of the box now.
- AdwComboRow now has the :header-factory and :search-match-mode properties, following GtkDropDown. So it's now possible to, say, have separators in the dropdown list.
- AdwEntryRow got the :max-length property, matching GtkEntry.
- AdwPreferencesPage descriptions can now be centered, using the :description-centered property.
- Documentation now lists available style classes for each widget, in addition to the centralized list of style classes.
- Markus Göllnitz made the .navigation-sidebar style class support GtkFlowBox and GtkGridView, as seen in Papers.
- Property rows now support the .monospace style class.
- GtkTextView now supports the .inline style class, removing its background and resetting its foreground color. This allows using it in contexts like cards.
Future
As usual, there are changes that didn't make it this cycle and will land the next cycle instead. Most notably, the old toggle groups branch by Maximiliano is finally finished and will land early next cycle.
Big thanks to STF for funding a lot of this work (GTK CSS improvements, bottom sheets, finishing multi-layout view and toggle groups, general maintenance), as well as people organizing the initiative and all contributors who made this release happen.
13 Sep 2024 3:32pm GMT
06 Sep 2024
planet.freedesktop.org
Mike Blumenkrantz: Architechair
What Am I Even Doing
It was some time ago that I created my first MR touching WSI stuff.
That was also the first time I broke Mesa.
Did I learn anything?
The answer is no, but then again it would have to be given the topic of this sleep-deprived post.
Maybe I'm The Problem
WSI has a lot of issues, but most of them stem from its mere existence. If people stopped wanting to see the triangles, we would all have easier lives and performance would go through the fucking roof. That's ignoring the raw sweat and verbiage dedicated to insane ideas like determining the precise time at which the triangles should be made visible on a display or literally how are colors.
I'm nowhere near as smart as the people arguing about these things: I'm the guy who plays jenga with the tower constructed from popsicle sticks, marshmallow fluff, and wishful thinking. That's why, a while ago, I declared war on DRI interfaces and then also definitely won that war without any issues. In fact, it works perfectly.
But why did I embark upon this journey which required absolutely no fixups?
The answer lies in architecture. In the before-times, DRI (a massively overloaded acronym that no longer means anything) allowed Xorg to plug directly into Mesa to utilize hardware-accelerated rendering. It sidestepped the GL API in favor of a contract with Mesa that certain API would never change. And that was great for Xorg since it provided an optimal path to do xserver stuff. But it was (eventually) terrible for Mesa.
Renegotiation
When an API contract is made, it remains binding forever. A case when the contract is broken is called a Bug Report. Mesa has no bugs, however, except for the ones I didn't cause, and so this DRI contract that enables Xorg to shortcut more sensible APIs like EGL remains identical to this day, decades later. What is not identical, however, is Mesa.
In those intervening years, Mesa has developed into an entire ecosystem for driver development and other, less sane ideas. Gallium was created and then became the only method for implementing GL drivers. EGL and GBM are things now. But still, that DRI contract remains binding. Xorg must work. Like that one reviewer who will suggest changes for every minuscule flaw in your stupid, idiotic, uneducated, cretinous abuse of whitespace, it is not going away.
DRIL was the method by which Mesa could finally unshackle itself. The only parts of DRI still used by Xorg are for determining rendertarget capabilities, effectively eglGetConfigs. So @ajax and I punted out the relevant API into a stub which is mostly just a wrapper around eglGetConfigs. This enabled change and cleanup in every part of the codebase that was previously immutable.
Bidirectional Hell
As anyone who has tried to debug Mesa's DRI frontend knows, it sucks. It's one of the worst pieces of code to debug. A significant reason for this is (was) how the DRI callback system perpetuated circular architecture.
At the time of DRIL's merge, a user of GLX/EGL/GBM would engage with this sort of control flow:
1. GLX/EGL/GBM API call
2. direct API internals
3. function pointer into gallium/frontends/dri
4. DRI frontend
5. function pointer back to GLX/EGL/GBM
6. <loop back to 2 until operation completes>
7. return to user
In terms of functionality, it was functional. But debugging at a glance was impossible, and trying to eyeball any execution path required the type of PhD held by fewer than five people globally. The cyclical back-and-forth function pointering was a vertical cliff of a learning curve for anyone who didn't already know how things worked, and even things as "simple" as eglInitialize went through several impenetrable cycles of idiot-looping to determine success or failure. The absolute state of it made adding new features a nightmarish and daunting prospect, and reviewing any changes had, at best, even odds of breaking things because of how difficult it is to test this stuff.
Better Now?
Maybe.
The juiciest refactoring is over, and now function pointering only occurs when the DRI frontend needs to access API-specific data for its drawables. It's actually possible to follow execution just by reading the code. Not that it's necessarily easy, but it's possible.
There's still a lot of work to be done here. There are still some corner case bugs with DRIL, there are probably EGL issues that have yet to be discovered because much of that code is still fairly opaque, and half the codebase is still prefixed with dri2_.
At the least, I think it's now possible to work on WSI in Mesa and have some idea what's going on. Or maybe I've just been down in the abyss for so long that I'm the one staring back.
Onward
I've been cooking. I mean like really cooking. Expect big things related to the number 3 later this month.
* UPDATE: At the urging of my legal team, I've been advised to mention that no part of this post, blog, or site has any association with, bearing on, or endorsement from Half Life 3.
06 Sep 2024 12:00am GMT
04 Sep 2024
planet.freedesktop.org
Tvrtko Ursulin: DRM scheduling cgroup controller
Introduction
The topic of a Direct Rendering Manager (DRM) cgroup controller is something which has been proposed a few times in the past, but so far is still missing from the Linux graphics stack. Some of those attempts were focusing on controlling the GPU memory usage aspect, while some were concerned with scheduling. As I am continuing to explore this area as part of my work at Igalia, in this post we will discuss one possible way of implementing the latter.
The general problem statement which we are trying to address is the fact that many GPUs (and their respective kernel drivers) can simultaneously schedule workloads from different clients, and that there are use-cases where having external control over scheduling decisions would be beneficial.
But first, to clarify what we mean by "external control": by that term we refer to the scheduling decisions being influenced from outside of the actual process doing the rendering. If we were to draw a parallel to CPU scheduling, that would be the difference between a process (or a thread) issuing a system call such as setpriority(2) or nice(2) itself ("internal control"), versus its scheduling priority being modified by an external entity such as the user issuing the renice(1) shell command, launching the executable via the nice(1) shell command, or even using the CPU scheduling cgroup controller ("external control").
This has two benefits. Firstly, it is the user who typically knows which tasks are higher priority and which should run in the background, isolated as much as possible from starving the foreground tasks of resources. Secondly, external control can be applied to any process in a unified manner, without the need for applications to individually expose the means to control their scheduling priority.
If we now return to the world of GPU scheduling, we find ourselves in a landscape where internal scheduling control is possible with many GPU drivers, but external control is not. To improve on that there are some technical and conceptual challenges, because GPUs are not as nice and uniform in their scheduling needs and capabilities as CPUs are, but if we could come up with something reasonable, even if not perfect, it could bring improvements to the user experience in a variety of scenarios.
Past attempts - Priority based controllers
The earliest attempt I can remember was from 2018, by Matt Roper[1], who proposed to implement a driver-specific priority based controller. The RFC limited itself to i915 (kernel driver for Intel GPUs) and, although the priority-based setup is well established in the world of CPU scheduling, and it is easy to understand its effects, the proposal did not gain much traction.
Because of the aforementioned advantages, when I proposed my version of the controller in 2022[2], it also included a slightly different version of a priority-based controller. In contrast to the earlier one, this proposal was in principle driver-agnostic and the priority levels were also abstracted.
The proposal was also accompanied by benchmark results showing that the approach was effective in allowing users on Linux to launch GPU tasks in the background, while leaving more GPU bandwidth to the foreground task than when not using the controller. Similarly on ChromeOS, when wired into the focused versus un-focused window cgroup management, it was able to demonstrate relatively more GPU time given to the foreground window.
Current proposal - Weight based controller
Anticipating the potential lack of sufficient support for this approach the same RFC also included a second controller which takes a different route. It abstracts things one step further and implements a weight based controller based on GPU utilisation[3].
The basic idea is that the GPU time budget is split based on relative group weights across the cgroup hierarchy, and that the controller notifies the individual DRM drivers when their clients are over budget. From there it is left for the individual drivers to know how to best manage this situation, depending on the specific scheduling capabilities of the driver and the GPU hardware.
The user interface completely mimics the existing CPU and IO cgroup controllers with the single drm.weight control file. The weights carry no absolute meaning and are only relative within a single group of siblings. Their only purpose is to split the time budget between them.
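As a sketch of what that looks like from user space (assuming a cgroup v2 hierarchy mounted at /sys/fs/cgroup and an illustrative "game" group):

#include <stdio.h>

int main (void)
{
    /* Weights are relative among sibling groups, as with cpu.weight. */
    FILE *f = fopen ("/sys/fs/cgroup/game/drm.weight", "w");

    if (!f)
        return 1;
    fprintf (f, "%d\n", 100);
    fclose (f);
    return 0;
}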
Visually one potential cgroup configuration could look like this:
The DRM cgroup controller then executes a periodic scanning task which queries each DRM client for its GPU usage and notifies drivers when clients are over their allocated budget.
If we expand the concept with runtime adjustment of group weights based on window focus status, with two graphically active clients such as a game and a web browser, we can end up with the following two scenarios:
Here we show the actual GPU utilisation of each group together with their drm.weight. On the left hand side the web browser is the focused window, with the weights 100-to-10 in its favour.
The compositor is not using its full 200 / (200 + 100) share, so a portion is passed on to the desktop group, to the extent of the full 80% required. Inside the desktop group the game is currently using 70%, while its actual allocation is 80% * (10 / (100 + 10)) = 7.27%. It is therefore currently consuming more than its budget, and the corresponding DRM driver will be notified by the controller and will be able to do something about it.
After the user has given focus to the game window, relative weights will be adjusted and so will the budgets. Now the web browser will be over budget and therefore it can be throttled down, limiting the effect of its background activity on the foreground game window.
First driver implementation - i915
Back when I started developing this idea Intel GPU's were my main focus, which is why i915 was the first driver I wired up with the controller.
There I implemented a rather simple approach of dynamically adjusting the scheduling priority of the throttled contexts, by an amount proportional to how much the client is over budget in relative terms.
The implementation would also cross-check against the physical engine utilisation, since in i915 we have easy access to that metric, and only throttle if the latter is close to being fully utilised. (Why this makes sense could be an interesting digression, relating to the fact that a single cgroup can in theory contain multiple GPUs and multiple clients using a mix of those GPUs. But let's leave that for later.)
One of the scenarios I used to test how well this works is to run two demanding GPU clients, each in its own cgroup, tweak their relative weights, and see what happens. The results were encouraging and are shown in the following table.
We can see that, when a client's group weight was decreased, the GPU bandwidth it was receiving also went down, as a consequence of the lowered context priority after receiving the over-budget notification.
This is a suitable moment to mention how the DRM cgroup controller does not promise perfect control, that is, achieving the actual GPU sharing ratios as expressed by group-relative weights. As we have mentioned before, GPU scheduling is not nearly at the same level of quality and granularity as in the CPU world, so the goal it sets is simply to improve things - do something which has a positive impact on user experience. At the same time, the mechanism and control interface proposed does not preclude individual drivers doing as good job as they can. Or even a future possibility of replacing the inner workings with a controller with something smarter, with no need to change the user space control interface.
Going back to the initial i915 implementation, the second test I did was attempting to wire it up with the background/foreground window focus handling in ChromeOS. There I experimented with a game (Android VM) running in parallel with a WebGL demo in a browser. At a certain point after both clients were running I lowered the weight of the background game, and the FPS metric in the browser jumped up.
This illustrates how having the controller can indeed improve the user experience. The user's focus will be at the foreground window and therefore it does make sense to prioritise GPU access for that client, for better interactiveness and smoother rendering there. In fact, in this example the actual FPS jumped from around 48-49 to 60fps, meaning that throttling the background client allowed the foreground one to match its rendering to the display's refresh rate.
Second implementation - amdgpu
AMD's kernel module was the next interesting driver which I wired up with the controller.
The fact that its scheduling is built on top of the DRM scheduler with only three distinct priority levels mandated a different approach to throttling. We keep a sorted list of "most offending" clients (most out of budget, or most borrowed unused budget from the sibling group), with the idea that the top client on that list gets throttled by lowering its scheduling priority. That was relatively straightforward to implement and sounded like it could potentially satisfy the most basic use case of background task isolation.
To test the runtime behaviour we set up two sibling cgroups and vary their relative scheduling weights. In one cgroup we run glxgears with vsync turned off and log its frame rate over time, while in the second group we run glmark2.
Let us first have a look at how the glxgears frame rate varies during this test, depending on three different scheduling weight ratios between the cgroups. The scheduling weight ratio is expressed as glxgears:glmark2, i.e. 10:1 means the glxgears scheduling weight was ten times that configured for glmark2.
We can observe that, as glmark2 progresses through its various sub-benchmarks, the glxgears frame rate changes too. But it was overall higher in the runs where the scheduling weight ratio was in its favour. That is a positive result, showing that even a simple implementation seems to have the desired effect, at least to some extent.
For the second test we can look from the perspective of glmark2, checking how the benchmark scores change depending on the ratio of scheduling weights.
Again we see that the scores are generally improving when the scheduling weight ratio is increased in favour of the benchmark.
However, in neither case is the change in results proportional to the actual ratios. This is because the primitive implementation is not able to precisely limit the "background" client, but is only able to achieve some throttling. Also, there is an inherent delay in how fast the controller can react, given that the control loop is based on periodic scanning. This period is configurable and was set to two seconds for the above tests.
Conclusion
Hopefully this write-up has managed to demonstrate two main points:
- First, that a generic and driver-agnostic approach to a DRM scheduling cgroup controller can improve user experience and enable new use cases, while at the same time following the established control interface as it exists for CPU and IO control, which makes it future-proof and extendable;
- Secondly, that even relatively basic driver implementations can be somewhat effective in providing positive control effects.
It also probably needs to be re-iterated that neither the driver implementations nor the cgroup controller implementation itself are limited by the user interface proposed. Both could be independently improved under the hood in the future.
What is next? There is more work to be done such as conducting more detailed testing, polishing the implementation and potentially attempting to wire up more drivers to the controller. Further advocacy work in the DRM community too.
References
04 Sep 2024 12:00am GMT