18 Sep 2020


Mike Blumenkrantz: Long Week

Blog Returns

Once again, I ended up not blogging for most of the week. When this happens, there's one of two possibilities: I'm either taking a break or I'm so deep into some code that I've forgotten about everything else in my life including sleep.

This time was the latter. I delved into the deepest parts of zink and discovered that the driver is, in fact, functioning only through a combination of sheer luck and a truly unbelievable amount of driver stalls that provide enough forced synchronization and slow things down enough that we don't explode into a flaming mess every other frame.


I've fixed all of the crazy things I found, and, in the process, made some sizable performance gains that I'm planning to spend a while blogging about in considerable depth next week.

And when I say sizable, I'm talking in the range of 50-100% fps gains.

But it's Friday, and I'm sure nobody wants to just see numbers or benchmarks. Let's get into something that's interesting on a technical level.


Yes, samplers.

In Vulkan, samplers have a lot of rules to follow. Specifically, I'm going to be examining part of the spec that states "If a VkImageView is sampled with VK_FILTER_LINEAR as a result of this command, then the image view's format features must contain VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT".

This is a problem for zink. Gallium gives us info about the sampler in the struct pipe_context::create_sampler_state hook, but the created sampler won't actually be used until draw time. As a result, there's no way to know which image is going to be sampled, and thus there's no way to know what features the sampled image's format flags will contain. This only becomes known at the time of draw.

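The way I saw it, there were two options:

1. Create the Vulkan sampler lazily at draw time, once the sampled image is known, picking LINEAR or NEAREST based on that image's format features.
2. Create both a LINEAR and a NEAREST sampler up front whenever gallium requests LINEAR, and pick between them at draw time.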

In theory, the first option is probably more performant in the best case scenario where a sampler is only ever used with a single image, as it would then only ever create a single sampler object.

Unfortunately, this isn't realistic. Just as an example, u_blitter creates a number of samplers up front, and then it also makes assumptions about filtering based on ideal operations which may not be in sync with the underlying Vulkan driver's capabilities. So for these persistent samplers, the first option may initially allow the sampler to be created with LINEAR filtering, but it may later then be used for an image which can't support it.

So I went with the second option. Now any time a LINEAR sampler is created by gallium, we're actually creating both types so that the appropriate one can be used, ensuring that we can always comply with the spec and avoid any driver issues.
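
As a rough illustration of the "create both, pick at draw time" idea, here is a minimal sketch. It is not zink's code: the types, the handle counter and the function names are hypothetical stand-ins, and the feature bit is a local placeholder for VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT so that the example builds without the Vulkan headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; real code would use VkFilter, VkSampler and
 * VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT from <vulkan/vulkan.h>. */
enum filter { FILTER_NEAREST = 0, FILTER_LINEAR = 1 };
#define FEATURE_SAMPLED_IMAGE_FILTER_LINEAR (1u << 12)

/* Hypothetical sampler state holding one handle per filter mode. */
struct sampler_state {
   uint64_t linear;   /* LINEAR variant, 0 if LINEAR was never requested */
   uint64_t nearest;  /* NEAREST variant, always created */
};

/* Create time: when LINEAR is requested, build both variants up front. */
static struct sampler_state
create_sampler_state(enum filter requested, uint64_t *next_handle)
{
   struct sampler_state s = {0};
   s.nearest = (*next_handle)++;
   if (requested == FILTER_LINEAR)
      s.linear = (*next_handle)++;
   return s;
}

/* Draw time: the image is finally known, so pick the variant that its
 * format features allow. */
static uint64_t
select_sampler(const struct sampler_state *s, uint32_t format_features)
{
   assert(s->nearest != 0);
   if (s->linear && (format_features & FEATURE_SAMPLED_IMAGE_FILTER_LINEAR))
      return s->linear;
   return s->nearest;
}
```

The cost of this approach is one extra sampler object per LINEAR request; in exchange, the draw-time decision is a single bit test.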


18 Sep 2020 12:00am GMT

14 Sep 2020


Mike Blumenkrantz: Draw Parameters

Hoo Boy

Let's talk about ARB_shader_draw_parameters. Specifically, let's look at gl_BaseVertex.

In OpenGL, this shader variable's value depends on the parameters passed to the draw command, and the value is always zero if the command has no base vertex.

In Vulkan, the value here is only zero if the first vertex is zero.

The difference here means that for arrayed draws without base vertex parameters, GL always expects zero, and Vulkan expects first vertex.
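
To make the mismatch concrete, the two rules can be modeled as plain functions. This is only a sketch: struct draw_params and the function names are hypothetical, not gallium or Vulkan types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified draw description. */
struct draw_params {
   bool indexed;          /* true for indexed (element) draws */
   int32_t first_vertex;  /* start of the range for non-indexed draws */
   int32_t vertex_offset; /* base vertex added to each index for indexed draws */
};

/* What a GL shader expects to read from gl_BaseVertex. */
static int32_t
gl_expected_base_vertex(const struct draw_params *d)
{
   return d->indexed ? d->vertex_offset : 0;
}

/* What Vulkan supplies for the same draw. */
static int32_t
vk_supplied_base_vertex(const struct draw_params *d)
{
   return d->indexed ? d->vertex_offset : d->first_vertex;
}

/* True when the two APIs disagree and the value has to be fixed up. */
static bool
needs_fixup(const struct draw_params *d)
{
   return gl_expected_base_vertex(d) != vk_supplied_base_vertex(d);
}
```

Under this model, only non-indexed draws with a nonzero first vertex need a fixup.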



The easiest solution here would be to just throw a shader key at the problem, producing variants of the shader for use with indexed vs non-indexed draws, and using NIR passes to modify the variables for the non-indexed case and zero the value. It's quick, it's easy, and it's not especially great for performance since it requires compiling the shader multiple times and creating multiple pipeline objects.

This is where push constants come in handy once more.

Avid readers of the blog will recall the last time I used push constants was for TCS injection when I needed to generate my own TCS and have it read the default inner/outer tessellation levels out of a push constant.

Since then, I've created a struct to track the layout of the push constant:

struct zink_push_constant {
   unsigned draw_mode_is_indexed;
   float default_inner_level[2];
   float default_outer_level[4];
};

Now just before draw, I update the push constant value for draw_mode_is_indexed:

if (ctx->gfx_stages[PIPE_SHADER_VERTEX]->nir->info.system_values_read & (1ull << SYSTEM_VALUE_BASE_VERTEX)) {
   unsigned draw_mode_is_indexed = dinfo->index_size > 0;
   vkCmdPushConstants(batch->cmdbuf, gfx_program->layout, VK_SHADER_STAGE_VERTEX_BIT,
                      offsetof(struct zink_push_constant, draw_mode_is_indexed), sizeof(unsigned),
                      &draw_mode_is_indexed);
}

And now the shader can be made aware of whether the draw mode is indexed.

Now comes the NIR, as is the case for most of this type of work.

static bool
lower_draw_params(nir_shader *shader)
{
   /* only vertex shaders can read gl_BaseVertex */
   if (shader->info.stage != MESA_SHADER_VERTEX)
      return false;

   /* nothing to do if the shader never reads it */
   if (!(shader->info.system_values_read & (1ull << SYSTEM_VALUE_BASE_VERTEX)))
      return false;

   return nir_shader_instructions_pass(shader, lower_draw_params_instr, nir_metadata_dominance, NULL);
}

This is the future, so I'm now using Eric Anholt's recent helper function to skip past iterating over the shader's function/blocks/instructions, instead just passing the lowering implementation as a parameter and letting the helper create the nir_builder for me.

static bool
lower_draw_params_instr(nir_builder *b, nir_instr *in, void *data)
{
   if (in->type != nir_instr_type_intrinsic)
      return false;
   nir_intrinsic_instr *instr = nir_instr_as_intrinsic(in);
   if (instr->intrinsic != nir_intrinsic_load_base_vertex)
      return false;

I'm filtering out everything except for nir_intrinsic_load_base_vertex here, which is the instruction for loading gl_BaseVertex.

   b->cursor = nir_after_instr(&instr->instr);

I'm modifying instructions after this one, so I set the cursor after.

   nir_intrinsic_instr *load = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_push_constant);
   load->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
   nir_intrinsic_set_range(load, 4);
   load->num_components = 1;
   nir_ssa_dest_init(&load->instr, &load->dest, 1, 32, "draw_mode_is_indexed");
   nir_builder_instr_insert(b, &load->instr);

I'm loading the first 4 bytes of the push constant variable that I created according to my struct, which is the draw_mode_is_indexed value.

   nir_ssa_def *composite = nir_build_alu(b, nir_op_bcsel,
                                          nir_build_alu(b, nir_op_ieq, &load->dest.ssa, nir_imm_int(b, 1), NULL, NULL),
                                          &instr->dest.ssa,
                                          nir_imm_int(b, 0),
                                          NULL);

This adds a new ALU instruction of type bcsel, AKA the ternary operator (condition ? true : false). The condition here is another ALU of type ieq, AKA integer equals, and I'm testing whether the loaded push constant value is equal to 1. If true, this is an indexed draw, so I continue using the loaded gl_BaseVertex value. If false, this is not an indexed draw, so I need to use zero instead.

   nir_ssa_def_rewrite_uses_after(&instr->dest.ssa, nir_src_for_ssa(composite), composite->parent_instr);

With my bcsel composite gl_BaseVertex value constructed, I can now rewrite all subsequent uses of gl_BaseVertex in the shader to use the composite value, which will automatically swap between the Vulkan gl_BaseVertex and zero based on the value of the push constant without the need to rebuild the shader or make a new pipeline.

   return true;
}

And now the shader gets the expected value and everything works.
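
For readers who don't speak NIR, the combined effect of the push constant update and the lowering pass can be mimicked on the CPU. This is a sketch only: the byte array stands in for the push constant storage that vkCmdPushConstants writes, and push_draw_mode() and lowered_base_vertex() are hypothetical names, not zink functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout from the post. */
struct zink_push_constant {
   unsigned draw_mode_is_indexed;
   float default_inner_level[2];
   float default_outer_level[4];
};

/* Hypothetical stand-in for the push constant storage. */
static unsigned char push_constants[sizeof(struct zink_push_constant)];

/* Host side: mimic the partial update of a single struct member. */
static void
push_draw_mode(unsigned is_indexed)
{
   memcpy(push_constants + offsetof(struct zink_push_constant, draw_mode_is_indexed),
          &is_indexed, sizeof(is_indexed));
}

/* "Shader" side: what the lowered load_base_vertex evaluates to. */
static int32_t
lowered_base_vertex(int32_t vk_base_vertex)
{
   uint32_t draw_mode_is_indexed;

   /* the sketch assumes a 4-byte unsigned, matching the 4-byte load range */
   assert(sizeof(unsigned) == 4);

   /* load_push_constant at offset 0 with a range of 4 bytes */
   memcpy(&draw_mode_is_indexed, push_constants, 4);

   /* bcsel(ieq(draw_mode_is_indexed, 1), base_vertex, 0) */
   return draw_mode_is_indexed == 1 ? vk_base_vertex : 0;
}
```

Calling push_draw_mode(0) before a non-indexed draw makes lowered_base_vertex() return zero regardless of what Vulkan supplies, which is the value GL expects.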

Billy Mays

It's also worth pointing out here that gl_DrawID from the same extension has a similar problem: gallium doesn't pass multidraws in full to the driver, instead iterating for each draw, which means that the shader value is never what's expected either. I've employed a similar trick to jam the draw index into the push constant and read that back in the shader to get the expected value there too.


14 Sep 2020 12:00am GMT

10 Sep 2020


Adam Jackson: worse is better: making late buffer swaps tear

In an ideal world, every frame your application draws would appear on the screen exactly on time. Sadly, as anyone living in the year 2020 CE can attest, this is far from an ideal world. Sometimes the scene gets more complicated and takes longer to draw than you estimated, and sometimes the OS scheduler just decides it has more important things to do than pay attention to you.

When this happens, for some applications, it would be best if you could just get the bits on the screen as fast as possible rather than wait for the next vsync. The Present extension for X11 has an option to let you do exactly this:

If 'options' contains PresentOptionAsync, and the 'target-msc'
is less than or equal to the current msc for 'window', then
the operation will be performed as soon as possible, not
necessarily waiting for the next vertical blank interval.

But you usually don't use Present directly; rather, Present is the mechanism that GLX and Vulkan use to put bits on the screen. So, today I merged some code to Mesa to enable the corresponding features in those APIs, namely GLX_EXT_swap_control_tear and VK_PRESENT_MODE_FIFO_RELAXED_KHR. If all goes well, these should be included in Mesa 21.0, with a backport to 20.2.x not out of the question. As the GLX extension name suggests, this can introduce some visual tearing when the buffer swap does come in late, but for fullscreen games or VR displays that can be an acceptable tradeoff in exchange for reduced stuttering.
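
The rule quoted from the Present specification boils down to a small decision, sketched below. The enum and function names are hypothetical and only model the wording above; they are not part of the Present protocol, GLX, Vulkan or Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of scheduling one presentation request. */
enum present_time {
   PRESENT_AT_VBLANK,   /* synchronized to a vertical blank, no tearing */
   PRESENT_IMMEDIATELY  /* as soon as possible, may tear */
};

/* Model of the quoted rule: only a late request (target-msc <= current msc)
 * that also asked for async is allowed to skip waiting for vblank. */
static enum present_time
schedule_present(bool option_async, uint64_t target_msc, uint64_t current_msc)
{
   if (option_async && target_msc <= current_msc)
      return PRESENT_IMMEDIATELY;
   return PRESENT_AT_VBLANK;
}
```

A frame that arrives on time is therefore still synchronized; tearing can only appear on frames that were already late.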

10 Sep 2020 7:44pm GMT

10 Nov 2011

Planet GNOME

Brad Taylor: Reports of Snowy’s death have been greatly exaggerated

Browsing foundation-list recently, I was honored to see Snowy (and Tomboy Online) hosting mentioned as one of the GNOME CEO goals (scroll to the bottom) for 2010! Unfortunately, the pace of Snowy's development has slowed in the last few months, due in part to both Sandy's and my schedules. Despite that, we wouldn't want Stormy to get a bad reputation because of our slacking, so we're going to change that.

We're hosting an IRC meeting in the #snowy channel on irc.gimp.net on Saturday, 23 Jan 2010 at 11:00 AM EST (16:00 GMT, other time zones) to get ourselves organized, and to recruit your help.

So, if you are a graphic designer that wants to help beautify an awesome open source project, if you're a hacker who knows or wants to learn Django, or even if you're just interested in Snowy, stop on by!

See you there!

10 Nov 2011 4:55am GMT

Brad Taylor: Mono Accessibility 2.0 unleashed!

Today, I'm proud to announce the 2.0 release of the Mono Accessibility project. Spanning a year of intensive work and fixing over 500 bugs, this is truly our best release ever.

This release enables all types of users to access System.Windows.Forms and Silverlight applications from Linux using Orca and other ATK-based Assistive Technologies (ATs), as well as access Linux applications from UI Automation (UIA) based ATs.

What's changed since version 1.0?


What is Mono Accessibility:

The Mono Accessibility project enables Winforms and Silverlight applications to be fully accessible on Linux, and allows Assistive Technologies (ATs) like screen readers and test automation tools that depend on UI Automation APIs to work on Linux.

Mono Accessibility is released under the MIT/X11 license.

Get it!

Mono Accessibility is available for a variety of Linux distributions, including:

A Note About at-spi2

Accessing GTK+ applications with the UIA Client API requires the most recent development version of the new dbus-based at-spi2, which is known to cause system instability.

In Fedora, at-spi2 repeatedly causes GDM to segfault. If you do not need this feature, do not install the latest at-spi2 and atk, or our packages that depend on them (at-spi-sharp and AtspiUiaSource).

We are working hard to identify these issues and hope to aid the GNOME Accessibility Team in stabilizing at-spi2 in the near future.

Find out more

Navigate to our homepage for all the latest information, and ways to contact us.

10 Nov 2011 4:55am GMT

Calum Benson: What’s new on the Solaris 11 desktop?

This entry is cross-posted from my Oracle blog… clearly, seasoned GNOME blog readers will be less excited about GNOME 2.30, compiz and Firefox 6 than my audience over there, many of whom have been using GNOME 2.6 on Solaris 10 for the past 7 years :)

Much has been written today about the enterprise and cloud features of Oracle Solaris 11, which was launched today, but what's new for those of us who just like to have the robustness and security of Solaris on our desktop machines? Here are a few of the Solaris 11 desktop highlights:

Solaris 11 is free to download and use for most non-commercial purposes (but IANAL, so do check the OTN License Agreement on the download page first - it's short and sweet, as these things go), and you can download various flavours, including a Live CD and a USB install image, right here.

10 Nov 2011 12:08am GMT

09 Nov 2011

Planet KDE

Cool new stuff in CMake 2.8.6 (2): pkg-config compatible mode added for use e.g. with autotools

After introducing the automoc feature in my last blog, here comes the next part of this series. More will follow.

The new --find-package mode of CMake

Typically, in projects built with autotools or handwritten Makefiles, the tool pkg-config is used to find out whether and where a library used by the software is installed on the current system; it prints the corresponding command line options for the compiler to stdout.

Since CMake 2.8.6, CMake can also be used in such projects, in addition to or instead of pkg-config, to find installed libraries.

With version 2.8.6, CMake features the new command line flag --find-package. When called in this mode, CMake produces results compatible with pkg-config, and can thus be used in a similar way.

E.g. to get the compiler command line arguments for compiling an object file, it can be called like this:

   $ cmake --find-package -DNAME=LibXml2 -DLANGUAGE=C -DCOMPILER_ID=GNU -DMODE=COMPILE

To get the flags needed for linking, do

   $ cmake --find-package -DNAME=LibXml2 -DLANGUAGE=C -DCOMPILER_ID=GNU -DMODE=LINK
   -rdynamic -lxml2

As a result, the flags are printed to stdout, as you can see.

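The required parameters are NAME (the package name, as it would be passed to find_package()), LANGUAGE, COMPILER_ID and MODE (COMPILE or LINK in the examples above).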

So, you can insert calls like the above in your hand-written Makefiles.
For using CMake in autotools-based projects, you can use cmake.m4, which is now also installed by CMake.
This is used similarly to the pkg-config m4 macro, except that it uses CMake internally instead of pkg-config. So your configure.in could look something like this:

   PKG_CHECK_MODULES(XFT, xft >= 2.1.0, have_xft=true, have_xft=false)
   if test $have_xft = "true"; then


This will define the variables LibXml2_CFLAGS and LibXml2_LIBS, which can then be used in the Makefile.in/Makefiles.

What does that mean for developers of CMake-based libraries?

You no longer have to install pkg-config .pc files; just install a Config.cmake file for CMake, and CMake-based, autotools-based and any other projects can make use of your library without problems.
Documentation on how this is done can be found here:

What does that mean for developers working on e.g. autotools-based projects and using a project built with CMake?

Take a look at the cmake_find_package() m4 macro, installed since CMake 2.8.6 in share/aclocal/cmake.m4; it contains documentation and will help you use that library.
Thanks go to Matthias Kretz of Phonon fame, now working on HPC stuff, who wrote the cmake.m4 from scratch (which was necessary since it had to be BSD-licensed in order to be included in CMake).


Internally, CMake basically executes a find_package() with the given name, turns the results into command line options for the compiler and prints them to stdout.
This means it works for basically all packages for which a FindFoo.cmake file exists or which install a FooConfig.cmake file.
There is one issue, though: FindFoo.cmake files which execute try_compile() or try_run() commands internally are not supported, since this would require setting up and testing the compiler toolchain completely.
It works best for libraries which install a FooConfig.cmake file, since in these cases nothing has to be detected; all the information is already there.

All of this is still very new and has not yet seen wide real-world testing.
So, if you use it and find issues, or have suggestions on how to improve it, please let me know.


09 Nov 2011 9:15pm GMT

Debugging nepomuk/virtuoso’s CPU usage

There has been a lot of bug fixing for nepomuk and its indexing. However, you might still see high CPU usage. Reporting this is of little use unless you can at least give some information about what is happening.

So what you can do is query virtuoso's status. On openSUSE it works like this: first find the .ini file currently in use to get the port virtuoso is listening on, then connect to virtuoso, and finally query it for its status and running statements. The latter are unfortunately truncated, so I would appreciate a hint on how to get around that.

ps aux | grep virtuoso

finds /usr/bin/virtuoso-t +foreground +configfile /tmp/virtuoso_T18122.ini +wait

cat /tmp/virtuoso_T18122.ini | grep Port

finds ServerPort=1111

isql-vt -H localhost -S 1111 -U dba -P dba

which connects to virtuoso


which shows you some info and which queries are keeping the process busy. isql-vt is part of the virtuoso-server package but it might be that only recent packages have it compiled and older packages lack the tools.

Further you can do the following:

If you have any other hints regarding this piece of software feel free to mention them and I will add them to the post.

09 Nov 2011 3:10pm GMT

Help KDE e.V. secure funding for a sprint with just a few clicks

Some weeks ago Lydia blogged about a German bank giving away 1000 euros to each of the 1000 associations that get the most votes. Until four days ago we were at position 320; now we are at 735 and falling. Please read Lydia's post about how to vote and help KDE e.V.; it takes just a few clicks.

It surprises me that KDE e.V. has had only 3652 votes so far. If each person voted three times, which is allowed by the rules, that would amount to ~1218 people. Are there only 1218 people using KDE in the world?

Only the poll page is in German; the poll itself is not limited to German citizens. Anybody can vote, and according to the rules you can vote three times with your e-mail address.

09 Nov 2011 12:22pm GMT