29 Oct 2020


Mike Blumenkrantz: Invalidation


I've got a lot of exciting stuff in the pipe now, but for today I'm just going to talk a bit about resource invalidation: what it is, when it happens, and why it's important.

Let's get started.

What is invalidation?

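Resource invalidation occurs when the backing buffer of a resource is wholly replaced. Consider the following scenario under zink: a driver resource A wraps a Vulkan buffer, A.buffer. The application calls glBufferData(target, size, data, usage), which stores data to A.buffer, and then calls glBufferData(target, size, NULL, usage), which unsets that data.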

On a sane/competent driver, the second glBufferData call will trigger invalidation, which means that A.buffer will be replaced entirely, while A is still the driver resource used by Gallium to represent target.

When does invalidation occur?

Resource invalidation can occur in a number of scenarios, but the most common is when unsetting a buffer's data, as in the above example. The other main case for it is replacing the data of a buffer that's in use for another operation. In such a case, the backing buffer can be replaced to avoid forcing a sync in the command stream, which would stall the application's processing. There are some other cases for this as well, like glInvalidateFramebuffer and glDiscardFramebufferEXT, but the primary usage that I'm interested in is buffers.

Why is invalidation important?

The main reason is performance. In the above scenario without invalidation, the second glBufferData call will write null to the whole buffer, which is going to be much more costly than just creating a new buffer.
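To make the cost difference concrete, here is a minimal sketch in plain C rather than actual zink code. The names sketch_resource and sketch_invalidate are hypothetical, and heap memory stands in for the Vulkan buffer. The point is only that the resource keeps its identity while its storage is swapped, and that no byte of the old storage is written.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver resource; `storage` plays the role of A.buffer. */
struct sketch_resource {
   unsigned char *storage;
   size_t size;
};

/* Invalidate by swapping in fresh backing storage instead of overwriting the old one.
 * Returns 0 on success, or -1 if the allocation fails (the old storage is kept then). */
int
sketch_invalidate(struct sketch_resource *res)
{
   assert(res && res->size > 0);
   unsigned char *fresh = malloc(res->size);
   if (!fresh)
      return -1;
   /* assumption: nothing else uses the old storage; a real driver would defer
    * this release until the GPU has finished with the old buffer */
   free(res->storage);
   res->storage = fresh;
   return 0;
}
```

A caller holding a pointer to the sketch_resource never notices the swap, which mirrors how A stays the resource Gallium knows about while A.buffer changes.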

That's it

Now comes the slightly more interesting part: how does invalidation work in zink?

Currently, as of today's mainline zink codebase, we have struct zink_resource to represent a resource for either a buffer or an image. One struct zink_resource represents exactly one VkBuffer or VkImage, and there's some passable lifetime tracking that I've written to guarantee that these Vulkan objects persist through the various command buffers that they're associated with.

Each struct zink_resource is, as is the way of Gallium drivers, also a struct pipe_resource, which is tracked by Gallium. Because of this, struct zink_resource objects themselves cannot be invalidated in order to avoid breaking Gallium, and instead only the inner Vulkan objects themselves can be replaced.

For this, I created struct zink_resource_object, which is an object that stores only the data that directly relates to the Vulkan objects, leaving struct zink_resource to track the states of these objects. Their lifetimes are separate, with struct zink_resource being bound to the Gallium tracker and struct zink_resource_object persisting for either the lifetime of struct zink_resource or its command buffer usage, whichever is longer.
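The "whichever is longer" lifetime can be sketched with plain reference counting. This is an illustration under assumptions, not zink's implementation: sketch_object, sketch_res, and both helper functions are hypothetical names, and an integer id stands in for the Vulkan handle.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical: holds only what relates to the underlying storage. */
struct sketch_object {
   int refcount;
   int id; /* stands in for the VkBuffer handle */
};

/* Hypothetical: the long-lived, API-visible resource. */
struct sketch_res {
   struct sketch_object *obj;
};

/* Create an object holding one reference; returns NULL on allocation failure. */
struct sketch_object *
sketch_object_create(int id)
{
   struct sketch_object *obj = malloc(sizeof(*obj));
   if (obj) {
      obj->refcount = 1;
      obj->id = id;
   }
   return obj;
}

/* Point *dst at src while adjusting refcounts. The old object is freed on its
 * last unref. Returns 1 if the old object was destroyed, 0 otherwise. */
int
sketch_object_reference(struct sketch_object **dst, struct sketch_object *src)
{
   struct sketch_object *old = *dst;
   int destroyed = 0;
   if (src)
      src->refcount++;
   if (old && --old->refcount == 0) {
      free(old);
      destroyed = 1;
   }
   *dst = src;
   return destroyed;
}
```

If a command buffer holds its own reference, dropping the resource's reference leaves the object alive until the command buffer drops its reference too.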


The code for this mechanism isn't super interesting since it's basically just moving some parts around. Where it gets interesting is the exact mechanics of invalidation and how struct zink_resource_object can be injected into an in-use resource, so let's dig into that a bit.

Here's what the pipe_context::invalidate_resource hook looks like:

static void
zink_invalidate_resource(struct pipe_context *pctx, struct pipe_resource *pres)
{
   struct zink_context *ctx = zink_context(pctx);
   struct zink_resource *res = zink_resource(pres);
   struct zink_screen *screen = zink_screen(pctx->screen);

   if (pres->target != PIPE_BUFFER)
      return;

This only handles buffer resources, but extending it for images would likely be little to no extra work.

   if (res->valid_buffer_range.start > res->valid_buffer_range.end)
      return;

Zink tracks the valid data segments of its buffers. This conditional is used to check for an uninitialized buffer, i.e., one which contains no valid data. If a buffer has no data, it's already invalidated, so there's nothing to be done here.

   util_range_set_empty(&res->valid_buffer_range);

Invalidating means the buffer will no longer have any valid data, so the range tracking can be reset here.
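The valid-range bookkeeping used by these checks can be sketched as follows. This is a standalone approximation with hypothetical names (sketch_range and its helpers), not the Mesa utility itself. It assumes the convention implied by the conditional above: an inverted interval, where start is greater than end, means "no valid data".

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a valid-range tracker covering [start, end) in bytes. */
struct sketch_range {
   unsigned start;
   unsigned end;
};

/* An inverted interval (start > end) encodes "no valid data". */
void
sketch_range_set_empty(struct sketch_range *r)
{
   r->start = UINT_MAX;
   r->end = 0;
}

int
sketch_range_is_empty(const struct sketch_range *r)
{
   return r->start > r->end;
}

/* Grow the range to cover [start, end). This also works on an empty range,
 * because any start is <= UINT_MAX and any end is >= 0. */
void
sketch_range_add(struct sketch_range *r, unsigned start, unsigned end)
{
   assert(start <= end);
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}
```

With this encoding, the emptiness test and the reset are each a couple of integer operations, and adding a range needs no special case for an empty tracker.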

   if (!get_all_resource_usage(res))
      return;

If this resource isn't currently in use, unsetting the valid range is enough to invalidate it, so it can just be returned right away with no extra work.

   struct zink_resource_object *old_obj = res->obj;
   struct zink_resource_object *new_obj = resource_object_create(screen, pres, NULL, NULL);
   if (!new_obj) {
      debug_printf("new backing resource alloc failed!");
      return;
   }

Here's the old internal buffer object as well as a new one, created using the existing buffer as a template so that it'll match.

   res->obj = new_obj;
   res->access_stage = 0;
   res->access = 0;

struct zink_resource is just a state tracker for the struct zink_resource_object object, so upon invalidate, the states are unset since this is effectively a brand new buffer.

   zink_resource_rebind(ctx, res);

This is the tricky part, and I'll go into more detail about it below.

   zink_descriptor_set_refs_clear(&old_obj->desc_set_refs, old_obj);

If this resource was used in any cached descriptor sets, the references to those sets need to be invalidated so that the sets won't be reused.

   zink_resource_object_reference(screen, &old_obj, NULL);
}

Finally, the old struct zink_resource_object is unrefed, which will ensure that it gets destroyed once its current command buffer has finished executing.

Simple enough, but what about that zink_resource_rebind() call? Like I said, that's where things get a little tricky, but because of how much time I spent on descriptor management, it ends up not being too bad.

This is what it looks like:

void
zink_resource_rebind(struct zink_context *ctx, struct zink_resource *res)
{
   assert(res->base.target == PIPE_BUFFER);

Again, this mechanism is only handling buffer resources for now, and there's only one place in the driver that calls it, but it never hurts to be careful.

   for (unsigned shader = 0; shader < PIPE_SHADER_TYPES; shader++) {
      if (!(res->bind_stages & BITFIELD64_BIT(shader)))
         continue;
      for (enum zink_descriptor_type type = 0; type < ZINK_DESCRIPTOR_TYPES; type++) {
         if (!(res->bind_history & BITFIELD64_BIT(type)))
            continue;

Something common to many Gallium drivers is this idea of "bind history", which is where a resource will have bitflags set when it's used for a certain type of binding. While other drivers have a lot more cases than zink does due to various factors, the only thing that needs to be checked for my purposes is the descriptor type (UBO, SSBO, sampler, shader image) across all the shader stages. If a given resource has the flags set here, this means it was at some point used as a descriptor of this type, so the current descriptor bindings need to be compared to see if there's a match.
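The bind-history idea reduces to a bitmask with one bit per descriptor type. The sketch below is an illustration only: the enum, struct, and function names are hypothetical, and it assumes bits are set on bind and never cleared, which matches the "at some point used" wording but is not confirmed zink behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor types, mirroring the four named in the text. */
enum sketch_desc_type {
   SKETCH_UBO,
   SKETCH_SSBO,
   SKETCH_SAMPLER,
   SKETCH_IMAGE,
   SKETCH_DESC_TYPES
};

struct sketch_bind_state {
   uint64_t bind_history; /* one bit per descriptor type ever used */
};

/* Record that the resource was bound as `type`. */
void
sketch_mark_bound(struct sketch_bind_state *s, enum sketch_desc_type type)
{
   assert(type < SKETCH_DESC_TYPES);
   s->bind_history |= UINT64_C(1) << type;
}

/* Cheap early-out test: could this resource be bound as `type` at all? */
int
sketch_was_bound(const struct sketch_bind_state *s, enum sketch_desc_type type)
{
   return (s->bind_history & (UINT64_C(1) << type)) != 0;
}
```

A clear bit lets the rebind loop skip a whole descriptor type without inspecting any binding slots. A set bit only means the slots are worth checking.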

         uint32_t usage = zink_program_get_descriptor_usage(ctx, shader, type);
         while (usage) {
            const int i = u_bit_scan(&usage);

This is a handy mechanism that returns the current descriptor usage of a shader as a bitfield. So for example, if a vertex shader uses UBOs in slots 0, 1, and 3, usage will be 11 (binary 1011), and the loop will process i as 0, 1, and 3.
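The scan loop can be reproduced in standalone C. The helper below is a hypothetical, portable stand-in for a "find the lowest set bit, clear it, return its index" utility. It is written from scratch for illustration and is not copied from Mesa.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the index of the lowest set bit in *mask and
 * clear that bit. Precondition: *mask != 0. */
int
sketch_bit_scan(uint32_t *mask)
{
   int i = 0;
   assert(*mask != 0);
   while (!(*mask & (UINT32_C(1) << i)))
      i++;
   *mask &= *mask - 1; /* clears the lowest set bit */
   return i;
}

/* Collect the set bit positions of `usage` into `slots`, lowest first.
 * Returns how many positions were written. */
int
sketch_collect_slots(uint32_t usage, int slots[32])
{
   int n = 0;
   while (usage)
      slots[n++] = sketch_bit_scan(&usage);
   return n;
}
```

Calling sketch_collect_slots(11, slots) walks the mask 1011 and yields the slots 0, 1, and 3, so the loop body runs once per used slot instead of once per possible slot.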

            struct zink_resource *cres = get_resource_for_descriptor(ctx, type, shader, i);
            if (res != cres)
               continue;

Now the slot of the descriptor type can be compared against the resource that's being re-bound. If this resource is the one that's currently bound to the specified slot of the specified descriptor type, then steps can be taken to perform additional operations necessary to successfully replace the backing storage for the resource, mimicking the same steps taken when initially binding the resource to the descriptor slot.

            switch (type) {
            case ZINK_DESCRIPTOR_TYPE_SSBO: {
               struct pipe_shader_buffer *ssbo = &ctx->ssbos[shader][i];
               util_range_add(&res->base, &res->valid_buffer_range, ssbo->buffer_offset,
                              ssbo->buffer_offset + ssbo->buffer_size);
               break;
            }

For SSBO descriptors, the only change needed is to add the bound region to the buffer's valid range. This region is passed to the shader, so even if it's never written to, it might be, and so it can be considered a valid region.

            case ZINK_DESCRIPTOR_TYPE_SAMPLER_VIEW: {
               struct zink_sampler_view *sampler_view = zink_sampler_view(ctx->sampler_views[shader][i]);
               zink_descriptor_set_refs_clear(&sampler_view->desc_set_refs, sampler_view);
               zink_buffer_view_reference(ctx, &sampler_view->buffer_view, NULL);
               sampler_view->buffer_view = get_buffer_view(ctx, res, sampler_view->base.format,
                                                           sampler_view->base.u.buf.offset, sampler_view->base.u.buf.size);
               break;
            }

Sampler descriptors require a new VkBufferView be created since the previous one is no longer valid. Again, the references for the existing bufferview need to be invalidated now since that descriptor set can no longer be reused from the cache, and then the new VkBufferView is set after unrefing the old one.

            case ZINK_DESCRIPTOR_TYPE_IMAGE: {
               struct zink_image_view *image_view = &ctx->image_views[shader][i];
               zink_descriptor_set_refs_clear(&image_view->desc_set_refs, image_view);
               zink_buffer_view_reference(ctx, &image_view->buffer_view, NULL);
               image_view->buffer_view = get_buffer_view(ctx, res, image_view->base.format,
                                                         image_view->base.u.buf.offset, image_view->base.u.buf.size);
               util_range_add(&res->base, &res->valid_buffer_range, image_view->base.u.buf.offset,
                              image_view->base.u.buf.offset + image_view->base.u.buf.size);
               break;
            }

Images are nearly identical to the sampler case, the difference being that while samplers are read-only like UBOs (and therefore reach this point already having valid buffer ranges set), images are more like SSBOs and can be written to. Thus the valid range must be set here like in the SSBO case.

            default:
               break;
            }

Eagle-eyed readers will note that I've omitted a UBO case, and this is because there's nothing extra to be done there. UBOs will already have their valid range set and don't need a VkBufferView.


            invalidate_descriptor_state(ctx, shader, type);
         }
      }
   }
}

Finally, the incremental descriptor state hash for this shader stage and descriptor type is invalidated. It'll be recalculated normally upon the next draw or compute operation, so this is a quick zero-setting operation.
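The "invalidate now, recompute on next use" pattern can be sketched as a cached hash guarded by a validity flag. Everything here is hypothetical: the struct layout, the function names, and the use of FNV-1a as the hash are assumptions for illustration, since the post does not say how zink computes or stores its state hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stage, per-type descriptor state with a cached hash. */
struct sketch_desc_state {
   uint32_t hash;            /* cached hash of `bindings` */
   int valid;                /* 0 means the cache must be recomputed */
   uint32_t bindings[4];     /* stand-in for the bound descriptor data */
   unsigned recompute_count; /* bookkeeping for the example only */
};

/* FNV-1a, chosen here only because it is short; this is an assumption. */
static uint32_t
sketch_fnv1a(const void *data, size_t size)
{
   const unsigned char *p = data;
   uint32_t h = 2166136261u;
   for (size_t i = 0; i < size; i++) {
      h ^= p[i];
      h *= 16777619u;
   }
   return h;
}

/* Invalidation is a single store; no hashing happens here. */
void
sketch_invalidate_state(struct sketch_desc_state *s)
{
   s->valid = 0;
}

/* Called at draw/compute time: recompute only if the cache was invalidated. */
uint32_t
sketch_get_hash(struct sketch_desc_state *s)
{
   if (!s->valid) {
      s->hash = sketch_fnv1a(s->bindings, sizeof(s->bindings));
      s->valid = 1;
      s->recompute_count++;
   }
   return s->hash;
}
```

The cost of rehashing is paid at most once per draw, no matter how many rebinds happened in between.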


That's everything there is to know about the current state of resource invalidation in zink!

29 Oct 2020 12:00am GMT

28 Oct 2020


Adam Jackson: on abandoning the X server

There's been some recent discussion about whether the X server is abandonware. As the person arguably most responsible for its care and feeding over the last 15 years or so, I feel like I have something to say about that.

The thing about being the maintainer of a public-facing project for nearly the whole of your professional career is it's difficult to separate your own story from the project. So I'm not going to try to be dispassionate, here. I started working on X precisely because free software had given me options and capabilities that really matter, and I feel privileged to be able to give that back. I can't talk about that without caring about it.

So here's the thing: X works extremely well for what it is, but what it is is deeply flawed. There's no shame in that, it's 33 years old and still relevant, I wish more software worked so well on that kind of timeframe. But using it to drive your display hardware and multiplex your input devices is choosing to make your life worse.

It is, however, uniquely well suited to a very long life as an application compatibility layer. Though the code happens to implement an unfortunate specification, the code itself is quite well structured, easy to hack on, and not far off from being easily embeddable.

The issue, then, is how to get there. And I don't have any real desire to get there while still pretending that the xfree86 hardware-backed server code is a real thing. Sorry, I guess, but I've worked on xfree86-derived servers for very nearly as long as XFree86-the-project existed, and I am completely burnt out on that on its own merits, let alone doing that and also being release manager and reviewer of last resort. You can only apply so much thrust to the pig before you question why you're trying to make it fly at all.

So, is Xorg abandoned? To the extent that that means using it to actually control the display, and not just keep X apps running, I'd say yes. But xserver is more than xfree86. Xwayland, Xwin, Xephyr, Xvnc, Xvfb: these are projects with real value that we should not give up. A better way to say it is that we can finally abandon xfree86.

And if that sounds like a world you'd like to see, please, come talk to us, let's make it happen. I'd be absolutely thrilled to see someone take this on, and I'm happy to be your guide through the server internals.

28 Oct 2020 3:01pm GMT

24 Oct 2020


Mike Blumenkrantz: Catching Up

Never Seen Before

A rare Saturday post because I spent so much time this week intending to blog and then somehow not getting around to it. Let's get to the status updates, and then I'm going to dive into the more interesting of the things I worked on over the past few days.

Zink has just hit another big milestone that I've just invented: as of now, my branch is passing 97% of piglit tests up through GL 4.6 and ES 3.2, and it's a huge improvement from earlier in the week when I was only at around 92%. That's just over 1000 failure cases remaining out of ~41,000 tests. For perspective, a table.

               IRIS     zink-mainline   zink-wip
Passed Tests   43508    21225           40190
Total Tests    43785    22296           41395
Pass Rate      99.4%    95.2%           97.1%

As always, I happen to be running on Intel hardware, so IRIS and ANV are my reference points.

It's important to note here that I'm running piglit tests, and this is very different from CTS; put another way, I may be passing over 97% of the test cases I'm running, but that doesn't mean that zink is conformant for any versions of GL or ES, which may not actually be possible at present (without huge amounts of awkward hacks) given the persistent issues zink has with provoking vertex handling. I expect this situation to change in the future through the addition of more Vulkan extensions, but for now I'm just accepting that there's some areas where zink is going to misrender stuff.

What Changed?

The biggest change that boosted the zink-wip pass rate was my fixing 64bit vertex attributes, which in total had been accounting for ~2000 test failures.

Vertex attributes, as we all know since we're all experts in the graphics field, are the inputs for vertex shaders, and the data types for these inputs can vary just like C data types. In particular, with GL 4.1, ARB_vertex_attrib_64bit became a thing, which allows 64bit values to be passed as inputs here.

Once again, this is a problem for zink.

It comes down to the difference between GL's implicit handling methodology and Vulkan's explicit handling methodology. Consider the case of a dvec4 data type. Conceptually, this is a data type which is 4x64bit values, requiring 32bytes of storage. A vec4 uses 16bytes of storage, and this equates to a single "slot" or "location" within the shader inputs, as everything there is vec4-aligned. This means that, by simple arithmetic, a dvec4 requires two slots for its storage, one for the first two members, and another for the second two, each consuming a single 16byte slot.

When loading a dvec4 in GL(SL), a single variable with the first location slot is used, and the driver will automatically use the second slot when loading the second half of the value.

When loading a dvec4 in (SPIR)Vulkan, two variables with consecutive, explicit location slots must be used, and the driver will load exactly the input location specified.

This difference requires that for any dvec3 or dvec4 vertex input in zink, the value and also the load have to be split along the vec4 boundary for things to work.
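The slot arithmetic behind that split is small enough to write down. The sketch below uses hypothetical helper names and assumes only what the text states: a slot is 16 bytes, and a 64-bit vector fits two components per slot.

```c
#include <assert.h>

/* Bytes in one vec4-sized input slot. */
#define SKETCH_SLOT_BYTES 16u

/* Number of vec4 slots needed for a vector with `components` elements of
 * `bit_size` bits each, rounding up to whole slots. */
unsigned
sketch_slots_for_vector(unsigned components, unsigned bit_size)
{
   unsigned bytes = components * (bit_size / 8);
   assert(components >= 1 && components <= 4);
   return (bytes + SKETCH_SLOT_BYTES - 1) / SKETCH_SLOT_BYTES;
}

/* Components of a 64-bit vector that spill into the second slot
 * (0 if the vector fits in one slot). */
unsigned
sketch_second_half_components(unsigned components)
{
   return components > 2 ? components - 2 : 0;
}
```

A vec4 and a dvec2 both need one slot, while a dvec3 and a dvec4 need two, with one and two components respectively landing in the second slot. Those are exactly the types the pass has to split.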

Gallium already performs this split on the API side, allowing zink to already be correctly setting things up in the VkPipeline creation, so I wrote a NIR pass to fix things on the shader side.

Shader Rewriting

Yes, it's been at least a week since I last wrote about a NIR pass, so it's past time that I got back into that.

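Going into this, the idea is to perform the following operations within the vertex shader: find the deref instruction (A_deref) for the dvec3/4 input variable (A); create a second variable (B) in the next slot to hold the second half of A; clamp A and A_deref to a dvec2; create a deref instruction for B (B_deref); find the load instruction for A_deref (A_load) and clamp it to 2 components; create a second load (B_load) for the remaining components; assemble a new dvec3/4 (C_load) from A_load and B_load; and rewrite all subsequent uses of A_load to use C_load.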

Simple, right?

Here we go.

static bool
lower_64bit_vertex_attribs_instr(nir_builder *b, nir_instr *instr, void *data)
{
   if (instr->type != nir_instr_type_deref)
      return false;
   nir_deref_instr *A_deref = nir_instr_as_deref(instr);
   if (A_deref->deref_type != nir_deref_type_var)
      return false;
   nir_variable *A = nir_deref_instr_get_variable(A_deref);
   if (A->data.mode != nir_var_shader_in)
      return false;
   if (!glsl_type_is_64bit(A->type) || !glsl_type_is_vector(A->type) || glsl_get_vector_elements(A->type) < 3)
      return false;

First, it's necessary to filter out all the instructions that aren't what should be rewritten. As above, only dvec3 and dvec4 types are targeted here (dmat* types are reduced to dvec types prior to this point), so anything other than an A_deref of variables with those types is ignored.

   /* create second variable for the split */
   nir_variable *B = nir_variable_clone(A, b->shader);
   /* split new variable into second slot */
   B->data.driver_location++;
   nir_shader_add_variable(b->shader, B);

B matches A except in its type and slot location, which will always be one greater than the slot location of A, so A can be cloned here to simplify the process of creating B.

   unsigned total_num_components = glsl_get_vector_elements(A->type);
   /* new variable is the second half of the dvec */
   B->type = glsl_vector_type(glsl_get_base_type(A->type), glsl_get_vector_elements(A->type) - 2);
   /* clamp original variable to a dvec2 */
   A_deref->type = A->type = glsl_vector_type(glsl_get_base_type(A->type), 2);

A and B need their types modified to not cross the vec4/slot boundary. A is always a dvec2, which has 2 components, and B will always be the remaining components.

   /* create deref instr (B_deref) for new variable */
   b->cursor = nir_after_instr(instr);
   nir_deref_instr *B_deref = nir_build_deref_var(b, B);

Now B_deref has been added thanks to the nir_builder helper function which massively simplifies the process of setting up all the instruction parameters.

   nir_foreach_use_safe(A_deref_use, &A_deref->dest.ssa) {

NIR is SSA-based, and all uses of an SSA value are tracked for the purposes of ensuring that SSA values are truly assigned only once as well as ease of rewriting them in the case where a value needs to be modified, just as this pass is doing. This use-tracking comes along with a simple API for iterating over the uses.

      nir_instr *A_load_instr = A_deref_use->parent_instr;
      assert(A_load_instr->type == nir_instr_type_intrinsic &&
             nir_instr_as_intrinsic(A_load_instr)->intrinsic == nir_intrinsic_load_deref);

The only use of A_deref should be A_load, so really iterating over the A_deref uses is just a quick, easy way to get from there to the A_load instruction.

      /* this is a load instruction for the A_deref, and we need to split it into two instructions that we can
       * then zip back into a single ssa def */
      nir_intrinsic_instr *A_load = nir_instr_as_intrinsic(A_load_instr);
      /* clamp the first load to 2 64bit components */
      A_load->num_components = A_load->dest.ssa.num_components = 2;

A_load must be clamped to a single slot location to avoid crossing the vec4 boundary, so this is done by changing the number of components to 2, which matches the now-changed type of A.

      b->cursor = nir_after_instr(A_load_instr);
      /* this is the second load instruction for the second half of the dvec3/4 components */
      nir_intrinsic_instr *B_load = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_deref);
      B_load->src[0] = nir_src_for_ssa(&B_deref->dest.ssa);
      B_load->num_components = total_num_components - 2;
      nir_ssa_dest_init(&B_load->instr, &B_load->dest, B_load->num_components, 64, NULL);
      nir_builder_instr_insert(b, &B_load->instr);

This is B_load, which loads a number of components that matches the type of B. It's inserted after A_load, though the before/after isn't important in this case. The key is just that this instruction is added before the next one.

      nir_ssa_def *def[4];
      /* create a new dvec3/4 comprised of all the loaded components from both variables */
      def[0] = nir_vector_extract(b, &A_load->dest.ssa, nir_imm_int(b, 0));
      def[1] = nir_vector_extract(b, &A_load->dest.ssa, nir_imm_int(b, 1));
      def[2] = nir_vector_extract(b, &B_load->dest.ssa, nir_imm_int(b, 0));
      if (total_num_components == 4)
         def[3] = nir_vector_extract(b, &B_load->dest.ssa, nir_imm_int(b, 1));
      nir_ssa_def *C_load = nir_vec(b, def, total_num_components);

Now that A_load and B_load both exist and are loading the corrected number of components, these components can be extracted and reassembled into a larger type for use in the shader, specifically the original dvec3 or dvec4 which is being used. nir_vector_extract performs this extraction from a given instruction by taking an index of the value to extract, and then the composite value is created by passing the extracted components to nir_vec as an array.
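Stripped of NIR, the extract-and-reassemble step is just copying components from two short vectors into one longer one. The sketch below models that in plain C with a hypothetical function name and plain arrays standing in for the SSA values. It illustrates the data movement only and does not use the NIR API.

```c
#include <assert.h>

/* Reassemble a 3- or 4-component vector from a 2-component first half (a)
 * and the remaining 1 or 2 components (b). Only the first `total` entries
 * of `out` are written. */
void
sketch_zip_halves(const double a[2], const double *b, unsigned total, double out[4])
{
   assert(total == 3 || total == 4);
   out[0] = a[0];
   out[1] = a[1];
   out[2] = b[0];
   if (total == 4)
      out[3] = b[1];
}
```

Here a plays the role of A_load, b the role of B_load, and out the role of C_load. As in the pass, the fourth component is only touched for the dvec4 case.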

      /* use the assembled dvec3/4 for all other uses of the load */
      nir_ssa_def_rewrite_uses_after(&A_load->dest.ssa, nir_src_for_ssa(C_load), C_load->parent_instr);

Since this is all SSA, the NIR helpers can be used to trivially rewrite all the uses of the loaded value from the original A_load instruction to now use the assembled C_load value. It's important that only the uses after C_load has been created (i.e., nir_ssa_def_rewrite_uses_after) are those that are rewritten, however, or else the shader will also rewrite the original A_load value with C_load, breaking the shader entirely with an SSA-impossible as well as generally-impossible C_load = vec(C_load + B_load) assignment.


   }

   return true;
}

Progress has occurred, so the pass returns true to reflect that.

Now those large attributes are loaded according to Vulkan spec, and everything is great because, as expected, ANV has no bugs here.

24 Oct 2020 12:00am GMT

10 Nov 2011

Planet GNOME

Brad Taylor: Reports of Snowy’s death have been greatly exaggerated

Browsing foundation-list recently, I was honored to see Snowy (and Tomboy Online) hosting mentioned as one of the GNOME CEO goals (scroll to the bottom) for 2010! Unfortunately, the pace of Snowy's development has slowed in the last few months, due in part to both Sandy's and my schedules. Despite that, we wouldn't want Stormy to get a bad reputation because of our slacking, so we're going to change that.

We're hosting an IRC meeting in the #snowy channel on irc.gimp.net on Saturday, 23 Jan 2010 at 11:00 AM EST (16:00 GMT, other time zones) to get ourselves organized, and to recruit your help.

So, if you are a graphic designer that wants to help beautify an awesome open source project, if you're a hacker who knows or wants to learn Django, or even if you're just interested in Snowy, stop on by!

See you there!

10 Nov 2011 4:55am GMT

Brad Taylor: Mono Accessibility 2.0 unleashed!

Today, I'm proud to announce the 2.0 release of the Mono Accessibility project. Spanning a year of intensive work and fixing over 500 bugs, this is truly our best release ever.

This release enables all types of users to access System.Windows.Forms and Silverlight applications from Linux using Orca and other ATK-based Assistive Technologies (ATs), as well as access Linux applications from UI Automation (UIA) based ATs.

What's changed since version 1.0?


What is Mono Accessibility:

The Mono Accessibility project enables Winforms and Silverlight applications to be fully accessible on Linux, and allows Assistive Technologies (ATs) like screen readers and test automation tools that depend on UI Automation APIs to work on Linux.

Mono Accessibility is released under the MIT/X11 license.

Get it!

Mono Accessibility is available for a variety of Linux distributions, including:

A Note About at-spi2

Accessing GTK+ applications with the UIA Client API requires the most recent development version of the new dbus-based at-spi2, which is known to cause system instability.

In Fedora, at-spi2 repeatedly causes GDM to segfault. If you do not need this feature, do not install the latest at-spi2 and atk, or our packages which depend on them, which are at-spi-sharp and AtspiUiaSource.

We are working hard to identify these issues and hope to aid the GNOME Accessibility Team in stabilizing at-spi2 in the near future.

Find out more

Navigate to our homepage for all the latest information, and ways to contact us.

10 Nov 2011 4:55am GMT

Calum Benson: What’s new on the Solaris 11 desktop?

This entry is cross-posted from my Oracle blog… clearly, seasoned GNOME blog readers will be less excited about GNOME 2.30, compiz and Firefox 6 than my audience over there, many of whom have been using GNOME 2.6 on Solaris 10 for the past 7 years :)

Much has been written today about the enterprise and cloud features of Oracle Solaris 11, which was launched today, but what's new for those of us who just like to have the robustness and security of Solaris on our desktop machines? Here are a few of the Solaris 11 desktop highlights:

Solaris 11 is free to download and use for most non-commercial purposes (but IANAL, so do check the OTN License Agreement on the download page first - it's short and sweet, as these things go), and you can download various flavours, including a Live CD and a USB install image, right here.

10 Nov 2011 12:08am GMT