21 Feb 2019

feedplanet.freedesktop.org

Hans de Goede: Fedora 30 Flicker Free boot is now fully testable

Fedora 30 now contains all changes changes for a fully Flicker Free Boot. Last week a new version of plymouth landed which implements the new theme for this and also includes a much improved offline-updates experience, following this design.

At boot the display will seamlessly transit from the firmware boot-splash into the new plymouth theme, which uses the firmware boot-splash as background:



If you are using full-disk encryption then the unlock dialog will look like this:



Last, but not least the new plymouth theme looks like this in offline-updates mode:



ATM the texts in the offline-updates theme are not translated. They are rendered using pango + cairo, so we have the capability to make this fully translatable into all languages, I just need to add gettext support to plymouth for this. I plan to do this next week.

This all assumes that you are booting your machine with UEFI and your firmware supports the BGRT extension (which almost all firmware does). Otherwise you will get a dark-grey background instead of the firmware boot-splash.

If you are running rawhide and are seeing a totally different boot-theme then you likely have changed your plymouth theme away from the "charge" default at one point in time; and in that case you are not automatically updated to the new plymouth theme. To switch to the new theme run:

sudo plymouth-set-default-theme -R bgrt

If you do not like the firmware-splash being used as background, you can use the new theme on a dark-grey background instead by running:

sudo plymouth-set-default-theme -R spinner

21 Feb 2019 1:07pm GMT

18 Feb 2019

feedplanet.freedesktop.org

Neil Roberts: VkRunner at FOSDEM

I attended FOSDEM again this year thanks to funding from Igalia. This time I gave a talk about VkRunner in the graphics dev room. It's now available on Igalia's YouTube channel below:

I thought this might be a good opportunity to give a small status update of what has happened since my last blog post nearly a year ago.

Test suite integration

The biggest news is that VkRunner is now integrated into Khronos' Vulkan CTS test suite and Mesa's Piglit test suite. This means that if you work on a feature or a bugfix in your Vulkan driver and you want to make sure it doesn't get regressed, it's now really easy to add a VkRunner test for it and have it collected in one of these test suites. For Piglit all that is needed is to give the test script a .vk_shader_test extension and drop it anywhere under the tests/vulkan folder and it will automatically be picked up by the Piglit framework. As an added bonus, these tests are also run automatically on Intel's CI system, so if your test is related to i965 in Mesa you can be sure it will not be regressed.

On the Khronos CTS side the integration is currently a little less simple. Along with help from Samuel Iglesias, we have merged a branch into master that lays the groundwork for adding VkRunner tests. Currently there are only proof-of-concept tests to show how the tests could work. Adding more tests still requires tweaking the C++ code so it's not quite as simple as we might hope.

API

When VkRunner is built, in now also builds a static library containing a public API. This can be used to integrate VkRunner into a larger test suite. Indeed, the Khronos CTS integration takes advantage of this to execute the tests using the VkDevice created by the test suite itself. This also means it can execute multiple tests quickly without having to fork an external process.

The API is intended to be very highlevel and is as close to possible as just having simple functions to ask VkRunner to execute a test script and return an enum reporting whether the test succeeded or not. There is an example of its usage in the README.

Precompiled shader scripts

One of the concerns raised when integrating VkRunner into CTS is that it's not ideal to have to run glslang as an external process in order to compile the shaders in the scripts to SPIR-V. To work around this, I added the ability to have scripts with binary shaders. In this case the 32-bit integer numbers of the compiled SPIR-V are just listed in ASCII in the shader test instead of the GLSL source. Of course writing this by hand would be a pain, so the VkRunner repo includes a Python script to precompile a bunch of shaders in a batch. This can be really useful to run the tests on an embedded device where installing glslang isn't practical.

However, in the end for the CTS integration we took a different approach. The CTS suite already has a mechanism to precompile all of the shaders for all tests. We wanted to take advantage of this also when compiling the shaders from VkRunner tests. To make this work, Samuel added some functions to the VkRunner API to query the GLSL in a VkRunner shader script and then replace them with binary equivalents. That way the CTS suite can use these functions to replace the shaders with its cached compiled versions.

UBOs, SSBOs and compute shaders

One of the biggest missing features mentioned in my last post was UBO and SSBO support. This has now been fixed with full support for setting values in UBOs and SSBOs and also probing the results of writing to SSBOs. Probing SSBOs is particularily useful alongside another added feature: compute shaders. Thanks to this we can run our shaders as compute shaders to calculate some results into an SSBO and probe the buffer to see whether it worked correctly. Here is an example script to show how that might look:

[compute shader]
#version 450

/* UBO input containing an array of vec3s */
layout(binding = 0) uniform inputs {
        vec3 input_values[4];
};

/* A matrix to apply to these values. This is stored in a push
 * constant. */
layout(push_constant) uniform transforms {
        mat3 transform;
};

/* An SSBO to store the results */
layout(binding = 1) buffer outputs {
        vec3 output_values[];
};

void
main()
{
        uint i = gl_WorkGroupID.x;

        /* Transform one of the inputs */
        output_values[i] = transform * input_values[i];
}

[test]
# Set some input values in the UBO
ubo 0 subdata vec3 0 \
  3 4 5 \
  1 2 3 \
  1.2 3.4 5.6 \
  42 11 9

# Create the SSBO
ssbo 1 1024

# Store a matrix uniform to swap the x and y
# components of the inputs
push mat3 0 \
  0 1 0 \
  1 0 0 \
  0 0 1

# Run the compute shader with one instance
# for each input
compute 4 1 1

# Check that we got the expected results in the SSBO
probe ssbo vec3 1 0 ~= \
  4 3 5 \
  2 1 3 \
  3.4 1.2 5.6 \
  11 42 9

Extensions in the requirements section

The requirements section can now contain the name of any extension. If this is done then VkRunner will check for the availability of the extension when creating the device and enable it. Otherwise it will report that the test was skipped. A lot of the Vulkan extensions also add an extended features struct to be used when creating the device. These features can also be queried and enabled for extentions that VkRunner knows about simply by listing the name of the feature in that struct. For example if shaderFloat16 in listed in the requirements section, VkRunner will check for the VK_KHR_shader_float16_int8 extension and the shaderFloat16 feature within its extended feature struct. This makes it really easy to test optional features.

Cross-platform support

I spent a fair bit of time making sure VkRunner works on Windows including compiling with Visual Studio. The build files have been converted to CMake which makes building on Windows even easier. It also compiles for Android thanks to patches from Jaebaek Seo. The repo contains Android build files to build the library and the vkrunner executable. This can be run directly on a device using adb.

User interface

There is a branch containing the beginnings of a user interface for editing VkRunner scripts. It presents an editor widget via GTK and continuously runs the test script in the background as you are editing it. It then displays the results in an image and reports any errors in a text field. The test is run in a separate process so that if it crashes it doesn't bring down the user interface. I'm not sure whether it makes sense to merge this branch into master, but in the meantime it can be a convenient way to fiddle with a test when it fails and it's not obvious why.

And more…

Lots of other work has been going on in the background. The best way to get to more details on what VkRunner can do is to take a look at the README. This has been kept up-to-date as the source of documentation for writing scripts.

18 Feb 2019 5:23pm GMT

14 Feb 2019

feedplanet.freedesktop.org

Peter Hutterer: Adding entries to the udev hwdb

In this blog post, I'll explain how to update systemd's hwdb for a new device-specific entry. I'll focus on input devices, as usual.

What is the hwdb and why do I need to update it?

The hwdb is a binary database sitting at /etc/udev/hwdb.bin and /usr/lib/udev/hwdb.d. It is usually used to apply udev properties to specific devices, those properties are then picked up by other processes (udev builtins, libinput, ...) to apply device-specific behaviours. So you'll need to update the hwdb if you need a specific behaviour from the device.

One of the use-cases I commonly deal with is that some touchpad announces wrong axis ranges or resolutions. With the correct hwdb entry (see the example later) udev can correct these at device initialisation time and every process sees the right axis ranges.

The database is compiled from the various .hwdb files you have sitting on your system, usually in /etc/udev/hwdb.d and /usr/lib/hwdb.d. The terser format of the hwdb files makes them easier to update than, say, writing a udev rule to set those properties.

The full process goes something like this:

On its own, the hwdb is merely a lookup tool though, it does not modify devices. Think of it as a virtual filing cabinet, something will need to look at it, otherwise it's just dead weight.

An example for such a udev rule from 60-evdev.rules contains:


IMPORT{builtin}="hwdb --subsystem=input --lookup-prefix=evdev:", \
RUN{builtin}+="keyboard", GOTO="evdev_end"

The IMPORT statement translates as "look up the hwdb, import the properties". The RUN statement runs the "keyboard" builtin which may change the device based on the various udev properties now set. The GOTO statement goes to skip the rest of the file.

So again, on its own the hwdb doesn't do anything, it merely prints to-be udev properties to stdout, udev captures those and applies them to the device. And then those properties need to be processed by some other process to actually apply changes.

hwdb file format

The basic format of each hwdb file contains two types of entries, match lines and property assignments (indented by one space). The match line defines which device it is applied to.

For example, take this entry from 60-evdev.hwdb:


# Lenovo X230 series
evdev:name:SynPS/2 Synaptics TouchPad:dmi:*svnLENOVO*:pn*ThinkPad*X230*
EVDEV_ABS_01=::100
EVDEV_ABS_36=::100

The match line is the one starting with "evdev", the other two lines are property assignments. Property values are strings, any interpretation to numeric values or others is to be done in the process that requires those properties. Noteworthy here: the hwdb can overwrite previously set properties, but it cannot unset them.

The match line is not specified by the hwdb beyond "it's a glob". The format to use is defined by the udev rule that invokes the hwdb builtin. Usually the format is:


someprefix:search criteria:

For example, the udev rule that applies for the match above is this one in 60-evdev.rules:


KERNELS=="input*", \
IMPORT{builtin}="hwdb 'evdev:name:$attr{name}:$attr{[dmi/id]modalias}'", \
RUN{builtin}+="keyboard", GOTO="evdev_end"

What does this rule do? $attr entries get filled in by udev with the sysfs attributes. So on your local device, the actual lookup key will end up looking roughly like this:


evdev:name:Some Device Name:dmi:bvnWhatever:bvR112355:bd02/01/2018:...

If that string matches the glob from the hwdb, you have a match.

Attentive readers will have noticed that the two entries from 60-evdev.rules I posted here differ. You can have multiple match formats in the same hwdb file. The hwdb doesn't care, it's just a matching system.

We keep the hwdb files matching the udev rules names for ease of maintenance so 60-evdev.rules keeps the hwdb files in 60-evdev.hwdb and so on. But this is just for us puny humans, the hwdb will parse all files it finds into one database. If you have a hwdb entry in my-local-overrides.hwdb it will be matched. The file-specific prefixes are just there to not accidentally match against an unrelated entry.

Applying hwdb updates

The hwdb is a compiled format, so the first thing to do after any changes is to run


$ systemd-hwdb update

This command compiles the files down to the binary hwdb that is actually used by udev. Without that update, none of your changes will take effect.

The second thing is: you need to trigger the udev rules for the device you want to modify. Either you do this by physically unplugging and re-plugging the device or by running


$ udevadm trigger

or, better, trigger only the device you care about to avoid accidental side-effects:


$ udevadm trigger /sys/class/input/eventXYZ

In case you also modified the udev rules you should re-load those too. So the full quartet of commands after a hwdb update is:


$ systemd-hwdb update
$ udevadm control --reload-rules
$ udevadm trigger
$ udevadm info /sys/class/input/eventXYZ

That udevadm info command lists all assigned properties, these should now include the modified entries.

Adding new entries

Now let's get down to what you actually want to do, adding a new entry to the hwdb. And this is where it also get's tricky to have a generic guide because every hwdb file has its own custom match rules.

The best approach is to open the .hwdb files and the matching .rules file and figure out what the match formats are and which one is best. For USB devices there's usually a match format that uses the vendor and product ID. For built-in devices like touchpads and keyboards there's usually a dmi-based match format (see /sys/class/dmi/id/modalias). In most cases, you can just take an existing entry and copy and modify it.

My recommendation is: add an extra property that makes it easy to verify the new entry is applied. For example do this:


# Lenovo X230 series
evdev:name:SynPS/2 Synaptics TouchPad:dmi:*svnLENOVO*:pn*ThinkPad*X230*
EVDEV_ABS_01=::100
EVDEV_ABS_36=::100
FOO=1

Now run the update commands from above. If FOO=1 doesn't show up, then you know it's the hwdb entry that's not yet correct. If FOO=1 does show up in the udevadm info output, then you know the hwdb matches correctly and any issues will be in the next layer.

Increase the value with every change so you can tell whether the most recent change is applied. And before your submit a pull request, remove the FOOentry.

Oh, and once it applies correctly, I recommend restarting the system to make sure everything is in order on a freshly booted system.

Troubleshooting

The reason for adding hwdb entries is always because we want the system to handle a device in a custom way. But it's hard to figure out what's wrong when something doesn't work (though 90% of the time it's a typo in the hwdb match).

In almost all cases, the debugging sequence is the following:

If the answer to all these is "yes" and it still doesn't work, you may have found a bug. But 99% of the time, at least one of those is a sound "no. oops.".

Your hwdb match may run into issues with some 'special' characters. If your device has e.g. an ® in its device name (some Microsoft devices have this), a bug in systemd caused the match to fail. That bug is fixed now but until it's available in your distribution, replace with an asterisk ('*') in your match line.

Greybeards who have been around since before 2014 (systemd v219) may remember a different tool to update the hwdb: udevadm hwdb --update. This tool still exists, but it does not have the exact same behaviour as systemd-hwdb update. I won't go into details but the hwdb generated by the udevadm tool can provide unexpected matches if you have multiple matches with globs for the same device. A more generic glob can take precedence over a specific glob and so on. It's a rare and niche case and fixed since systemd v233 but the udevadm behaviour remained the same for backwards-compatibility.

Happy updating and don't forget to add Database Administrator to your CV when your PR gets merged.

14 Feb 2019 1:40am GMT

05 Feb 2019

feedplanet.freedesktop.org

Nicolai Hähnle: FOSDEM talk on TableGen

Video and slides for my talk in the LLVM devroom on TableGen are now available here.

Now I only need the time and energy to continue my blog series on the topic...

05 Feb 2019 10:03am GMT

01 Feb 2019

feedplanet.freedesktop.org

Hans de Goede: i915.fastboot=1 is now enabled in Rawhide / F30 for Skylake and newer

i've just landed a big milestone for the Flicker Free Boot work I'm doing for Fedors 30. Starting with todays rawhide kernel build, version 5.0.0-0.rc4.git3.1, the fastboot option for the i915 Intel display driver is enabled by default on systems with a Skylake CPU/iGPU and newer, as well as on Valleyview and Cherryview (Bay- and Cherry-Trail) systems.

This means that the last modeset / flicker during boot of UEFI systems using the integrated Intel GPU for display output is now gone.

01 Feb 2019 3:27pm GMT

30 Jan 2019

feedplanet.freedesktop.org

Samuel Iglesias: VkRunner is integrated into VK-GL-CTS and piglit

One of the greatest features from piglit was the easy development of OpenGL tests based on GLSL shaders plus some simple commands through shader_runner command. I even wrote about it.

However, Vulkan ecosystem was missing a tool like that but for SPIR-V shader tests… until last year!

Vulkan Logo

VkRunner is a tool written by Neil Roberts, which is very inspired on shader_runner. VkRunner was the result of the Igalia work to enable ARB_gl_spirv extension for Intel's i965 driver on Mesa, where there was a need to test driver's code against a good number of shaders to be sure that it was fine.

VkRunner uses a script language to define the requirements needed to run the test, such as the needed extension and features, the shaders to be run and a series of commands to run it. It will then parse everything and execute the equivalent Vulkan commands to do so under the hood, like shader_runner did for OpenGL in piglit.

This is an example of how a Vkrunner looks like:

[compute shader]
#version 450

layout(std140, push_constant) uniform push_constants {
        float in_value;
};

layout(std140, binding = 0) buffer ssbo {
        float out_value;
};

void
main()
{
        out_value = sqrt(in_value);
}

[test]
# Allocate an ssbo big enough for a float at binding 0
ssbo 0 4

# Set the push constant as an input value
uniform float 0 4

compute 1 1 1

# Probe that we got the expected value
tolerance 0.00006% 0.00006% 0.00006% 0.00006%
probe ssbo float 0 0 ~= 2

The end of 2018 was great for VkRunner! First, the tool was integrated into piglit so we can now use it in this amazing open-source testing suite for 3D graphics drivers. Soon after, it was integrated into Khronos Group's Vulkan and OpenGL Conformance Test Suite (see commit), which will help contributors to easily write SPIR-V tests on Vulkan.

If you want learn more about VkRunner, apart from browsing the repository, Neil wrote a nice blogpost explaining the tool basics, gave a lightning talk at XDC 2018 (slides) in A Coruña and now, he is going to give a talk in the graphics devroom at FOSDEM 2019! You can follow his FOSDEM talk on Saturday via live-stream (or see the recording afterwards) in case you are not going to FOSDEM this year :-)

Igalia

FOSDEM 2019

30 Jan 2019 7:00am GMT

07 Jan 2019

feedplanet.freedesktop.org

Tomeu Visozo: A Panfrost milestone

The video

Below you can see glmark2 running as a Wayland client in Weston, on a NanoPC -T4 (so a RK3399 SoC with a Mali T-864 GPU)). It's much smoother than on the video, which is limited to 5FPS by the webcam.


Weston is running with the DRM backend and the GL renderer.

The history behind it


For more than 10 years, at Collabora we have been happily helping our customers to make the most of their hardware by running free software.

One area some of us have specially enjoyed working on has been open drivers for GPUs, which for a long time have been considered the next frontier in the quest to have a full software platform that companies and individuals can understand, improve and fix without having to ask for permission first.

Something that has saddened me a bit has been our reduced ability to help those customers that for one reason or another had chosen a hardware platform with ARM Mali GPUs, as no open driver was available for those.

While our biggest customers were able to get a high level of support from the vendors in order to have the Mali graphics stack well integrated with the rest of their product, the smaller ones had a much harder time in achieving that level of integration, which manifested in reduced performance, increased power consumption and slipped milestones.

That's why we have been following with great interest the several efforts that aimed to come up with an open driver for GPUs in the Mali family, one similar to those already existing for Qualcomm, NVIDIA and Vivante.

At XDC last year we had the chance of meeting the people involved in the latest effort to develop such a driver: Panfrost. And in the months that followed I made some room in my backlog to come up with a plan to give the effort a boost.

At that point, Panfrost was only able to get its bits in the screen by an elaborate hack that involved copying each frame into a X11 SHM buffer, which besides making the setup of the development environment much more cumbersome, invalidated any performance analysis. It also limited testing to demos such as glmark2.

Due to my previous work on Etnaviv I was already familiar with the abstractions in Mesa for setups in which the display of buffers is performed by a device different from the GPU so it was just a matter of seeing how we could get the kernel driver for the Mali GPU to play well with the rest of the stack.

So during the past month or so I have come up with a proper implementation of the winsys abstraction that makes use of ARM's kernel driver. The result is that now developers have a better base on which to work on the rendering side of things.

By properly creating, exporting and importing buffers, we can now run applications on GBM, from demos such as kmscube and glmark2 to compositors such as Weston, but also big applications such as Kodi. We are also supporting zero-copy display of GPU-rendered clients in Weston.

This should make it much easier to work on the rendering side of things, and work on a proper DRM driver in the mainline kernel can proceed in parallel.

For those interested in joining to the effort, Alyssa has graciously taken the time to update the instructions to build and test Panfrost. You can join us at #panfrost in Freenode and can start sending merge requests to Gitlab.

Thanks to Collabora for sponsoring this work and to Alyssa Rosenzweig and Lyude Paul for their previous work and for answering my questions.

07 Jan 2019 11:54am GMT

15 Dec 2018

feedplanet.freedesktop.org

Peter Hutterer: Understanding HID report descriptors

This time we're digging into HID - Human Interface Devices and more specifically the protocol your mouse, touchpad, joystick, keyboard, etc. use to talk to your computer.

Remember the good old days where you had to install a custom driver for every input device? Remember when PS/2 (the protocol) had to be extended to accommodate for mouse wheels, and then again for five button mice. And you had to select the right protocol to make it work. Yeah, me neither, I tend to suppress those memories because the world is awful enough as it is.

As users we generally like devices to work out of the box. Hardware manufacturers generally like to add bits and bobs because otherwise who would buy that new device when last year's device looks identical. This difference in needs can only be solved by one superhero: Committee-man, with the superpower to survive endless meetings and get RFCs approved.

Many many moons ago, when USB itself was in its infancy, Committee man and his sidekick Caffeine boy got the USB consortium agree on a standard for input devices that is so self-descriptive that operating systems (Win95!) can write one driver that can handle this year's device, and next year's, and so on. No need to install extra drivers, your device will just work out of the box. And so HID was born. This may only be an approximate summary of history.

Originally HID was designed to work over USB. But just like Shrek the technology world is obsessed with layers so these days HID works over different transport layers. HID over USB is what your mouse uses, HID over i2c may be what your touchpad uses. HID works over Bluetooth and it's celebrity-diet version BLE. Somewhere, someone out there is very slowly moving a mouse pointer by sending HID over carrier pigeons just to prove a point. Because there's always that one guy.

HID is incredibly simple in that the static description of the device can just be bytes burnt into the ROM like the Australian sun into unprepared English backpackers. And the event frames are often an identical series of bytes where every bit is filled in by the firmware according to the axis/buttons/etc.

HID is incredibly complicated because parsing it is a stack-based mental overload. Each individual protocol item is simple but getting it right and all into your head is tricky. Luckily, I'm here for you to make this simpler to understand or, failing that, at least more entertaining.

As said above, the purpose of HID is to make devices describe themselves in a generic manner so that you can have a single driver handle any input device. The idea is that the host parses that standard protocol and knows exactly how the device will behave. This has worked out great, we only have around 200 files dealing with vendor- and hardware-specific HID quirks as of v4.20.

HID messages are Reports. And to know what a Report means and how to interpret it, you need a Report Descriptor. That Report Descriptor is static and contains a series of bytes detailing "what" and "where", i.e. what a sequence of bits represents and where to find those bits in the Report. So let's try and parse one of Report Descriptors, let's say for a fictional mouse with a few buttons. How exciting, we're at the forefront of innovation here.

The Report Descriptor consists of a bunch of Items. A parser reads the next Item, processes the information within and moves on. Items are small (1 byte header, 0-4 bytes payload) and generally only apply exactly one tiny little bit of information. You need to accumulate several items to build up enough information to actually know what's happening.

The "what" question of the Report Descriptor is answered with the so-called Usage. This could be something simple like X or Y (0x30 and 0x31) or something more esoteric like System Menu Exit (0x88). A Usage is 16 bits but all Usages are grouped into so-called Usage Pages. A Usage Page too is a 16 bit value and together they form the 32-bit value that tells us what the device can do. Examples:


0001 0031 # Generic Desktop, Y
0001 0088 # Generic Desktop, System Menu Exit
0003 0005 # VR Controls, Head Tracker
0003 0006 # VR Controls, Head Mounted Display
0004 0031 # Keyboard, Keyboard \ and |

Note how the Usage in the last item is the same as the first one, without the Usage Page you will mix things up. It helps if you always think of as the Usage as a 32-bit number. For your kids' bed-time story time, here are the HID Usage Tables from 2004 and the approved HID Usage Table Review Requests of the last decade. Because nothing puts them to sleep quicker than droning on about hex numbers associated with remote control buttons.

To successfully interpret a Report from the device, you need to know which bits have which Usage associated with them. So let's go back to our innovative mouse. We would want a report descriptor with 6 items like this:


Usage Page (Generic Desktop)
Usage (X)
Report Size (16)
Usage Page (Generic Desktop)
Usage (Y)
Report Size (16)

This basically tells the host: X and Y both have 16 bits. So if we get a 4-byte Report from the device, we know two bytes are for X, two for Y.

HID was invented when a time when bits were more expensive than printer ink, so we can't afford to waste any bits (still the case because who would want to spend an extra penny on more ROM). HID makes use of so-called Global items, once those are set their value applies to all following items until changed. Usage Page and Report Size are such Global items, so the above report descriptor is really implemented like this:


Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)

The Report Count just tells us that 2 fields of the current Report Size are coming up. We have two usages, two fields, and 16 bits each so we know what to do. The Input item is sort-of the marker for the end of the stack, it basically tells us "process what you've seen so far", together with a few flags. Rel in this case means that the Usages are relative. Oh, and Input means that this is data from device to host. Output would be data from host to device, e.g. to set LEDs on a keyboard. There's also Feature which indicates configurable items.

Buttons on a device are generally just numbered so it'd be monumental 16-bits-at-a-time waste to have HID send Usage (Button1), Usage (Button2), etc. for every button on the device. HID instead provides a Usage Minimum and Usage Maximumto sequentially order them. This looks like this:


Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)

So we have 5 buttons here and each button has one bit. Note how the buttons are Abs because a button state is not a relative value, it's either down or up. HID is quite intolerant to Schrödinger's thought experiments.

Let's put the two things together and we have an almost-correct Report descriptor:


Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)

Report Size (3)
Report Count (1)
Input (Cnst,Arr,Abs)

Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)

New here is Cnst. This signals that the bits have a constant value, thus don't need a Usage and basically don't matter (haha. yeah, right. in theory). Linux does indeed ignore those. Cnst is used for padding to align on byte boundaries - 5 bits for buttons plus 3 bits padding make 8 bits. Which makes one byte as everyone agrees except for granddad over there in the corner. I don't know how he got in.

Were we to get a 5-byte Report from the device, we'd parse it approximately like this:


button_state = byte[0] & 0x1f
x = bytes[1] | (byte[2] << 8)
y = bytes[3] | (byte[4] << 8)

Hooray, we're almost ready. Except not. We may need more info to correctly interpret the data within those reports.

The Logical Minimum and Logical Maximum specify the value range of the actual data. We need this to tell us whether the data is signed and what the allowable range is. Together with the Physical Minimumand the Physical Maximum they specify what the values really mean. In the simple case:


Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Logical Minimum (-32767)
Logical Maximum (32767)
Input (Data,Var,Rel)

This just means our x/y data is signed. Easy. But consider this combination:


...
Logical Minimum (0)
Logical Maximum (1)
Physical Minimum (1)
Physical Maximum (12)

This means that if the bit is 0, the effective value is 1. If the bit is 1, the effective value is 12.

Note that the above is one report only. Devices may have multiple Reports, indicated by the Report ID. So our Report Descriptor may look like this:


Report ID (01)
Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)
Report Size (3)
Report Count (1)
Input (Cnst,Arr,Abs)

Report ID (02)
Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)

If we were to get a Report now, we need to check byte 0 for the Report ID so we know what this is. i.e. our single-use hard-coded parser would look like this:


if byte[0] == 0x01:
button_state = byte[1] & 0x1f
else if byte[0] == 0x02:
x = bytes[2] | (byte[3] << 8)
y = bytes[4] | (byte[5] << 8)

A device may use multiple Reports if the hardware doesn't gather all data within the same hardware bits. Now, you may ask: if I get fifteen reports, how should I know what belongs together? Good question, and lucky for you the HID designers are miles ahead of you. Report IDs are grouped into Collections.

Collections can have multiple types. An Application Collectiondescribes a set of inputs that make sense as a whole. Usually, every Report Descriptor must define at least one Application Collection but you may have two or more. For example, a a keyboard with integrated trackpoint should and/or would use two. This is how the kernel knows it needs to create two separate event nodes for the device. Application Collections have a few reserved Usages that indicate to the host what type of device this is. These are e.g. Mouse, Joystick, Consumer Control. If you ever wondered why you have a device named like "Logitech G500s Laser Gaming Mouse Consumer Control" this is the kernel simply appending the Application Collection's Usage to the device name.

A Physical Collection indicates that the data is collected at one physical point though what a point is is a bit blurry. Theoretical physicists will disagree but a point can be "a mouse". So it's quite common for all reports on a mouse to be wrapped in one Physical Collections. If you have a device with two sets of sensors, you'd have two collections to illustrate which ones go together. Physical Collections also have reserved Usages like Pointer or Head Tracker.

Finally, a Logical Collection just indicates that some bits of data belong together, whatever that means. The HID spec uses the example of buffer length field and buffer data but it's also common for all inputs from a mouse to be grouped together. A quick check of my mice here shows that Logitech doesn't wrap the data into a Logical Collection but Microsoft's firmware does. Because where would we be if we all did the same thing...

Anyway. Now that we know about collections, let's look at a whole report descriptor as seen in the wild:


Usage Page (Generic Desktop)
Usage (Mouse)
Collection (Application)
Usage Page (Generic Desktop)
Usage (Mouse)
Collection (Logical)
Report ID (26)
Usage (Pointer)
Collection (Physical)
Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Logical Minimum (0)
Logical Maximum (1)
Input (Data,Var,Abs)
Report Size (3)
Report Count (1)
Input (Cnst,Arr,Abs)
Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Logical Minimum (-32767)
Logical Maximum (32767)
Input (Data,Var,Rel)
Usage (Wheel)
Physical Minimum (0)
Physical Maximum (0)
Report Count (1)
Report Size (16)
Logical Minimum (-32767)
Logical Maximum (32767)
Input (Data,Var,Rel)
End Collection
End Collection
End Collection

We have one Application Collection (Generic Desktop, Mouse) that contains one Logical Collection (Generic Desktop, Mouse). That contains one Physical Collection (Generic Desktop, Pointer). Our actual Report (and we have only one but it has the decimal ID 26) has 5 buttons, two 16-bit axes (x and y) and finally another 16 bit axis for the Wheel. This device will thus send 8-byte reports and our parser will do:


if byte[0] != 0x1a: # it's decimal in the above descriptor
error, should be 26
button_state = byte[1] & 0x1f
x = byte[2] | (byte[3] << 8)
y = byte[4] | (byte[5] << 8)
wheel = byte[6] | (byte[7] << 8)

That's it. Now, obviously, you can't write a parser for every HID descriptor out there so your actual parsing code needs to be generic. The Linux kernel does exactly that and so does everything else that needs to parse HID. There's a huge variety in devices out there, all with HID descriptors that may or may not be correct. As with so much in life, correct HID implementations are often defined by "whatever Windows accepts" so if you like playing catch, Linux development is for you.

Oh, in case you just got a bit too optimistic about the state of the world: HID allows for vendor-defined usages. Which does exactly what you'd think it does, it hides vendor-specific protocol inside what should be a generic protocol. There are devices with hidden report IDs that you can only unlock by sending the right magic sequence to the report and/or by defeating the boss on Level 4. Usually those devices present themselves as basic/normal devices over HID but if you know the magic sequence you get to use *gasp* all buttons. Or access the device-specific configuration features. Logitech's HID++ is just one example here but at least that's one where we have most of the specs available.

The above describes how to parse the HID report descriptor and interpret the reports. But what happens once you have a HID report correctly parsed? In the case of the Linux kernel, once the report descriptor is parsed evdev nodes are created (one per Application Collection, more or less). As the Reports come in, they are mapped to evdev codes and the data appears on the evdev node. That's where userspace like libinput can pick it up. That bit is actually quite simple (mostly anyway).

The above output was generated with the tools from the hid-tools repository. Go forth and hid-record.

15 Dec 2018 4:47am GMT

14 Dec 2018

feedplanet.freedesktop.org

Bastien Nocera: The tools of libfprint

libfprint, the fingerprint reader driver library, is nearing a 1.0 release.

Since the last time I reported on the status of the library, we've made some headway modernising the library, using a variety of different tools. Let's go through them and how they were used.

Callcatcher

When libfprint was in its infancy, Daniel Drake found the NBIS fingerprint processing library matched what was required to provide fingerprint matching algorithms, and imported it in libfprint. Since then, the code in this copy-paste library in libfprint stayed the same. When updating it to the latest available version (from 2015 rather than 2007), as well as splitting off a patch to make it easier to update the library again in the future, I used Callcatcher to cull the unused functions.

Callcatcher is not a "production-level" tool (too many false positives, lack of support for many common architectures, etc.), but coupled with manual checking, it allowed us to greatly reduce the number of functions in our copy, so they weren't reported when using other source code quality checking tools.

LLVM's scan-build

This is a particularly easy one to use as its use is integrated into meson, and available through ninja scan-build. The output of the tool, whether on stderr, or on the HTML pages, is pretty similar to Coverity's, but the tool is free, and easily integrated into a CI (once you've fixed all the bugs, obviously). We found plenty of possible memory leaks and unintialised variables using this, with more flexibility than using Coverity's web interface, and avoiding going through hoops when using its "source code check as a service" model.

cflow and callgraph

LLVM has another tool, called callgraph. It's not yet integrated into meson, which was a bit of a problem to get some output out of it. But combined with cflow, we used it to find where certain functions were called, trying to find the origin of some variables (whether they were internal or device-provided for example), which helped with implementing additional guards and assertions in some parts of the library, in particular inside the NBIS sub-directory.

0.99.0 is out

We're not yet completely done with the first pass at modernising libfprint and its ecosystem, but we released an early Yule present with version 0.99.0. It will be integrated into Fedora after the holidays if the early testing goes according to plan.

We also expect a great deal from our internal driver API reference. If you have a fingerprint reader that's unsupported, contact your laptop manufacturer about them providing a Linux driver for it and point them at this documentation.

A number of laptop vendors are already asking their OEM manufacturers to provide drivers to be merged upstream, but a little nudge probably won't hurt.

Happy holidays to you all, and see you for some more interesting features in the new year.

14 Dec 2018 4:04pm GMT

12 Dec 2018

feedplanet.freedesktop.org

Peter Hutterer: High resolution wheel scrolling on Linux v4.21

Disclaimer: this is pending for v4.21 and thus not yet in any kernel release.

Most wheel mice have a physical feature to stop the wheel from spinning freely. That feature is called detents, notches, wheel clicks, stops, or something like that. On your average mouse that is 24 wheel clicks per full rotation, resulting in the wheel rotating by 15 degrees before its motion is arrested. On some other mice that angle is 18 degrees, so you get 20 clicks per full rotation.

Of course, the world wouldn't be complete without fancy hardware features. Over the last 10 or so years devices have added free-wheeling scroll wheels or scroll wheels without distinct stops. In many cases wheel behaviour can be configured on the device, e.g. with Logitech's HID++ protocol. A few weeks back, Harry Cutts from the chromium team sent patches to enable Logitech high-resolution wheel scrolling in the kernel. Succinctly, these patches added another axis next to the existing REL_WHEEL named REL_WHEEL_HI_RES. Where available, the latter axis would provide finer-grained scroll information than the click-by-click REL_WHEEL. At the same time I accidentally stumbled across the documentation for the HID Resolution Multiplier Feature. A few patch revisions later and we now have everything queued up for v4.21. Below is a summary of the new behaviour.

The kernel will continue to provide REL_WHEEL as axis for "wheel clicks", just as before. This axis provides the logical wheel clicks, (almost) nothing changes here. In addition, a REL_WHEEL_HI_RES axis is available which allows for finer-grained resolution. On this axis, the magic value 120 represents one logical traditional wheel click but a device may send a fraction of 120 for a smaller motion. Userspace can either accumulate the values until it hits a full 120 for one wheel click or it can scroll by a few pixels on each event for a smoother experience. The same principle is applied to REL_HWHEEL and REL_HWHEEL_HI_RES for horizontal scroll wheels (which these days is just tilting the wheel). The REL_WHEEL axis is now emulated by the kernel and simply sent out whenever we have accumulated 120.

Important to note: REL_WHEEL and REL_HWHEEL are now legacy axes and should be ignored by code handling the respective high-resolution version.

The magic value of 120 is taken directly from Windows. That value was chosen because it has a good number of integer factors, so dividing 120 by whatever multiplier the mouse uses gives you a integer fraction of 120. And because HW manufacturers want it to work on Windows, we can rely on them doing it right, provided we use the same approach.

There are two implementations that matter. Harry's patches enable the high-resolution scrolling on Logitech mice which seem to mostly have a multiplier of 8 (i.e. REL_WHEEL_HI_RES will send eight events with a value of 15 before REL_WHEEL sends 1 click). There are some interesting side-effects with e.g. the MX Anywhere 2S. In high-resolution mode with a multiplier of 8, a single wheel movement does not always give us 8 events, the firmware does its own magic here. So we have some emulation code in place with the goal of making the REL_WHEEL event happen on the mid-point of a wheel click motion. The exact point can shift a bit when the device sends 7 events instead of 8 so we have a few extra bits in place to reset after timeouts and direction changes to make sure the wheel behaviour is as consistent as possible.

The second implementation is for the generic HID protocol. This was all added for Windows Vista, so we're only about a decade behind here. Microsoft got the Resolution Multiplier feature into the official HID documentation (possibly in the hope that other HW manufacturers implement it which afaict didn't happen). This feature effectively provides a fixed value multiplier that the device applies in hardware when enabled. It's basically the same as the Logitech one except it's set through a HID feature instead of a vendor-specific protocol. On the devices tested so far (all Microsoft mice because no-one else seems to implement this) the multipliers vary a bit, ranging from 4 to 12. And the exact behaviour varies too. One mouse behaves correctly (Microsoft Comfort Optical Mouse 3000) and sends more events than before. Other mice just send the multiplied value instead of the normal value, so nothing really changes. And at least one mouse (Microsoft Sculpt Ergonomic) sends the tilt-wheel values more frequently and with a higher value. So instead of one event with value 1 every X ms, we now get an event with value 3 every X/4 ms. The mice tested do not drop events like the Logitech mice do, so we don't need fancy emulation code here. Either way, we map this into the 120 range correctly now, so userspace gets to benefit.

As mentioned above, the Resolution Multiplier HID feature was introduced for Windows Vista which is... not the most recent release. I have a strong suspicion that Microsoft dumped this feature as well, the most recent set of mice I have access to don't provide the feature anymore (they have vendor-private protocols that we don't know about instead). So the takeaway for all this is: if you have a Logitech mouse, you'll get higher-resolution scrolling on v4.21. If you have a Microsoft mouse a few years old, you may get high-resolution wheel scrolling if the device supports it. Any other vendor or a new Microsoft mouse, you don't get it.

Coincidentally, if you know anyone at Microsoft who can provide me with the specs for their custom protocol, I'd appreciate it. We'd love to have support for it both in libratbag and in the kernel. Or any other vendor, come to think of it.

12 Dec 2018 4:27am GMT

10 Dec 2018

feedplanet.freedesktop.org

Eric Anholt: 2018-12-10

For V3D last week, I resurrected my old GLES 3.1 series with SSBO and shader imgae support, rebuilt it for V3D 4.1 (shader images no longer need manual tiling), and wrote indirect draw support and started on compute shaders. As of this weekend, dEQP-GLES31 is passing 1387/1567 of tests with "compute" in the name on the simulator. I have a fix needed for barrier(), then it's time to build the kernel interface. In the process, I ended up fixing several job flushing bugs, plugging memory leaks, improving our shader disassembly debug dumps, and reducing memory consumption and CPU overhead.

The TFU series is now completely merged, and the kernel cache management series from last week is now also merged. I also fixed a bug in the core GPU scheduler that would cause use-after-frees when tracing.

For vc4, Boris completed and merged the H/V flipping work after fixing display of flipped/offset SAND images. More progress has been made on chamelium testing, so we're nearing having automated regression tests for parts of our display stack. My PM driver has the acks we needed, but my co-maintainer has tacked a new request on in v3 so it'll be delayed.

10 Dec 2018 12:30am GMT

05 Dec 2018

feedplanet.freedesktop.org

Samuel Iglesias: VK_KHR_shader_float_controls and Mesa support

Khronos Group has published two new extensions for Vulkan: VK_KHR_shader_float16_int8 and VK_KHR_shader_float_controls. In this post, I will talk about VK_KHR_shader_float_controls, which is the extension I have been implementing on Anvil driver, the open-source Intel Vulkan driver, as part of my job at Igalia. For information about VK_KHR_shader_float16_int8 and its implementation in Mesa, you can read Iago's blogpost.

The Vulkan Working Group has defined a new extension VK_KHR_shader_float_controls, which allows applications to query and override the implementation's default floating point behavior for rounding modes, denormals, signed zero and infinity. From the Vulkan application developer perspective, VK_shader_float_controls defines a new structure called VkPhysicalDeviceFloatControlsPropertiesKHR where the drivers expose the supported capabilities such as the rounding modes for each floating point data type, how the denormals are expected to be handled by the hardware (either flush to zero or preserve their bits) and if the value is a signed zero, infinity and NaN, whether it will preserve their bits.

typedef struct VkPhysicalDeviceFloatControlsPropertiesKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           separateDenormSettings;
    VkBool32           separateRoundingModeSettings;
    VkBool32           shaderSignedZeroInfNanPreserveFloat16;
    VkBool32           shaderSignedZeroInfNanPreserveFloat32;
    VkBool32           shaderSignedZeroInfNanPreserveFloat64;
    VkBool32           shaderDenormPreserveFloat16;
    VkBool32           shaderDenormPreserveFloat32;
    VkBool32           shaderDenormPreserveFloat64;
    VkBool32           shaderDenormFlushToZeroFloat16;
    VkBool32           shaderDenormFlushToZeroFloat32;
    VkBool32           shaderDenormFlushToZeroFloat64;
    VkBool32           shaderRoundingModeRTEFloat16;
    VkBool32           shaderRoundingModeRTEFloat32;
    VkBool32           shaderRoundingModeRTEFloat64;
    VkBool32           shaderRoundingModeRTZFloat16;
    VkBool32           shaderRoundingModeRTZFloat32;
    VkBool32           shaderRoundingModeRTZFloat64;
} VkPhysicalDeviceFloatControlsPropertiesKHR;

This structure will be filled by the driver when calling vkGetPhysicalDeviceProperties2(), with a pointer to such structure as one of the pNext pointers of VkPhysicalDeviceProperties2 structure. With that, we know if the driver will support the SPIR-V capabilities we want to use in our shaders, if separate*Settings are true, remember to check the value of the property for the floating point bit-size types you are planning to work with.

The required bits to enable such capabilities in a SPIR-V shader are the following:

  1. Enable the extension: OpExtension "SPV_KHR_float_controls"
  2. Enable the desired capability. For example: OpCapability DenormFlushToZero
  3. Specify where to apply it. For example, we would like to flush to zero all fp64 denormalss in the %main function of a shader: OpExecutionMode %main DenormFlushToZero 64. If we want to apply different modes, we would repeat that line with the needed ones.
  4. Profit!

I implemented the support of this extensions for the Anvil's supported GPUs (Broadwell, Skylake, Kabylake and newer), although we don't support all the capabilities. For example on Broadwell, float16 denormals are not supported, and the support for flushing to zero the float16 denormals is not supported for all the instructions in the rest of generations.

If you are interested, the patches are now under review :-) As there are not real world code using this feature yet, please fill any bug you find about this in our bugzilla.

05 Dec 2018 3:57pm GMT

04 Dec 2018

feedplanet.freedesktop.org

Iago Toral: VK_KHR_shader_float16_int8 on Anvil

The last time I talked about my driver work was to announce the implementation of the shaderInt16 feature for the Anvil Vulkan driver back in May, and since then I have been working on VK_KHR_shader_float16_int8, a new Vulkan extension recently announced by the Khronos group, for which I have just posted initial patches in mesa-dev supporting Broadwell and later Intel platforms.

As you probably guessed by the name, this extension enables Vulkan to consume SPIR-V shaders that use of Float16 and Int8 types in arithmetic operations, extending the functionality included with VK_KHR_16bit_storage and VK_KHR_8bit_storage, which was limited to load/store operations. In theory, applications that do not need the range and precision of regular 32-bit floating point and integers, can use these new types to improve performance by increasing ALU throughput and reducing register pressure, which in some platforms can also lead to improved parallelism.

In the case of the Intel platforms initial testing done by Intel suggests that better ALU throughput is expected when issuing half-float instructions. Lower register pressure is also expected, at least for SIMD16 fragment and compute shaders, where we can pack all 16-channels worth of half-float data into a single GPU register, which could significantly improve performance for shaders that would otherwise need to spill registers to memory.

Another neat thing is that while VK_KHR_shader_float16_int8 is a Vulkan extension, its implementation is mostly API agnostic, so most of the work we did here should also help us have a proper mediump implementation for GLSL ES shaders in the future.

There are a few caveats to consider as well though: on some hardware platforms smaller bit-sizes have certain hardware restrictions that may lead to emitting worse shader code in some scenarios, and generally, Mesa's compiler infrastructure (and the Intel compiler backend in particular) have a long history of being 32-bit only, so there are parts of the compiler stack that still work better for 32-bit code.

Because VK_KHR_shader_float16_int8 is a brand new feature, we don't really have any real world use cases yet. This is on top of the fact that Mesa's compiler backends have been mostly (or exclusively) 32-bit aware until now (and more recently 64-bit too), so going forward I would expect a lot of focus on making our compiler be as robust (and optimal) for 16-bit code as it is for 32-bit code.

While we are already aware of a few areas where we can do better and I am currently working on addressing a few of these, one of the major limiting factors we have at the moment is the fact that the only source of 16-bit shaders available to us is the Khronos CTS, which due to its particular motivation, is very different from real world shader workloads and it is not a valid source material to drive compiler optimization work. Unfortunately, it might take some time until we start seeing applications using these new features, so in the meantime we will need to find other ways to drive further work in this area, and I think our best option here might be GLSL ES's mediump and lowp qualifiers.

GLSL ES mediump and lowp qualifiers have been around for a long time but they are only defined as hints to the shader compiler that lower precision is acceptable and we have never really used them to emit half-float code. Thankfully, Topi Pohjolainen from Intel has been working on this for a while, which would open up a much better scenario for improving our 16-bit compiler paths, so this is something I am really looking forward to.

Finally, as I say above, we could could definitely use more testing and feedback from real world use cases, so if you decide to use this feature in your next project and you hit any bugs, please be sure to file them in Bugzilla so we can continue to improve our implementation.

04 Dec 2018 8:25am GMT

03 Dec 2018

feedplanet.freedesktop.org

Robert Foss: Running Docker privileged inside of LXC / LXD

The architecture is a bit of container matroska, but what we're trying to achieve is running Docker privileged inside of a LXC container on a baremetal host.

Alt text

Setup container on LXC Host

In order to give Docker in the guest privileges, the guest container itself has to be given privileges.

There is no simple switch for doing this in LXC unfortunately, but a few config options will do the trick.

lxc launch images:ubuntu/bionic container

lxc config set container security.nesting true
lxc config set container security.privileged true
cat <<EOT | lxc config set container raw.lxc -
lxc.cgroup.devices.allow = a
lxc.cap.drop =
EOT

lxc restart container

Setup docker on container

Just to verify that this works, start a privileged Docker container …

03 Dec 2018 6:00pm GMT

Peter Hutterer: ggkbdd is a generic gaming keyboard daemon

Last week while reviewing a patch I read that some gaming keyboards have two modes - keyboard mode and gaming mode. When in gaming mode, the keys send out pre-recorded macros when pressed. Presumably (I am not a gamer) this is to record keyboard shortcuts to have quicker access to various functionalities. The macros are stored in the hardware and are thus relatively independent of the host system. Pprovided you have access to the custom protocol, which you probably don't when you're on Linux. But I digress.

I reckoned this could be done in software and work with any 5 dollar USB keyboard. A few hours later, I have this working now: ggkbdd. It sits directly above the kernel and waits for key events. Once the 'mode key' is hit, the keyboard will send pre-configured key sequences for the respective keys. Hitting the mode key again (or ESC) switches back to normal mode.

There's a lot of functionality that is missing such as integration with the desktop (probably via DBus), better security (dropping privs, masking the fd to avoid accidental key logging), better system integration (request fds from logind, possibly through the compositor). And error handling, etc. I think the total time on this spent is somewhere between 3 and 4h, and that includes the time to write this blog post and debug the systemd unit autostartup. There are likely other projects that solve it the same way, or at least in a similar manner. I didn't check.

This was done as proof-of-concept and

In the grand glorious future and provided this is indeed something generally useful, this would need compositor integration. Not sure we'll ever get to that point. Meanwhile, consider this a code drop for a proof-of-concept and expect that you'll have to fix any bugs yourself.

03 Dec 2018 5:43am GMT

Dave Airlie (blogspot): Open source compute stack talk from Linux Plumbers Conference 2018

I spoke at Linux Plumbers Conference 2018 in Vancouver a few weeks ago, about CUDA and the state of open source compute stacks.

The video is now available.

https://www.youtube.com/watch?v=d94N2Lu4x9s


03 Dec 2018 1:43am GMT