27 May 2016


Julien Danjou: Gnocchi talk at the Paris Monitoring Meetup #6

Last week was the sixth edition of the Paris Monitoring Meetup, where I was invited as a speaker to present and talk about Gnocchi.

There were around 50 people in the room, listening to my presentation of Gnocchi.

The talk went fine and I had a few interesting questions and feedback. One interesting point that keeps coming up when talking about Gnocchi is its OpenStack label, which scares away a lot of people. We definitely need to keep explaining that the project works stand-alone and has no dependency on OpenStack, just great integration with it.

The slides are available online for those who are interested and may not have been present that day!

The Monitoring-fr organization also interviewed me after the meetup about Gnocchi. The interview is in French, obviously. I talk about Gnocchi, what it does, how it does it and why we started the project a couple of years ago. Enjoy, and let me know what you think!

27 May 2016 1:54pm GMT

25 May 2016


Olivier Crête: GStreamer Spring Hackfest 2016

After missing the last few GStreamer hackfests I finally managed to attend this time. It was held in Thessaloniki, Greece's second-largest city. The city is located by the seaside, and the entire hackfest and related activities were either directly by the sea or just a couple of blocks away.

Collabora was very well represented, with Nicolas, Mathieu, Lubosz also attending.

Nicolas concentrated his efforts on making kmssink and v4l2dec work together to provide zero-copy decoding and display on an Exynos 4 board without a compositor or other form of display manager. Expect a blog post soon explaining how to make this all fit together.

Lubosz showed off his VR kit. He implemented a viewer for planar point clouds acquired from a Kinect. He's working on a set of GStreamer plugins to play back spherical videos. He's also promised to blog about all this soon!

Mathieu started the hackfest by investigating the intricacies of Albanian customs, then arrived on the second day in Thessaloniki and hacked on hotdoc, his new fancy documentation generation tool. He'll also be posting a blog about it, however in the meantime you can read more about it here.

As for myself, I took the opportunity to fix a couple of GStreamer bugs that really annoyed me. First, I looked into bug #766422: why glvideomixer and compositor didn't work with RTSP sources. Then I tried to add a ->set_caps() virtual function to GstAggregator, but it turned out I first needed to delay all serialized events to the output thread to get predictable outcomes, and that was trickier than expected. Finally, I got distracted by a bee and decided to start porting the contents of docs.gstreamer.com to Markdown and updating it to the GStreamer 1.0 API so we can finally retire the old GStreamer.com website.

I'd also like to thank Sebastian and Vivia for organising the hackfest and for making us all feel welcome!

GStreamer Hackfest Venue

25 May 2016 8:43pm GMT

Bastien Nocera: Blog backlog, Post 3, DisplayLink-based USB3 graphics support for Fedora

Last year, after DisplayLink released the first version of the supporting tools for their USB3 chipsets, I tried it out on my Dell S2340T.

As I wanted a clean way to test new versions, I took Eric Nothen's RPMs and kept them updated as newer versions were released, automating the creation of 32- and 64-bit x86 packages.

The RPM contains three parts: evdi, a GPLv2 kernel module that creates a virtual display; the LGPL library to access it; and a proprietary service which comes with "firmware" files.

Eric's initial RPMs used the precompiled libevdi.so, and proprietary bits, compiling only the kernel module with dkms when needed. I changed this, compiling the library from the upstream repository, using the minimal amount of pre-compiled binaries.

This package supports quite a few OEM devices, but it does not work correctly with Wayland, so you'll need to disable Wayland support in /etc/gdm/custom.conf if you want it to work at the login screen and to avoid having to restart the displaylink.service systemd service after logging in.
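For reference, disabling Wayland in GDM is a one-line change: in the [daemon] section of /etc/gdm/custom.conf, uncomment (or add) the WaylandEnable key:

```ini
# /etc/gdm/custom.conf
[daemon]
# Force the login screen (and sessions launched from it) to use Xorg
WaylandEnable=false
```

GDM picks the change up on its next restart.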

Plugged in via DisplayPort and USB (but I can only see one at a time)

The sources for the RPM are on GitHub. Simply clone the repository and run make to create 32-bit and 64-bit RPMs. The proprietary parts are redistributable, so if somebody wants to host and maintain those RPMs, I'd be glad to pass this on.

25 May 2016 4:18pm GMT

19 May 2016


Nicolai Hähnle: A little 5-to-8-bit mystery

Writing the accelerated glReadPixels path for reads to PBOs for Gallium, I wanted to make sure the various possible format conversions are working correctly. They do, but I noticed something strange: when reading from a GL_RGB565 framebuffer to GL_UNSIGNED_BYTE, I was getting tiny differences in the results depending on the code path that was taken. What was going on?

Color values are conceptually floating point values, but most of the time, so-called normalized formats are used to store the values in memory. In fact, many probably think of color values as 8-bit normalized values by default, because of the way many graphics programs present color values and because of the #cccccc color format of HTML.

Normalized formats generalize this well-known notion to an arbitrary number of bits. Given a normalized integer value x in N bits, the corresponding floating point value is x / (2**N - 1) - for example, x / 255 for 8 bits and x / 31 for 5 bits. When converting between normalized formats with different bit depths, the values cannot be mapped perfectly. For example, since 255 and 31 are coprime, the only floating point values representable exactly in both 5- and 8-bit channels are 0.0 and 1.0.

So some imprecision is unavoidable, but why was I getting different values in different code paths?

It turns out that the non-PBO path first blits the requested framebuffer region to a staging texture, from where the result is then memcpy()d to the user's buffer. It is the GPU that takes care of the copy from VRAM, the de-tiling of the framebuffer, and the format conversion. The blit uses the normal 3D pipeline with a simple fragment shader that reads from the "framebuffer" (which is really bound as a texture during the blit) and writes to the staging texture (which is bound as the framebuffer).

Normally, fragment shaders operate on 32-bit floating point numbers. However, Radeon hardware allows an optimization where color values are exported from the shader to the CB hardware unit as 16-bit half-precision floating point numbers when the framebuffer does not require the full floating point precision. This is useful because it reduces the bandwidth required for shader exports and allows more shader waves to be in flight simultaneously, because less memory is reserved for the exports.

And it turns out that the value 20 in a 5-bit color channel, when first converted into half-float (fp16) format, becomes 164 in an 8-bit color channel, even though the 8-bit color value that is closest to the floating point number represented by 20 in 5-bit is actually 165. The temporary conversion to fp16 cuts off a bit that would make the difference.

Intrigued, I wrote a little script to see how often this happens. It turns out that 20 in a 5-bit channel and 32 in a 6-bit channel are the only cases where the temporary conversion to fp16 leads to the resulting 8-bit value being off by one. Luckily, people don't usually use GL_RGB565 framebuffers... and as a general rule, taking a value from an N-bit channel, converting it to fp16, and then storing the value again in an N-bit value (of the same bit depth!) will always result in what we started out with, as long as N <= 11 (figuring out why is an exercise left to the reader ;-)) - so the use cases we really care about are fine.
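The script itself isn't included in the post, but the experiment is easy to reproduce. Here is a small Python sketch (my own reconstruction, not the author's script) that checks, for every value of an N-bit normalized channel, whether routing the conversion to 8 bits through fp16 changes the result. Python's struct module can round a float to IEEE half precision via the 'e' format, which stands in for the GPU's conversion here (the hardware's exact rounding mode may differ slightly):

```python
import struct

def to_fp16(x):
    """Round a float to the nearest half-precision (fp16) value and back."""
    return struct.unpack('e', struct.pack('e', x))[0]

def off_by_one_values(src_bits, dst_bits=8):
    """N-bit values whose 8-bit conversion differs when routed through fp16."""
    smax = (1 << src_bits) - 1
    dmax = (1 << dst_bits) - 1
    return [x for x in range(smax + 1)
            if round(x * dmax / smax) != round(to_fp16(x / smax) * dmax)]

print(off_by_one_values(5))  # [20] -- the only affected 5-bit channel value
print(off_by_one_values(6))  # [32] -- the only affected 6-bit channel value

# The N-bit -> fp16 -> N-bit round trip is lossless for N <= 11, since
# fp16's 11-bit significand keeps the relative error below 2**-12:
for n in range(1, 12):
    nmax = (1 << n) - 1
    assert all(round(to_fp16(x / nmax) * nmax) == x for x in range(nmax + 1))
```

For the value 20 in 5 bits, 20/31 ≈ 0.645161 rounds to the fp16 value 0.645020, which scales to 164.48 and rounds down to 164 instead of 165, exactly as described above.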

19 May 2016 4:18am GMT

13 May 2016


Bastien Nocera: Blutella, a Bluetooth speaker receiver

Quite some time ago, I was asked for a way to use the AV amplifier (which has a fair bunch of speakers connected to it) in our living-room that didn't require turning on the TV to choose a source.

I decided to try and solve this problem myself, as an exercise rather than a cost saving measure (there are good-quality Bluetooth receivers available for between 15 and 20€).

Introducing Blutella

I found this pot of Nutella in my travels (in Europe, smaller quantities are usually in a jar that looks like a mustard glass, with straight sides) and thought it would be a perfect receptacle for a CHIP, to allow streaming via Bluetooth to the amp. I wanted to make a nice how-to for you, dear reader, but best laid plans...

First, the materials:

That's around 10€ in parts (cables always seem to be expensive), not including our salvaged Nutella jar, and the CHIP itself ($9 + shipping).

You'll start by painting the whole of the jar, on the inside, with the acrylic paint. Allow a couple of days to dry, it'll be quite thick.

So, the plan that went awry. Turns out that the CHIP, with the cables plugged in, doesn't fit inside this 140g jar of Nutella. I also didn't make the holes exactly in the right place. The CHIP is tiny, but not small enough to rotate inside the jar without hitting the sides, and the groove to screw the cap on has only one position.

Anyway, I pierced two holes in the lid for the audio jack and the USB charging cable, stuffed the CHIP inside, and forced the lid on so it clipped on the jar's groove.

I had nice photos with foam I cut to hold the CHIP in place, but the finish isn't quite up to my standards. I guess that means I can attempt this again with a bigger jar ;)

The software

After flashing the CHIP with Debian, I logged in, and launched a script which I put together to avoid either long how-tos, or errors when I tried to reproduce the setup after a firmware update and reset.

The script for setting things up is in the CHIP-bluetooth-speaker repository. There are a few bugs due to drivers, and lack of integration, but this blog is the wrong place to track them, so check out the issues list.

Apart from those driver problems, I found the integration between PulseAudio and BlueZ pretty impressive, though I wish there were a way for the speaker to reconnect to the phone it last streamed from when turned on again, as Bluetooth speakers and headsets do, removing one step from playing back audio.

13 May 2016 4:30pm GMT

12 May 2016


Christian Schaller: H264 in Fedora Workstation

So after a lot of work to put the policies and pieces in place, we are now giving Fedora users access to the OpenH264 plugin from Cisco.
Dennis Gilmore posted a nice blog entry explaining how you can install OpenH264 in Fedora 24.

That said, the plugin is of limited use today for a variety of reasons. The first is that the plugin only supports the Baseline profile. For those not intimately familiar with H264 profiles: they are basically a way to define subsets of the codec. As you might guess from the name, the Baseline profile is pretty much at the bottom of the H264 profile list, and thus any file encoded with another H264 profile will not work with it. The profile you need for most online videos is the High profile. If you encode a file using OpenH264, though, it will work with any decoder that can do Baseline or higher, which is basically every one of them. And some things do use H264 Baseline, like WebRTC.

But we realize that to make this a truly useful addition for our users we need to improve the profile support in OpenH264. Luckily, we have Wim Taymans looking at the issue, and he will work with Cisco engineers to widen the range of profiles supported.

Of course just adding H264 doesn't solve the codec issue, and we are looking at ways to bring even more codecs to Fedora Workstation. Of course there is a limit to what we can do there, but I do think we will have some announcements this year that will bring us a lot closer and long term I am confident that efforts like Alliance for Open Media will provide us a path for a future dominated by royalty free media formats.

But for now thanks to everyone involved from Cisco, Fedora Release Engineering and the Workstation Working Group for helping to make this happen.

12 May 2016 2:30pm GMT

11 May 2016


Lennart Poettering: CfP is now open

The systemd.conf 2016 Call for Participation is Now Open!

We'd like to invite presentation and workshop proposals for systemd.conf 2016!

The conference will consist of three parts:

We are now accepting submissions for the first three days: proposals for workshops, training sessions and regular talks. In particular, we are looking for sessions including, but not limited to, the following topics:

Please submit your proposals by August 1st, 2016. Notification of acceptance will be sent out 1-2 weeks later.

If submitting a workshop proposal please contact the organizers for more details.

To submit a talk, please visit our CfP submission page.

For further information on systemd.conf 2016, please visit our conference web site.

11 May 2016 10:00pm GMT

Daniel Vetter: Neat drm/i915 Stuff for 4.7

The 4.6 release is almost out the door, so it's time to look at what's in store for 4.7.
Let's first look at the epic saga called atomic support. In 4.7, the atomic watermark update support for Ironlake through Broadwell from Matt Roper, Ville Syrjälä and others finally landed. It took about three attempts to get merged, because there were lots of small corner cases that caused regressions each time around, but it's finally done. And it's an absolutely key piece for atomic support, since Intel hardware does not support atomic updates of the watermark settings for the display fetch fifos, and if those values are wrong, tearing and other ugly things will result. We still need corresponding support for other platforms, but this is a really big step. That's not the only atomic work, though: Maarten Lankhorst made the hardware state checker atomic, and there have been tons of smaller things all over to move the driver towards the shiny new atomic world.

Another big feature on the display side is color management, implemented by Lionel Landwerlin, with fixes from Maarten to make it fully atomic. Color management aims for more accurate reproduction of a well-defined color space on panels, using a de-gamma table, then a color matrix, and finally a gamma table.

For platform enabling the big thing is support for DSI panels on Broxton from Jani Nikula and Ramalingam C. One fallout from this effort is the cleaned up VBT parsing code, done by Jani. There's now a clean split between parsing the different VBT versions on all the various platforms, now neatly consolidated, and using that information in different places within the driver. Ville also hooked up upscaling/panel fitting for DSI panels on all platforms.

Looking more at driver internals, Ander Conselvan de Oliviera and Ville refactored the entire display PLL code on all platforms, with the goal of reusing it in the DP detection code for upfront link training. This is needed to detect the link configuration in certain situations, like USB Type-C connectors. Shubhangi Shrivastava reworked the DP detection code itself, again to prep for these features. Still on pure display topics, Ville fixed lots of underrun issues to appease our CI on lots of platforms. Together with the atomic watermark updates, this should shut up one of the largest sources of noise in our test results.

Moving on to power management work the big thing is lots of small fixes for the runtime PM support all over the place from Imre Deak and Ville, with a big focus on the Broxton platform. And while we talk features affecting the entire driver: Imre added fault injection to the driver load paths so that we can start to exercise all that code in an automated way.

Finally, looking at the render/GEM side of the driver, the short summary is that Tvrtko Ursulin and Chris Wilson worked the code over all over the place: cleaned-up and tuned forcewake handling code from Tvrtko, fixes for more userptr corner cases from Chris, a new notifier to handle vmap exhaustion and assorted polish in the related shrinker code, cleaned-up and fixed handling of GPU reset corner cases, fixes for context-related hard hangs on Sandybridge and Ironlake, large-scale renaming of parameters and structures to realign old code with the newish execlist hardware mode; the list goes on. And finally, a rather big piece, and one which caused some trouble, is all the work to speed up the execlist code, with a big focus on reducing interrupt handling overhead. This was done by moving the expensive parts of execlist interrupt handling into a tasklet. Unfortunately that uncovered some bugs in our interrupt handling on Braswell, so Ville jumped in and fixed it all up, plus of course removed some cruft and applied some nice polish.

Other work in the GT area includes GPU hang fixes for Skylake GT3 and GT4 configurations from Mika Kuoppala. Mika also provided patches to improve the edram handling on those same chips. Alex Dai and Dave Gordon kept working on making GuC ready for prime time, though it's not quite there yet. And Peter Antoine improved the MOCS support to work on all engines.

And of course there's been tons of smaller improvements, bugfixes, cleanups and refactorings all over the place, as usual.

11 May 2016 3:02pm GMT

09 May 2016


Bastien Nocera: Blog backlog, Post 2, xdg-app bundles

I recently worked on creating an xdg-app bundle for GNOME Videos, aka Totem, so it would be built along with other GNOME applications, every night, and made available via the GNOME xdg-app repositories.

There's some functionality that's not working yet though:

However, I created a bundle that extends the freedesktop runtime, that contains gst-libav. We'll need to figure out a way to distribute it in a way that doesn't cause problems for US hosts.

As we also have a recurring problem in Fedora with rpmfusion being out of date, and I sometimes need a third-party movie player to test things out, I put together an mpv manifest; mpv is the only MPlayer-like player that ships a .desktop file and shows a GUI when launched without any command-line arguments.

Finally, I put together a RetroArch bundle for research into a future project, which uncovered the lack of joystick/joypad support in the xdg-app sandbox.

Hopefully, those few manifests will be useful to other application developers wanting to distribute their applications themselves. There are some other bundles being worked on, and that can be used as examples, linked to in the Wiki.

09 May 2016 4:15pm GMT

06 May 2016


Matthias Klumpp: Adventures in D programming

I recently wrote a bigger project in the D programming language, the appstream-generator (asgen). Since I rarely leave the C/C++/Python realm and came to like many aspects of D, I thought blogging about my experience could be useful for people considering using D.

Disclaimer: I am not an expert on programming language design, and this is not universally valid criticism of D - just my personal opinion from building one project with it.

Why choose D in the first place?

The previous AppStream generator was written in Python, which wasn't ideal for the task for multiple reasons, most notably multiprocessing and LMDB not working well together (and multiprocessing being terrible to work with in general) and the need to reimplement some already existing C code in Python again.

So, I wanted a compiled language which would work well together with the existing C code in libappstream. Using C was an option, but my least favourite one (writing this in C would have been much more cumbersome). I looked at Go and Rust and wrote some small programs performing basic operations that I needed for asgen, to get a feeling for each language. Interfacing C code with Go was relatively hard - since libappstream is a GObject-based C library, I expected to be able to auto-generate Go bindings from the GIR, but only a few outdated projects were available which did that. Rust, on the other hand, required the most time to learn, and since I only briefly looked into it, I still can't write Rust code without having the coding reference open. I started to implement the same examples in D just for fun, as I didn't plan to use D (I was aiming at Go back then), but the language looked interesting. The D language had the huge advantage of being very familiar to me as a C/C++ programmer, while also having a rich standard library, which included great stuff like std.concurrency.Generator, std.parallelism, etc. Translating Python code into D was incredibly easy; additionally, an actively maintained gir-d-generator exists (I created a small fork anyway, to be able to directly link against the libappstream library, instead of dynamically loading it).

What is great about D?

This list is just a huge braindump of things I had on my mind at the time of writing 😉

Interfacing with C

There are multiple things which make D awesome, for example interfacing with C code - and to a limited degree with C++ code - is really easy. Also, working with functions from C in D feels natural. Take these C functions imported into D:


extern(C):

struct _mystruct {}
alias mystruct_p = _mystruct*;

mystruct_p mystruct_create ();
void mystruct_load_file (mystruct_p my, const(char) *filename);
void mystruct_free (mystruct_p my);

You can call them from D code in two ways:

auto test = mystruct_create ();
// treating "test" as function parameter
mystruct_load_file (test, "/tmp/example");
// treating the function as member of "test"
test.mystruct_load_file ("/tmp/example");
test.mystruct_free ();

This allows writing logically sane code in cases where the C functions can really be considered member functions of the struct they are acting on. This property of the language, Uniform Function Call Syntax (UFCS), is a general concept: a function which takes a string as its first parameter can also be called like a member function of that string.
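For instance, any free function whose first parameter is a string can be called as if it were a string method (a small illustration; excited is a made-up helper, not from the post):

```d
import std.stdio : writeln;

string excited (string s)
{
    return s ~ "!";
}

void main ()
{
    writeln (excited ("hello"));  // classic call syntax
    writeln ("hello".excited ()); // the same function, via UFCS
}
```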

Writing D bindings to existing C code is also really simple, and can even be automated using tools like dstep. Since D can also easily export C functions, calling D code from C is possible as well.

Getting rid of C++ "cruft"

There are many things which are bad in C++, some of which are inherited from C. D kills pretty much all of the stuff I found annoying. Some cool stuff from D is now in C++ as well, which makes this point a bit less strong, but it's still valid. E.g. getting rid of the #include preprocessor dance by using symbolic import statements makes sense, and there have IMHO been huge improvements over C++ when it comes to metaprogramming.

Incredibly powerful metaprogramming

Getting into detail about that would take way too long, but the metaprogramming abilities of D must be mentioned. You can do pretty much anything at compile time, for example compiling regular expressions to make them run faster at runtime, or mixing in additional code from string constants. The template system is also very well thought out, and it never caused me headaches the way C++ sometimes manages to do.

Built-in unit-test support

Unittesting with D is really easy: You just add one or more unittest { } blocks to your code, in which you write your tests. When running the tests, the D compiler will collect the unittest blocks and build a test application out of them.

The unittest scope is useful because you can keep the actual code and the tests close together, and it encourages writing tests and keeping them up-to-date. Additionally, D has built-in support for contract programming, which helps to further reduce bugs by validating input/output.
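A minimal sketch of what this looks like (the greet function is made up for illustration):

```d
import std.string : capitalize;

string greet (string name)
{
    return "Hello, " ~ name.capitalize ~ "!";
}

unittest
{
    // Collected and run when the code is built with `dmd -unittest`
    assert (greet ("world") == "Hello, World!");
    assert (greet ("D") == "Hello, D!");
}
```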

Safe D

While D gives you the whole power of a low-level system programming language, it also allows you to write safer code and have the compiler check for that, while still being able to use unsafe functions when needed.

Unfortunately, @safe is not the default for functions though.

Separate operators for addition and concatenation

D exclusively uses the + operator for addition, while the ~ operator is used for concatenation. This may be a personal quirk, but I love that this distinction exists. It's nice for things like addition of two vectors vs. concatenation of vectors, and it makes the whole language much more precise in its meaning.
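A quick illustration of the distinction:

```d
void main ()
{
    assert (1 + 2 == 3);                      // + is always arithmetic
    assert ("foo" ~ "bar" == "foobar");       // ~ is always concatenation
    assert ([1, 2] ~ [3, 4] == [1, 2, 3, 4]); // works on arrays, too

    // "1" + "2" is a compile-time error rather than "12" -- no
    // JavaScript-style surprises.
}
```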

Optional garbage collector

D has an optional garbage collector. Developing in D without GC is currently a bit cumbersome, but these issues are being addressed. If you can live with a GC though, having it active makes programming much easier.

Built-in documentation generator

This is almost granted for most new languages, but still something I want to mention: Ddoc is a standard tool to generate code documentation for D code, with a defined syntax for describing function parameters, classes, etc. It will even take the contents of a unittest { } scope to generate automatic examples for the usage of a function, which is pretty cool.

Scope blocks

The scope statement allows one to execute a bit of code when the function exits, when it failed, or when it was successful. This is incredibly useful when working with C code, where a free() call needs to be issued when the function is exited, or some arbitrary cleanup needs to be performed on error. Yes, we do have smart pointers in C++ and - with some GCC/Clang extensions - a similar feature in C too. But the scopes concept in D is much more powerful. See Scope Guard Statement for details.
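A sketch of the pattern when wrapping a C allocation (the function and buffer size are illustrative):

```d
import core.stdc.stdlib : malloc, free;
import std.exception : enforce;

void processBuffer ()
{
    auto buf = malloc (4096);
    enforce (buf !is null, "allocation failed");
    scope (exit) free (buf); // runs on any exit: return, throw, fall-through

    // scope (failure) would run only when leaving via an exception,
    // scope (success) only on a normal exit.

    // ... work with buf; every code path below is covered by the
    // cleanup registered above, with no manual free() per path.
}
```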

Built-in syntax for parallel programming

Working with threads is so much more fun in D compared to C! I recommend taking a look at the parallelism chapter of the "Programming in D" book.
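As a small taste (a sketch, not from the post), std.parallelism turns an ordinary foreach into a parallel one:

```d
import std.parallelism : parallel;
import std.stdio : writeln;

void main ()
{
    auto squares = new long[1000];

    // The loop body runs on the task pool's worker threads,
    // each handling a chunk of the index range.
    foreach (i, ref x; parallel (squares))
        x = cast(long) (i * i);

    writeln (squares[999]);  // 998001
}
```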

"Pure" functions

D allows marking functions as purely functional, which allows the compiler to optimize them, e.g. cache their return value. See pure-functions.
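For example (an illustrative sketch): a pure function may only read its arguments and immutable data, and the compiler enforces this:

```d
pure long sumSquares (const(long)[] values)
{
    long total = 0;
    foreach (v; values)
        total += v * v;
    return total;
    // Calling writeln() or touching mutable global state here
    // would be rejected at compile time.
}
```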

D is fast!

D matches the speed of C++ on almost all occasions, so you won't lose performance when writing D code - that is, unless you have the GC running often in a threaded environment.

Very active and friendly community

The D community is very active and friendly - so far I only had good experience, and I basically came into the community asking some tough questions regarding distro-integration and ABI stability of D. The D community is very enthusiastic about pushing D and especially the metaprogramming features of D to its limits, and consists of very knowledgeable people. Most discussion happens at the forums/newsgroups at forum.dlang.org.

What is bad about D?

Half-proprietary reference compiler

This is probably the biggest issue. Not because the proprietary compiler is bad per se, but because of the implications this has for the D ecosystem.

For the reference D compiler, Digital Mars' D (DMD), only the frontend is distributed under a free license (Boost), while the backend is proprietary. The FLOSS frontend is what the free compilers, LLVM D Compiler (LDC) and GNU D Compiler (GDC) are based on. But since DMD is the reference compiler, most features land there first, and the Phobos standard library and druntime is tuned to work with DMD first.

Since major Linux distributions can't ship DMD, and the free compilers GDC and LDC lag behind DMD in terms of language, runtime and standard-library compatibility, this creates a split world: code that compiles with LDC, GDC or DMD, but never with all D compilers, due to it relying on features not yet in e.g. GDC's Phobos.

Especially for Linux distributions, there is no way to say "use this compiler to get the best and latest D compatibility". Additionally, if people can't simply apt install latest-d, they are less likely to try the language. This is probably mainly an issue on Linux, but since Linux is the place where web applications are usually written and people are likely to try out new languages, it's really bad that the proprietary reference compiler is hurting D adoption in that way.

That being said, I want to make clear that DMD is a great compiler, which is very fast and builds efficient code. I only criticise the fact that it is the language's reference compiler.

UPDATE: To clarify the half-proprietary nature of the compiler, let me quote the D FAQ:

The front end for the dmd D compiler is open source. The back end for dmd is licensed from Symantec, and is not compatible with open-source licenses such as the GPL. Nonetheless, the complete source comes with the compiler, and all development takes place publically on github. Compilers using the DMD front end and the GCC and LLVM open source backends are also available. The runtime library is completely open source using the Boost License 1.0. The gdc and ldc D compilers are completely open sourced.

Phobos (standard library) is deprecating features too quickly

This basically goes hand in hand with the compiler issue mentioned above. Each D compiler ships its own version of Phobos, which it was tested against. For GDC, which I used to compile my code due to LDC having bugs at that time, this means that it is shipping with a very outdated copy of Phobos. Due to the rapid evolution of Phobos, this meant that the documentation of Phobos and the actual code I was working with were not always in sync, leading to many frustrating experiences.

Furthermore, Phobos is sometimes removing deprecated bits about a year after they have been deprecated. Together with the older-Phobos situation, you might find yourself in a place where a feature was dropped, but the cool replacement is not yet available. Or you are unable to import some 3rd-party code because it uses some deprecated-and-removed feature internally. Or you are unable to use other code, because it was developed with a D compiler shipping with a newer Phobos.

This is really annoying, and probably the biggest source of unhappiness I had while working with D - especially the documentation not matching the actual code is a bad experience for someone new to the language.

Incomplete free compilers with varying degrees of maturity

LDC and GDC have bugs, and for someone new to the language it's not clear which one to choose. Both have their own issues at times, but they are rapidly getting better, and I only encountered actual compiler bugs in LDC (GDC worked fine, but with an incredibly out-of-date Phobos). All the issues I hit have since been fixed, but it was a frustrating experience. Some clear advice on which of the free compilers to prefer when you are new to D would be neat.

For GDC in particular, being developed outside of the main GCC project is likely a problem, because distributors need to manually add it to their GCC packaging, instead of having it readily available. I assume this is due to the DRuntime/Phobos not being subjected to the FSF CLA, but I can't actually say anything substantial about this issue. Debian adds GDC to its GCC packaging, but e.g. Fedora does not do that.

No ABI compatibility

D has a defined ABI - too bad that in reality, the compilers are not interoperable. A binary compiled with GDC can't call a library compiled with LDC or DMD. GDC actually doesn't even support building shared libraries yet. For distributions, this is quite terrible, because it means that there must be one default D compiler, without any exception, and that users also need to use that specific compiler to link against distribution-provided D libraries. The different runtimes per compiler complicate that problem further.

The D package manager, dub, does not yet play well with distro packaging

This is an issue that is important to me, since I want my software to be easily packageable by Linux distributions. The issues causing packaging to be hard are reported as dub issue #838 and issue #839, with quite positive feedback so far, so this might soon be solved.

The GC is sometimes an issue

The garbage collector in D is quite dated (according to their own docs) and is currently being reworked. While working on asgen, which is a program creating a large amount of interconnected data structures in a threaded environment, I realized that the GC significantly slows down the application when threads are used (it also seems to use the UNIX signals SIGUSR1 and SIGUSR2 to stop/resume threads, which I still find odd). The GC also performed poorly under memory pressure, which got asgen killed by the OOM killer on some more memory-constrained machines. Triggering a manual collection run once a large batch of these interconnected data structures was no longer needed solved the problem on most systems, but it would of course be better not to have to give the GC any hints. The stop-the-world behavior isn't a problem for asgen, but it might be for other applications.

These issues are currently being worked on, with a GSoC project laying the foundation for further GC improvements.

"version" is a reserved word

Okay, that is admittedly a very tiny nitpick, but when developing an app which works with packages and versions, it's slightly annoying. The version keyword is used for conditional compilation, and needing to abbreviate it to ver in all parts of the code sucks a little (e.g. the "Package" interface can't have a property "version", but now has "ver" instead).

The ecosystem is not (yet) mature

In general it can be said that the D ecosystem, while having existed for almost 9 years, is not yet that mature. There are various quirks you have to deal with when working with D code on Linux. It's never anything major, and usually you can easily solve these issues and move on, but it's annoying to have these papercuts.

This is not something D itself can resolve; this point will solve itself as more people start to use D and D support in Linux distributions gets more polished.


I like to work with D, and I consider it to be a great language - the quirks in its toolchain are not bad enough to prevent writing great things with it.

At present, if I am not writing a shared library or something which uses a lot of existing C++ code, I would prefer D for the task. If a garbage collector is a problem (e.g. for some real-time applications, or when the target architecture can't run a GC), I would not recommend using D. Rust seems to be the much better choice then.

In any case, D's flat learning curve (for C/C++ people), paired with the smart choices taken in language design, the powerful metaprogramming, the rich standard library and the helpful community, makes it great to try out and to develop software for scenarios where you would otherwise choose C++ or Java. Quite honestly, I think D could be a great language for tasks where you would usually choose Python, Java or C++, and I am seriously considering replacing quite some Python code with D code. For very low-level stuff, C is IMHO still the better choice.

As always, choosing the right programming language is only 50% technical aspects, and 50% personal taste 😉

UPDATE: To get some idea of D, check out the D tour on the new website tour.dlang.org.

06 May 2016 6:46am GMT

04 May 2016


Peter Hutterer: The difference between uinput and evdev

A recurring question I encounter is whether uinput or evdev should be the approach to implement some feature the user cares about. This question is unfortunately framed wrongly, as uinput and evdev have no real overlap and work independently of each other. This post outlines what the differences are. Note that "evdev" here refers to the kernel API, not to the X.Org evdev driver.

First, the easy flowchart: do you have to create a new virtual device that has a set of specific capabilities? Use uinput. Do you have to read and handle events from an existing device? Use evdev. Do you have to create a device and read events from that device? You (probably) need two processes, one doing the uinput bit, one doing the evdev bit.

Ok, let's talk about the difference between evdev and uinput. evdev is the default input API that all kernel input device nodes provide. Each device provides one or more /dev/input/eventN nodes that a process can interact with. This usually means checking a few capability bits ("does this device have a left mouse button?") and reading events from the device. The events themselves are in the form of struct input_event, defined in linux/input.h, and consist of an event type (relative, absolute, key, ...) and an event code specific to the type (x axis, left button, etc.). See linux/input-event-codes.h for a list, or linux/input.h in older kernels. Specific to evdev is that events are serialised - framed by events of type EV_SYN and code SYN_REPORT. Anything before a SYN_REPORT should be considered one logical hardware event. For example, if you receive an x and y movement within the same SYN_REPORT frame, the device has moved diagonally.
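For illustration (my sketch, not from the post), here is a minimal Python reading of that framing rule: it unpacks a raw byte stream of struct input_event records, assuming the usual 64-bit layout (two 64-bit timeval fields, then a 16-bit type, a 16-bit code and a signed 32-bit value), and groups everything up to a SYN_REPORT into one logical event. Real code should use libevdev rather than hand-rolled unpacking.

```python
import struct

# Assumed struct input_event layout on 64-bit Linux: struct timeval
# (two 64-bit integers), then __u16 type, __u16 code, __s32 value.
EVENT_FMT = "=qqHHi"
EVENT_SIZE = struct.calcsize(EVENT_FMT)  # 24 bytes

EV_SYN, EV_REL = 0x00, 0x02
SYN_REPORT = 0x00
REL_X, REL_Y = 0x00, 0x01

def frames(buf):
    """Group a raw evdev byte stream into SYN_REPORT-delimited frames."""
    frame, out = [], []
    for off in range(0, len(buf), EVENT_SIZE):
        sec, usec, etype, code, value = struct.unpack_from(EVENT_FMT, buf, off)
        if etype == EV_SYN and code == SYN_REPORT:
            out.append(frame)  # everything before SYN_REPORT: one logical event
            frame = []
        else:
            frame.append((etype, code, value))
    return out

# A diagonal movement: x and y within the same SYN_REPORT frame.
stream = b"".join(struct.pack(EVENT_FMT, 0, 0, *e) for e in [
    (EV_REL, REL_X, 5), (EV_REL, REL_Y, -3), (EV_SYN, SYN_REPORT, 0),
])
print(frames(stream))  # [[(2, 0, 5), (2, 1, -3)]]
```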

Any event coming from the physical hardware goes into the kernel's input subsystem and is converted to an evdev event that is then available on the event node. That's pretty much it for evdev. It's a fairly simple API but it does have some quirks that are not immediately obvious so I recommend using libevdev whenever you actually need to communicate with a kernel device directly.

uinput is something completely different. uinput is a kernel device driver that provides the /dev/uinput node. A process can open this node, write a bunch of custom commands to it, and the kernel then creates a virtual input device. That device, like all others, presents a /dev/input/eventN node. Any event written to the /dev/uinput node will re-appear in that /dev/input/eventN node, and a device created through uinput looks pretty much just like a physical device to a process. You can detect uinput-created virtual devices, but usually a process doesn't need to care, so all the common userspace (libinput, Xorg) doesn't bother. The evemu tool is one of the most commonly used applications using uinput.

Now, there is one thing that may cause confusion: to set up a uinput device you'll have to use the familiar evdev type/code combinations (followed by a couple of uinput-specific ioctls). Events written to uinput also use the struct input_event form, so looking at uinput code one can easily mistake it for evdev code. Nevertheless, the two serve completely different purposes. As with evdev, I recommend using libevdev to initialise uinput devices. libevdev has a couple of uinput-related functions that make life easier.

Below is a basic illustration of how things work together. The physical devices send their events through the event nodes and libinput is a process that reads those events. evemu talks to the uinput module and creates a virtual device which then too sends events through its event node - for libinput to read.

04 May 2016 11:42pm GMT

Rob Clark: Freedreno (not so) periodic update

Since I seem to be not so good at finding time for blog updates recently, this update probably covers a greater timespan than it should, and some of this is already old news ;-)

Already quite some time ago, but in case you didn't already notice: with the mesa 11.1 release, freedreno now supports up to (desktop) gl3.1 on both a3xx and a4xx (in addition to gles3). Which is high enough to show up on the front page at glxinfo. (Which, btw, is a useful tool to see exactly which gl/gles extensions are supported by which version of mesa on various different hw.)

A couple months back, I spent a bit of time starting to look at performance. On master now (so will be in 11.3), we have timestamp and time-elapsed query support for a4xx, and I may expose a few more performance counters (mostly for the benefit of gallium HUD). I still need to add support for a3xx, but already this is useful to help profile. In addition, I've cobbled together a simple fdperf cmdline tool:

I also got around to (finally) implementing hw binning support for a4xx, which for *some* games can have a pretty big perf boost:
  • glmark2 'refract' bench (an extreme example): 31fps -> 124fps
  • xonotic (med): 44.4fps -> 50.3fps
  • supertuxkart (new render engine): 15fps -> 19fps
More recently I've started to run the dEQP gles3 tests against freedreno. Initially the results were not too good, but since then I've fixed a couple thousand test cases.. fortunately it was just a few bugs and a couple of missing workarounds for hw bugs/limitations (depending on how you look at it) which accounted for the bulk of the fails. Now we are at 98.9% pass (or 99.5% if you don't count the 'skips' against the pass ratio). These fixes have also helped piglit, where we are now up to 98.3% pass. These figures are for a4xx, but most of the fixes apply to a3xx as well.

I've also made some improvements in ir3 (shader compiler for a3xx and later) so the code it generates is starting to be pretty decent. The immediate->const lowering that I just pushed helps reduce register pressure in a lot of cases. We still need support for spilling, but at least now shadertoy (which is some sort of cruel joke against shader compiler writers) isn't a complete horror show:

In other cool news, in case you had not already seen: Rob Herring and John Stultz from Linaro have been doing some cool work, with Rob getting Android running on an upstream kernel plus mesa on db410c and qemu (with freedreno and virgl), and John taking all that and getting it running on a Nexus 7 tablet. (And more recently, getting wifi working as well.) I had the opportunity to see this in person when I was at Linaro Connect in March. It might not seem impressive if you are unfamiliar with the extent to which android device kernels diverge from upstream, but to see an upstream kernel running on an actual device with only ~50 patches is quite a feat:

The UI was actually reasonably fast, despite not yet using overlays to bypass the GPU for composition. But as the ongoing work in drm/kms on explicit fencing and mesa's EGL_ANDROID_native_fence_sync support lands, we should be able to get hw composition working.

04 May 2016 11:38pm GMT

Bastien Nocera: Blog backlog, Post 1, Emoji

Short version

dnf copr enable hadess/emoji
dnf update cairo
dnf install eosrei-emojione-fonts

Long version

A little while ago, I was reading this article, called "Emoji: how do you get from U+1F355 to 🍕?", which said, and I reluctantly quote: "[...] and I don't know what Linux does, but it's probably black and white and who cares [...]".
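To make the codepoint part of "from U+1F355 to 🍕" concrete (my example, not from the article): the pizza is simply the codepoint U+1F355, and everything about its colour lives in the font, not in the text itself.

```python
# U+1F355 is just a codepoint; turning it into a coloured glyph is the
# font's job (colour bitmaps, SVG-in-OTF, etc.), not the text stack's.
pizza = chr(0x1F355)
print(f"U+{ord(pizza):04X}")            # U+1F355
# It lives outside the Basic Multilingual Plane, so UTF-8 needs four
# bytes and UTF-16 a surrogate pair:
print(pizza.encode("utf-8").hex())      # f09f8d95
print(pizza.encode("utf-16-be").hex())  # d83cdf55
```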

Well. I care. And you probably do as well if your pizza slice above is black and white.

So I set out to check on the status of Behdad Esfahbod's (or just "Behdad" as we know him) patches to add colour font support to cairo, which he presented at GUADEC in Gothenburg. It adds support for "bitmap in font", as Android does and as freetype supports.

It kind of worked, and Matthias Clasen reworked the patches a few times, completing the support. This is probably not the code that will be worked on and will land in cairo, but it's a good enough base for people interested in contributing to use.

After that, we needed something to display using that feature. We ended up using the same font recommended in this article, the Emoji One font.

There's still plenty to be done to support emojis, even after the cairo support is merged. We'd need a way to input emojis (maybe Lalo Martins is listening), and support in a lot of toolkits other than GNOME's (Firefox only supports the SVG-in-OTF format; WebKit, Chrome and LibreOffice don't seem to know about colour fonts either).

You can find more information about design interests in GNOME around Emoji on the Wiki.

Update: Behdad's presentation was in Gothenburg, not Strasbourg. You can also see the video on YouTube.

04 May 2016 6:18pm GMT

Julien Danjou: The Hacker's Guide to Python 3 edition is out

Exactly a year ago, I released the second edition of my book The Hacker's Guide to Python. Once again, it has been a wonderful release and I have received a lot of amazing feedback from my readers throughout this year.

Since then, the book has been translated into 2 languages: Korean and Chinese. A few thousand copies have been distributed there, and I'm very glad the book has been such a success. I'm looking into getting it translated into more languages - don't hesitate to get in touch with me if you have any interesting connections in your country.

For those who still don't know about this guide, which I first released a couple of years ago, let me sum up by saying it's the Python book that I always wanted to read, never found, and finally wrote. It does not cover the basics of the language, but deals with concrete problems, best practices and some of the language's internals.

It includes content about unit testing, methods, decorators, AST, distribution, documentation, functional programming, scaling, Python 3, etc. All of that made it pretty successful! It comes with 9 awesome interviews that I conducted with some of my fellow experienced Python hackers and developers!

The paperback 3rd edition

The Korean edition!

In this 3rd edition, there are, like in each new edition, a few fixes to code, typos, etc. I guess books need a lot of time to become perfect! I also updated some of the content: things have evolved a bit since I last revised it a year ago. Finally, a new chapter about timestamp and timezone handling has made its appearance too.

If you didn't get the book yet, it's time to go check it out and use the coupon THGTP3LAUNCH to get 20% off during the next 48 hours!

04 May 2016 3:00pm GMT

03 May 2016


Damien Lespiau: Testing for pending migrations in Django

DB migration support has been added in Django 1.7+, superseding South. More specifically, it's possible to automatically generate migrations steps when one or more changes in the application models are detected. Definitely a nice feature!

I've written a small generic unit test that one should be able to drop into the tests directory of any Django project and that checks there are no pending migrations, i.e. that the models are correctly in sync with the migrations declared in the application. Handy to check that nobody has forgotten to git add the migration file, or that an innocent-looking change in models.py doesn't need a migration step generated. Enjoy!

See the code on djangosnippets or as a github gist!

03 May 2016 5:09pm GMT

02 May 2016


Corbin Simpson: Monte Compiler Ramble

Writing compilers is hard. I don't think that anybody disputes this. However, I've grown frustrated with the lack of compiler performance and robustness in the Monte toolchain. Monte will have a developer preview release in a few weeks and I need to get some stuff concerning compilers out of my head and onto the page.

Monte, the Mess

Right now, Monte is in the doldrums. We have deliberately wound down effort on features and exploration in order to produce a developer preview meant to pique interest and generate awareness of Monte, E, object capabilities, etc. As a result, it's worth taking stock of what we've built so far.

Monte's reference implementation is embodied by the Typhon VM, a JIT written in RPython which implements the runtime and interpreter, but does not do any parsing or expansion. Typhon is satisfactory, performing much like early CPython, and outperforming E-on-Java by a bit. However, its JIT does not provide any speed boost compared to interpretation; our execution model is far too sloppy. Additionally, the JIT is fragile and crash-prone as of writing, and we have it disabled by default.

Our current method of execution is to load Kernel-Monte, compile it to an in-memory bytecode resembling, but slightly different from, Smallcaps; and then provide user-built objects which interoperate with runtime objects and internally run this quasi-Smallcaps bytecode.

Performance is behind CPython by a serious but not insurmountable margin. This is unacceptable. One of the goals of Monte is, by virtue of being less dynamic than Python, to be faster than Python in execution. It has long been a truism of compilers that lower expressiveness correlates with greater optimization opportunities, and clearly we are missing out.

Monte, the Metalanguage

A non-trivial portion of the ideology of Monte, which I did not realize when I first embarked on this journey, is that Monte is really an object calculus of some sort; it hides beneath it a simple core language (Kernel-Monte) that turns out to be a very simple universal computer based on message-passing. Almost everything that makes Monte distinct as a language is built on top of this core, from promises and vats, through modules and quasiliterals, to the entirety of the safe scope. The only gnarl that I have found while working with this core, in honesty, is the semantics of mutable names (var x := f()), which I am still on the fence about, but overall is not a tough complication. (Specifically, I don't know whether mutable slots should be created by a virtual machine instruction, or a primitive _makeVarSlot object.)
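To illustrate the mutable-name question (an illustrative Python sketch, not Typhon code): in Monte every name is bound through a slot, and a var name differs from a final one only in that its slot accepts assignment. FinalSlot and VarSlot are real Monte runtime concepts; the classes below are my stand-ins.

```python
# Illustrative stand-ins for Monte's name semantics: every name is
# bound to a slot; "def x" gets a FinalSlot, "var x" gets a VarSlot.
class FinalSlot:
    def __init__(self, value):
        self._value = value
    def get(self):
        return self._value
    def put(self, value):
        raise RuntimeError("final slot cannot be reassigned")

class VarSlot:
    def __init__(self, value):
        self._value = value
    def get(self):
        return self._value
    def put(self, value):  # what "x := ..." expands to for a var name
        self._value = value

x = VarSlot(1)    # var x := 1
x.put(2)          # x := 2
print(x.get())    # 2
```

The open design question in the post - instruction vs. primitive _makeVarSlot object - is, in these terms, whether VarSlot(...) is invoked by a dedicated VM opcode or by an ordinary object call.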

Unfortunately, Monte's metalanguage doesn't exactly correspond to anything in the literature. Worse, it somewhat resembles many different things in the literature simultaneously, making the choice of techniques much harder. Computer science, as a discipline, has developed an amazing corpus of compiler techniques, but they do require one to already have committed to a particular semantics, and choosing the semantic and evaluation model for Monte has been a challenge.

I'm only one person. Whatever I end up using has to be comprehensible to me, because otherwise I can't build it. This is unfortunate, as I'm something of a dunce, so I would prefer it if my semantics were simple. Additionally, typing is hard and it would be nice to find something easy to implement.

As a brief aside, I want to emphasize that I am not going to talk about parsing today. Monte's parsing story is very simple and solid, and the canonical expansion from Full-Monte into Kernel-Monte is also relatively well-understood. I want to talk about the stuff that makes compilers hard to scale; I want to talk about optimizations and modelling of semantics.

When I say "semantics of Monte" today, I am referring to the bundle of concepts that represent Monte's evaluation at its lowest level. Everything I'm talking about today starts at Kernel-Monte and (hopefully) only goes downward in expressiveness.

Monte, the Schemer

Strange as it might seem to children like myself, Monte is actually descended from Scheme via E, and this manifests in E's actor-like concurrency and also in the E documentation, which discusses E as a variant of lambda calculus.

What Maps Well

After slot expansion, (set!) bears clear similarity to the behavior of mutable names with VarSlot.

The general design of lexically-scoped closures on objects, and thus the optimization patterns, appear very similar between Monte and Scheme. For example, this commit was directly inspired by this Scheme compiler optimization, posted to Lambda the Ultimate a week prior.

List patterns are present in some Schemes, like Racket, and Monte's list patterns are still present in Kernel-Monte; one of the few explicit type-checked kernel situations. (I think that the only other one is the if expression's test… We don't currently require bindings to be :Binding.)

What Maps Poorly

Exceptions are the obvious problem. (call/cc) provides undelimited continuations, but ejectors are explicitly delimited continuations. Something like Oleg's shift/reset semantics, or Racket exceptions, provide sufficient structure to recover the semantics, but the difference is clear. Oleg only outlines how things work; he does not offer hints on optimization. There is a standard suite of optimizations on continuations when using CPS (continuation-passing style); however, CPS massively complicates implementation.
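For a sense of how ejectors differ from (call/cc), here is a hedged Python sketch (my own model, not Typhon's implementation) of an escape block: the ejector is a one-shot, delimited escape continuation, modelled here by a per-activation exception caught at its delimiter.

```python
# Model of an escape block: the ejector can only unwind back to its own
# delimiter, and only while that delimiter's frame is still live.
class _Ejection(Exception):
    def __init__(self, ejector, payload):
        self.ejector, self.payload = ejector, payload

def escape(body):
    """Run body(ejector); invoking the ejector unwinds back to this frame."""
    class Ejector:
        active = True
        def __call__(self, value=None):
            if not self.active:
                raise RuntimeError("ejector used outside its dynamic extent")
            raise _Ejection(self, value)
    ej = Ejector()
    try:
        return body(ej)
    except _Ejection as e:
        if e.ejector is not ej:
            raise  # belongs to an outer escape block; keep unwinding
        return e.payload
    finally:
        ej.active = False  # one-shot: dead once the block exits

print(escape(lambda ej: 1 + 1))        # 2 (normal return)
print(escape(lambda ej: ej("early")))  # early (ejector fired)
```

Because the exception only travels up the stack and dies with its delimiter, this gives the delimited, no-stale-frames behaviour described above, with none of the reentrancy that (call/cc) would permit.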

In particular, when using CPS, all method signatures are complicated by the transformation, which means that tracebacks and other debugging information have to be munged more. We also lose our "no stale stack frames" policy, which informally states that we don't have coroutines nor generators. The CPS transformation generally generates a form of code which should be run as a coroutine, with a live (delimited) continuation passed in from the runtime. This is not impossible, but it is a drastic shift away from what I've studied and built so far.

Since Kernel-Monte is an expression language, a desugaring from def to some sort of intermediate let is highly desirable. However, I have attempted to build this algorithm thrice and been stymied every time. The corner cases are constantly recurring; even the canonical expansion is sufficient to send me into fits with some of its pathological generated code. I've concluded that this transformation, while something I dearly want, requires much more thought.

What Isn't Clear

A-normal form sounds so enticing, but I can't articulate why.

Monte, the Talker

Monte's family tree is firmly rooted in elder Smalltalk-80, and incorporates concepts from its parents Python and E, cousins Ruby and JavaScript, and of course its {famti} Java. Our family is not all speed demons, but many of them are quite skilled at getting great performance out of their semantics, and our distant cousin Lua is often on top of benchmarks. We should make sure that we're not missing any lessons from them.

What Maps Well

We can suppose that every object literal is recurrent in a scope; it has some maker, even if that maker is the top-level eval(). In that sense, the script of an object literal, combined with the closure of the object literal, is just like a description of a class in a class-based language with no inheritance. We can even add on Monte-style extends object composition by permitting subclasses; there is no this or self, but the subclasses could override superclass methods and we could use the standard method cache technique to make that fast.

We have two more layers of boxing than most other object-based languages, but that doesn't seem to really impede the otherwise-identical "pass-by-object" semantics of Monte with pretty much every other language in the family. Our JIT has definitely proven capable of seeing through FinalSlot and Binding just like it can see through Int.

What Maps Poorly

Our family tree really should have a strict line in the sand for scoping rules, because half of the family doesn't have static lexical scopes. Much of what has gone into making Python fast, especially in the fast implementations like PyPy and ZipPy, doesn't apply at all to Monte because Monte does not have dynamic scopes, and so Monte does not need to recover static scoping information in the JIT.

Our static scope, honestly, works against us somewhat. I can't help but feel that most of the quirky design in Ruby and Python bytecode is due to not being able to erase away lots of scope semantics; contrapositively, Monte is actually kind of hard to compile into lower forms precisely because the static scoping information makes manipulating terms harder. (This might just be me whining; I mean "hard" in the "lots of typing and thinking" sense.)

We really do need a "deslotification" system of some sort. I've thought about this, and come up with a couple conceptual systems that generate type information for slots and erase bindings and slots during compilation when it can prove that they're not needed. Unfortunately, I haven't actually implemented anything; this is another situation where things are hard to implement. Once again, this is relatively untrodden territory, other than the word "deslotification" itself, which comes from this E page. Interestingly, I independently came up with some of the schemes on that page, which suggests that I'm on the right track, but I also learned that this stuff was never really implemented, so maybe it's a dead end.

What Isn't Clear

Bytecode seems like a good thing. It also seems like a serious albatross, once we start picking on-disk bytecode formats. I'm not sure whether the Smallcaps construction really is the best way of modelling the actions that Monte objects take.

Paths Unpaved

There's a couple options available to us that are relatively orthogonal to what I've talked about so far.

LLVM is the elephant in the room. It is an aggressively-optimizing, competent code generator for anything that can be put into a standard low-level-typed SSA form. For Monte, LLVM would enable a compilation strategy much like Objective-C (or, I imagine, like Swift): Arrange all objects into a generated class hierarchy, prove and erase all types to get as many unboxed objects as possible, and then emit LLVM, producing a binary that runs at a modest speed.

The main trick to LLVM that I see is that it appears to dictate a semantic model, but that is only because we are looking at LLVM through its intended lens of compiling C++, from which Objective-C appears the closest relative to Monte. However, there exist LLVM-targeting compilers which emit code that looks quite alien; the example that comes to my mind is GHC's LLVM backend, which generates the same graph-reducing machine as GHC's native backend. There's no reason that we could not pursue a similar path after figuring out our semantics.

Another growing elephant is Truffle. Truffle is much like RPython, providing pretty much the same stuff, but with two important differences. First, Truffle itself is not translated in the same way as RPython; there's a complex interaction between Truffle, Graal, and the JVM which produces the desired JIT effects. RPython's complexity is mostly borne by the compiler author; the fanciest switch on the panel of a translated RPython program is the one that controls the JIT's parameters. Truffle lets you pick between multiple different JITs at runtime! This is partially due to choices made in the JVM ecosystem that make this preferable.

The second difference is worth emphasizing, just because it matters deeply to me, and I figure that it surely must resonate with other folks. Truffle is not a meta-tracing JIT like RPython, but a partially evaluating JIT. This is both a solid theoretical foundation, and a recipe for proven-correct aggressive optimizations. In benchmarks, Truffle does not disappoint. The only downside to Truffle is having to write Java in roughly the normal Java-to-Python proportions instead of RPython.

We could write pretty much anything in Truffle that we could in RPython; thus, sticking with RPython for the accumulated knowledge and experience that we have on that platform makes sense for now. A Truffle port could be done at some point, perhaps by the community.

Monte, the Frustration

I hate via patterns. But only as a compiler author. As a writer of Monte code, via is delightful. When compiling via patterns, though, one has to extract the guts of the pattern, which turns out to be a seriously tricky task in the corner cases. It's the kind of thing that even production-quality Haskell compiler libraries flinch at handling. (As a corollary, if I understood the Haskell bound package, I would be writing one compiler, in Haskell, and nothing else.)

DeepFrozen proof obligations really should be discharged at compile time whenever possible. They aren't really that expensive, but they definitely impose some running overhead. Similarly, a specializer that could discharge or desugar things like (0..!MAXSIZE) would be nice; that single expression was 20% of the runtime of the richards benchmark recently.

To be more blunt, I like partial evaluation. I would love to have a practical partial evaluator for Monte. I also don't feel that Monte's current semantics are very friendly towards partial evaluation. I really do want to lower everything into some simpler form before doing any specialization.

In Conclusion

In conclusion, I need a vacation, I think. If only there were a Python convention coming up…

02 May 2016 9:02pm GMT