20 Jan 2017
This is the write-up of my talk at LCA 2017 in Hobart. It's not exactly the same, because this is a blog and not a talk, but the same contents. The slides for the talk are here, and I will link to the video as soon as it is available.
Linux Kernel Maintainers
First let's look at how the kernel community works, and how a change gets merged into Linus Torvalds' repository. Changes are submitted as patches to a mailing list, then get some review and eventually get applied by a maintainer to that maintainer's git tree. Each maintainer then sends a pull request, often directly to Linus. With a few big subsystems (networking, graphics and ARM-SoC are the major ones) there's a second or third level of sub-maintainers in between. 80% of the patches get merged this way, only 20% are committed by a maintainer directly.
Most maintainers are just that, a single person, and often responsible for a bunch of different areas in the kernel with corresponding different git branches and repositories. To my knowledge there are only three subsystems that have embraced group maintainership models of different kinds: TIP (x86 and core kernel), ARM-SoC and the graphics subsystem (DRM).
The radical change, at least for the kernel community, that we implemented over a year ago for the Intel graphics driver is to hand out commit rights to all regular contributors. Currently there are 19 people with commit rights to the drm-intel repository. In the first year of ramp-up 70% of all patches are now committed directly by their authors, a big change compared to how things worked before, and still work everywhere else outside of the graphics subsystem. More recently we also started to manage the drm-misc tree for subsystem wide refactorings and core changes in the same way.
I've covered the details of the new process in my Kernel Recipes talk "Maintainers Don't Scale", and LWN has covered that, and a few other talks, in their article on linux kernel maintainer scalability. I also covered this topic at the kernel summit, again LWN covered the group maintainership discussion. I don't want to go into more detail here, mostly because we're still learning, too, and not really experts on commit rights for everyone and what it takes to make this work well. If you want to enjoy what a community does who really has this all figured out, watch Emily Dunham's talk "Life is better with Rust's community automation" from last year's LCA.
What we are experts on is the Linux Kernel's maintainer model - we've run things for years with the traditional model, both as single maintainers and small groups, and have now gained the outside perspective by switching to something completely different. Personally, I've come to believe that the maintainer model as implemented by the kernel community just doesn't scale. Not in the technical sense of big-O scalability, because obviously the kernel community scales to a rather big size. Much larger organizations, even entire states, are organized in a hierarchical way, so the kernel maintainer hierarchy is not anything special. Besides that, git was developed specifically to support the Linux maintainer hierarchy, and git won. Clearly, the Linux maintainer model scales to big numbers of contributors. Where I think it falls short is the constant factor of how efficiently contributions are reviewed and merged, especially for non-maintainer contributors, who write 80% of all patches.
Cult of Busy
The first issue that routinely comes up when talking about maintainer topics is that everyone is overloaded. There's a pervasive spirit in our industry (especially in the US) hailing overworked engineers as heroes, with an entire "cult of busy" around it. If you have time, you're a slacker and probably not worth it. Of course this doesn't help when being a maintainer, but I don't believe it's a cause of why the Linux maintainer model doesn't work. This cult of busy leads to burnout, which is in my opinion a prime risk when you're an open source person. Personally I've gone through a few difficult phases until I understood my limits and respected them. When you start as a maintainer for 2-3 people, and it increases to a few dozen within a couple of years, then getting a bit overloaded is rather natural - it's a new job, with a different set of responsibilities and I had no clue about a lot of things. That's no different from suddenly being a leader of a much bigger team anywhere else. A great talk on this topic is "What part of "… for life" don't you understand?" from Jacob Kaplan-Moss since it's by a former maintainer. It also contains a bunch of links to talks on burnout specifically. Ignoring burnout, or not knowing about the early warning signs, is not healthy, and it is rampant in our communities, but for now I'll leave it at that.
Boutique Trees and Bus Factors
The first issue I see is how maintainers usually are made: You scratch an itch somewhere, write a bit of code, suddenly a few more people find it useful, and "tag" you're the maintainer. On top, you often end up being stuck in that position "for life". If the community keeps growing, or your maintainer becomes otherwise busy with work&life, you have your standard-issue overloaded bottleneck.
That's the point where I think the kernel community goes wrong. When other projects reach this point they start to build up a more formal community structure, with specialized roles, boards for review and other bits and pieces. One of the oldest, and probably most notorious, is Debian with its constitution. Of course a small project doesn't need such elaborate structures. But if the goal is world domination, or at least creating something lasting, it helps when there's solid institutions that cope with people turnover. At first just documenting processes and roles properly goes a long way, long before bylaws and codified decision processes are needed.
The kernel community, at least on the maintainer side, entirely lacks this.
What instead most often happens is that a new set of ad-hoc, chosen-by-default maintainers start to crop up in a new level of the hierarchy, below your overloaded bottleneck. Because becoming your own maintainer is the only way to help out and to get your own features merged. That only perpetuates the problem, since the new maintainers are as likely to be otherwise busy, or occupied with plenty of other kernel parts already. If things go well that area becomes big, and you have another git tree with another overloaded maintainer. More often than not people move around, and accumulate small bits all over under their maintainership. And then the cycle repeats.
The end result is a forest of boutique trees, each covering a tiny part of the project, maintained by a bunch of notoriously overloaded people. The resulting cross-tree coordination issues are pretty impressive - in the graphics subsystem we fairly often end up with simple drivers that somehow need prep patches in 5 different trees before you can even land that simple driver in the graphics tree.
Unfortunately that's not the bad part. Because these maintainers are all busy with other trees, or their work, or life in general, you're guaranteed that one of them is not available at any given time. Worse, because their tree has relatively little activity because it covers a small area, many only pick up patches once per kernel release, which means a built-in 3 month delay. That's all because each tree and area has just one maintainer. In the end you don't even need the proverbial bus to hit anyone to feel the pain of having a single point of failure in your organization - there's so many maintainer trees around that that absence always happens, and constantly.
Of course people get fed up trying to get features merged, and often the fix is trying to become a maintainer yourself. That takes a while and isn't easy - only 20% of all patches are authored by maintainers - and after the new code landed it makes it all worse: Now there's one more semi-absent maintainer with one more boutique tree, adding to all the existing troubles.
Checks and Balances
All patches merged into the Linux kernel are supposed to be reviewed, and rather often that review is only done by the maintainer who merges the patch. When maintainers send out pull requests the next level of maintainers then reviews those patch piles, until they land in Linus' tree. That's an organization where control flows entirely top-down, with no checks and balances to rein in maintainers who are not serving their contributors well. The history of dictatorships tells us that despite best intentions, the end result tends to heavily favour the few over the many. As a crude measure for how much maintainers subject themselves to some checks&balances by their peers and contributors I looked at how many patches authored and committed by the same person (probably a maintainer) do not also carry a reviewed or acked tag. For the Intel driver that's less than 3%. But even within the core graphics code it's only 5%, and that covers the time before we started to experiment with commit rights for that area. And for the graphics subsystem overall the ratio is still only about 25%, including a lot of drivers with essentially just one contributor, who is always volunteered as the maintainer, and hence it is somewhat natural that those maintainers lack reviewers.
Outside of graphics only roughly 25% of all patches written by maintainers are reviewed by their peers - 75% of all maintainer patches lack any kind of recorded peer review, compared to just 25% for graphics alone. And even looking at core areas like mm/ the ratio is only marginally better at about 30%. In short, in the kernel at large, peer review of maintainers isn't the norm.
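The crude measure described above can be sketched in a few lines. This is only an illustration and not the script used for the numbers in this post: the `Commit` record and the function name are made up for the example, and the tag matching simply looks for lines starting with Reviewed-by or Acked-by.

```python
from dataclasses import dataclass

# Tags counted as recorded peer review (assumption: matched case-insensitively).
REVIEW_TAGS = ("reviewed-by:", "acked-by:")


@dataclass
class Commit:
    """Hypothetical record holding the fields the measure needs."""
    author: str
    committer: str
    message: str


def unreviewed_self_commit_ratio(commits):
    """Fraction of self-committed patches that carry no review or ack tag."""
    # Only patches authored and committed by the same person are considered.
    self_committed = [c for c in commits if c.author == c.committer]
    if not self_committed:
        return 0.0
    unreviewed = [
        c for c in self_committed
        if not any(line.strip().lower().startswith(REVIEW_TAGS)
                   for line in c.message.splitlines())
    ]
    return len(unreviewed) / len(self_committed)
```

Records like these could be filled from `git log` output using the `%an`, `%cn` and `%B` format placeholders, restricted to the directories of the subsystem of interest.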
And there's nothing outside of the maintainer hierarchy that could provide some checks and balance either. The only way to escalate disagreement is by starting a revolution, and revolutions tend to be long, drawn-out struggles and generally not worth it. Even Debian only recently learned that they lack a way to depose maintainers, and that maybe going maintainerless would be easier (again, LWN has you covered).
Of course the kernel is not the only hierarchy where there's no meaningful checks and balances. Professors at universities, and managers at work are in a fairly similar position, with minimal options for students or employees to meaningfully appeal decisions. But that's a recognized problem, and at least somewhat countered by providing ways to give anonymous feedback, often through regular surveys. The results tend to not be all that significant, but at least provide some control and accountability to the wider masses of first-level dwellers in the hierarchy. In the kernel that amounts to about 80% of all contributions, but there's no such survey. On the contrary, feedback sessions about maintainer happiness only reinforce the control structure, with e.g. the kernel summit featuring an "Is Linus happy?" session each year.
Another closely related aspect to all this is how a project handles personal conflicts between contributors. For a very long time Linux didn't have any formal structures in this area either, with the only options available to unhappy people being to either take it or leave it. Well, or usurping a maintainer with a small revolution, but that's not really an option. For two years now we've had the "Code of Conflict", which de facto just throws up its hands and declares that conflicts are the normal outcome, essentially just encoding the status quo. Refusing to handle conflicts in a project with thousands of contributors just doesn't work, except that it results in lots of frustration and ultimately people trying to get away. Again, the lack of a poised board to enforce a strong code of conduct, independent of the maintainer hierarchy, is in line with the kernel community's unwillingness to accept checks and balances.
Mesh vs. Hierarchy
The last big issue I see with the Linux kernel model, featuring lots of boutique trees and overloaded maintainers, is that it seems to harm collaboration and integration of new contributors. In the Intel graphics driver, maintainers only ever reviewed a small minority of all patches over the last few years, with the goal to foster direct collaboration between contributors. Still, when a patch was stuck, maintainers were the first point of contact, especially, but not only, for newer contributors. No amount of explaining that only the lack of agreement with the reviewer was the gating factor could persuade people to fully collaborate on code reviews and rework the code, tests and documentation as needed. Especially when they're coming with previous experience where code review is more of a rubber-stamp step compared to the distributed and asynchronous pair-programming it often resembles in open-source. Instead, new contributors often just ended up falling back to pinging maintainers to make a decision or just merge the patches as-is.
Giving all regular contributors commit rights and fully trusting them to do the right thing entirely fixed that: If the reviewer or author have commit rights there's no easy excuse anymore to involve maintainers when the author and reviewer can't reach agreement. Of course that requires a lot of work in mentoring people, making sure requirements for merging are understood and documented, and automating as much as possible to avoid screw ups. I think maintainers who lament their lack of review bandwidth, but also state they can't trust anyone else aren't really doing their jobs.
At least for me, review isn't just about ensuring good code quality, but also about diffusing knowledge and improving understanding. At first there's maybe one person, the author (and that's not a given), understanding the code. After good review there should be at least two people who fully understand it, including corner cases. And that's also why I think that group maintainership is the only way to run any project with more than one regular contributor.
On the topic of patch review and maintainers, there's also the habit of wholesale rewrites of patches written by others. If you want others to contribute to your project, then that means you need to accept other styles and can't enforce your own all the time. Merging first and polishing later recognizes new contributions, and if you engage newcomers for the polish work they tend to stick around more often. And even when a patch really needs to be reworked before merging it's better to ask the author to do it: Worst case they don't have time, best case you've improved your documentation and training procedure and maybe gained a new regular contributor on top.
A great take on the consequences of having fixed roles instead of trying to spread responsibilities more evenly is Alice Goldfuss' talk "Rock Stars, Builders, and Janitors: You're doing it wrong". I also think that rigid roles present a bigger bar for people with different backgrounds, hampering diversity efforts, and in the spirit of Sarah Sharp's post on what makes a good community, need to be fixed first.
Towards a Maintainer's Manifest
I think what's needed in the end is some guidelines and discussions about what a maintainer is, and what a maintainer does. We have ready-made licenses to avoid havoc, there are codes of conduct to copypaste and implement, handbooks for building communities, and for all of these things, lots of conferences. Maintainer on the other hand you become by accident, as a default. And then everyone gets to learn how to do it on their own, while hopefully not burning too many bridges - at least I myself was rather lost on that journey at times. I'd like to conclude with a draft of a maintainer's manifest.
It's About the People
If you're maintainer of a project or code area with a bunch of full time contributors (or even a lot of drive-by contributions) then primarily you deal with people. Insisting that you're only a technical leader just means you don't acknowledge what your true role really is.
And then, trust them to do a good job, and recognize them for the work they're doing. The important part is to trust people just a bit more than what they're ready for, as the occasional challenge, but not so much that they're bound to fail. In short, give them the keys and hope they don't wreck the car too badly, but in all cases have insurance ready. And insurance for software is dirt cheap, generally a git revert and the maintainer profusely apologizing to everyone and taking the blame is all it takes.
Recognize Your Power
You're a maintainer, and you have essentially absolute power over what happens to your code. For successful projects that means you can unleash a lot of harm on people who for better or worse are employed to deal with you. One of the things that annoy me the most is when maintainers engage in petty status fights against subordinates, thinly veiled as technical discussions - you end up looking silly, and it just pisses everyone off. Instead recognize your powers, try to stay on the good side of the force and make sure you share it sufficiently with the contributors of your project.
Accept Your Limits
At the beginning you're responsible for everything, and for a one-person project that's all fine. But eventually the project grows too much and you'll just become a dictator, and then failure is all but assured because we're all human. Recognize what you don't do well, build institutions to replace you. Recognize that the responsibility you initially took on might not be the same as that which you'll end up with and either accept it, or move on. And do all that before you start burning out.
Be a Steward, Not a Lord
I think one of the key advantages of open source is that people stick around for a very long time. Even when they switch jobs or move around. Maybe the usual "for life" qualifier isn't really a great choice, since it sounds more like a mandatory sentence than something done by choice. What I object to is the "dictator" part, since if your goal is to grow a great community and maybe reach world domination, then you as the maintainer need to serve that community. And not that the community serves you.
Thanks a lot to Ben Widawsky, Daniel Stone, Eric Anholt, Jani Nikula, Karen Sandler, Kimmo Nikkanen and Laurent Pinchart for reading and commenting on drafts of this text.
20 Jan 2017 12:00am GMT
17 Jan 2017
In this blog post I promised I would get back to people who want to use the nvidia driver on an optimus laptop.
The set of xserver patches I blogged about last time has landed upstream and in Fedora 25 (in xorg-x11-server 1.19.0-3 and newer), allowing the nvidia driver packages to drop an xorg.conf snippet which will make the driver automatically work on optimus setups.
The negativo17.org nvidia packages now are using this, so if you install these, then the nvidia driver should just work on your laptop.
Note that you should only install these drivers if you actually have a supported (new enough) nvidia GPU. These drivers replace the libGL implementation, so installing them on a system without an nvidia GPU will cause things to break. This will be fixed soon by switching to libglvnd as the official libGL provider and having both mesa and the nvidia driver provide "plugins" for libglvnd. I've actually just completed building a new enough libglvnd + libglvnd enabled mesa for rawhide, so rawhide users will have libglvnd starting tomorrow.
17 Jan 2017 1:31pm GMT
16 Jan 2017
My day-to-day activities still revolve around the Python programming language, as I continue working on the OpenStack project as part of my job at Red Hat. OpenStack is still the biggest Python project out there, and attracts a lot of Python hackers.
These last few years, however, things have taken a different turn for me when I made the choice with my team to rework the telemetry stack architecture. We decided to make a point of making it scale way beyond what has been done in the project so far.
I started to dig into a lot of different fields around Python. Topics you don't often look at when writing a simple and straight-forward application. It turns out that writing scalable applications in Python is not impossible, nor that difficult. There are a few hiccups to avoid, and various tools that can help, but it really is possible - without switching to another whole language, framework, or exotic tool set.
Working on those projects seemed to me like a good opportunity to share with the rest of the world what I learned. Therefore, I decided to share my most recently acquired knowledge about distributed and scalable Python applications in a new book, entitled The Hacker's Guide to Scaling Python (or Scaling Python, in short). The book should be released in a few months - fingers crossed.
And as the book is still a work-in-progress, I'll be happy to hear any remark, subject, question or topic idea you might have, or any particular angle you would like me to take in this book (reply in the comments section or shoot me an email). And if you'd like to be kept updated on this book's progress, you can subscribe in the following form or from the book homepage.
The adventure of working on my previous book, The Hacker's Guide to Python, has been so tremendous and the feedback so great, that I'm looking forward to releasing this new book later this year!
16 Jan 2017 4:00pm GMT
Last week I was on vacation, but the week before that I did some more work on figuring out Intel's Mesa CI system, and Mark Janes has started working on some better documentation for it. I now understand better how the setup is going to work, but haven't made much progress on actually getting a master running yet.
More fun, though, was finally taking a look at optimizing the tiled texture load/store code. This got started with Patrick Walton tweeting a link to a blog post on clever math for implementing texture tiling, given a couple of assumptions.
As with all GPUs these days, VC4 swizzles textures into a tiled layout so that when a cacheline is loaded from memory, it will cover nearby pixels in the Y direction as well as X, so that you are more likely to get cache hits for your neighboring pixels in drawing. And the tiling tends to be multiple levels, so that nearby cachelines are also nearby on the screen, reducing DRAM access latency.
For small textures on VC4, we have a single level of tiling: 4x4@32bpp blocks ("utiles") of raster order data, themselves arranged in up to a 4x4 block, and this mode is called LT. Once things go over that size, we go to T tiling, where we call a 4x4 LT block a subtile, and arrange subtiles in either clockwise or counterclockwise order within a 2x2 block (a tile, which at this point is now 4KB of data), and tiles themselves are arranged left-to-right, then right-to-left, then left-to-right again.
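To make the LT layout a bit more concrete, here is a small sketch that computes the byte offset of a pixel in a 32bpp LT texture. It assumes that utiles are stored one after another in raster order and that the width is padded up to a whole number of utiles; the function and parameter names are made up for this example, and this is not the driver's code.

```python
UTILE_W = 4   # pixels per utile row at 32bpp
UTILE_H = 4   # rows per utile
CPP = 4       # bytes per pixel at 32bpp
UTILE_SIZE = UTILE_W * UTILE_H * CPP  # 64 bytes per utile


def lt_offset(x, y, width_px):
    """Byte offset of pixel (x, y) in an LT texture (assumed raster order of utiles)."""
    # Number of utiles in one row of utiles, rounding the width up.
    utiles_per_row = (width_px + UTILE_W - 1) // UTILE_W
    # Which utile the pixel falls into, counted in raster order.
    utile_index = (y // UTILE_H) * utiles_per_row + (x // UTILE_W)
    # Position inside the utile, which is itself raster order data.
    inside = ((y % UTILE_H) * UTILE_W + (x % UTILE_W)) * CPP
    return utile_index * UTILE_SIZE + inside
```

Note how stepping one pixel down within a utile moves only 16 bytes in GPU memory, while stepping 4 pixels to the right jumps a full 64 bytes to the next utile.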
The first thing I did was implement the blog post's clever math for LT textures. One of the nice features of the math trick is that it means I can do partial utile updates finally, because it skips around in the GPU's address space based on the CPU-side pixel coordinate instead of trying to go through GPU address space to update a 4x4 utile at a time. The downside of doing things this way is that jumping around in GPU address space means that our writes are unlikely to benefit from write combining, which is usually important for getting full write performance to GPU memory. It turned out, though, that the math was so much more efficient than what I was doing that it was still a win.
However, I found out that the clever math made reads slower. The problem is that, because we're write-combined, reads are uncached -- each load is a separate bus transaction to main memory. Switching from my old utile-at-a-time load using memcpy to the new code meant that instead of doing 4 loads using NEON to grab a row at a time, we were now doing 16 loads of 32 bits at a time, for what added up to a 30% performance hit.
Reads *shouldn't* be something that people do much, except that we still have some software fallbacks in X on VC4 ("core" unantialiased text rendering, for example, which I need to write the GLSL 1.20 shaders for), which involve doing a read/modify/write cycle on a texture. My first attempt at fixing the regression was just adding back a fast path that operates on a utile at a time if things are nicely utile-aligned (they generally are). However, some forced inlining of functions per cpp that I was doing for the unaligned case meant that the glibc memcpy call now got inlined back to being non-NEON, and the "fast" utile code ended up not helping loads.
Relying on details of glibc's implementation (their tradeoff for when to do NEON loads) and of gcc's implementation (when to go from memcpy calls to inlined 32-bits-at-a-time code) seems like a pretty bad idea, so I decided to finally write the NEON acceleration that Eben and I have talked about several times.
My first hope was that I could load a full cacheline with NEON's VLD. VLD1 only loads up to 1 "quadword" (16 bytes) at a time, so that doesn't seem like much help. VLD4 can load 64 bytes like we want, but it also turns AOS data into SOA in the process, and there's no corresponding "SOA-back-to-AOS store 8 or 16 bytes at a time" like we need to do to get things back into the CPU's strided representation. I tried VLD4+VST4 into a temporary, then doing my old untiling path on the cached temporary, but that still left me a few percent slower on loads than not doing any of this work at all.
Finally, I hit on using the VLDM instruction. It seems to be intended for stack loads/stores, but we can also use it to get 64 bytes of data in from memory untouched into NEON registers, and then I can use 4 (32bpp) or 8 (8 or 16bpp) VST1s to store it to the CPU side. With this, we get a 208.256% +/- 7.07029% (n=10) improvement to GetTexImage performance at 1024x1024. Doing the same NEON code for stores gave a 41.2371% +/- 3.52799% (n=10) improvement, probably mostly due to not calling into memcpy and having it go through its size/alignment-based memcpy path choosing process.
I'm not yet hitting full memory bandwidth, but this should be a noticeable improvement to X, and it'll probably help my piglit test suite runtime as well. Hopefully I'll get the code polished up and landed next week when I get back from LCA.
16 Jan 2017 10:28am GMT
11 Jan 2017
Hans de Goede: Xorg in Fedora-26 will use xorg-x11-drv-modesetting instead of -intel for all recent Intel GPUs
A while back Debian switched to using the modesetting Xorg driver rather than the intel Xorg driver for Intel GPUs.
There are several good reasons for this; rather than repeating them I'm just going to point to the Debian announcement.
This blog post is to let all Fedora users know that starting with Fedora-26 / rawhide as of today, we are making the same change.
Note that the xorg-x11-drv-intel package has already been carrying a Fedora patch to not bind to the GPU on Skylake or newer, even before Debian announced this. This change just does the same for older Intel GPUs.
For people who are using the now default GNOME3 on Wayland session, nothing changes, since Xwayland always uses glamor for X acceleration, just like the modesetting driver.
If you encounter any issues caused by this change, please file a bug in bugzilla.
The "for all recent Intel GPUs" in the subject of this blog post in practice means that we're making this change for gen4 and newer Intel GPUs.
11 Jan 2017 9:07am GMT
09 Jan 2017
Since the beginning of last year, our team at Igalia has been involved in enabling the ARB_gpu_shader_fp64 extension on different Intel generations: first Broadwell and later, then Haswell, and now IvyBridge (still under review); so working on supporting Vulkan's Float64 capability in Mesa was natural.
09 Jan 2017 10:33am GMT
07 Jan 2017
I've been lately working on integrating ModemManager in OpenWRT, in order to provide a unique and consolidated way to configure and manage mobile broadband modems (2G, 3G, 4G, Iridium…), all working with
OpenWRT already has some support for a lot of the devices that ModemManager is able to manage (e.g. through the
wwan packages), but unlike the current solutions, ModemManager doesn't require protocol-specific configurations or setups for the different devices; i.e. the configuration for a modem running in MBIM mode may be the same one as the configuration for a modem requiring AT commands and a PPP session.
Currently the OpenWRT package prepared is based on ModemManager git master, and therefore it supports: QMI modems (including the new MC74XX series which are raw-ip only and don't support DMS UIM operations), MBIM modems, devices requiring QMI over MBIM operations (e.g. FCC auth), and of course generic AT+PPP based modems, Cinterion, Huawei (both AT+PPP and AT+NDISDUP), Icera, Haier, Linktop, Longcheer, Ericsson MBM, Motorola, Nokia, Novatel, Option (AT+PPP and HSO), Pantech, Samsung, Sierra Wireless (AT+PPP and DirectIP), Simtech, Telit, u-blox, Wavecom, ZTE… and even Iridium and Thuraya satellite modems. All with the same configuration.
Along with ModemManager itself, the OpenWRT feed also contains libqmi and libmbim, which provide the
mbimcli, and soon the
qmi-firmware-update utilities. Note that you can also use these command line tools, even if ModemManager is running, via the qmi-proxy and mbim-proxy setups (i.e. just adding
-p to the
This is not the first time I've tried to do this; but this time I believe it is a much more complete setup and likely ready for others to play with it. You can jump to the modemmanager-openwrt bitbucket repository and follow the instructions to include it in your OpenWRT builds:
The following sections try to go into a bit more detail on the changes required to make all this work.
And of course, thanks to VeloCloud for sponsoring the development of the latest ModemManager features that made this integration possible.
udev vs hotplug
One of the biggest features recently merged in ModemManager was the possibility to run without udev support, i.e. without automatically monitoring the device additions and removals happening in the system.
Instead of using
mmcli command line tool ended up with a new
--report-kernel-event that can be used to report the device addition and removals manually, e.g.:
$ mmcli --report-kernel-event="action=add,subsystem=tty,name=ttyUSB0"
$ mmcli --report-kernel-event="action=add,subsystem=net,name=wwan0"
With the integration in the hotplug scripts, ModemManager will automatically detect and probe the different ports exposed by the broadband modem devices.
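As a rough sketch of how a hotplug hook could turn an event into such a call, the following builds the mmcli argument list from an environment-like mapping. The variable names ACTION, SUBSYSTEM and DEVICENAME, the helper name, and the restriction to the two subsystems shown above are assumptions for this example; the hotplug scripts shipped in the package may look quite different.

```python
def report_kernel_event_args(env):
    """Return the mmcli argv for a hotplug event, or None if it is not relevant."""
    action = env.get("ACTION")
    subsystem = env.get("SUBSYSTEM")
    name = env.get("DEVICENAME")
    # Assumption: only additions and removals of tty and net ports are reported.
    if action not in ("add", "remove"):
        return None
    if subsystem not in ("tty", "net") or not name:
        return None
    event = f"action={action},subsystem={subsystem},name={name}"
    return ["mmcli", f"--report-kernel-event={event}"]
```

The returned list can be handed to something like `subprocess.run` as-is, which avoids any shell quoting of the event string.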
ModemManager relies on udev rules for different things:
- Blacklisting devices: E.g. we don't want ModemManager to claim and probe the TTYs exposed by Arduinos or braille displays. The package includes a USB vid:pid based blacklist of devices that expose TTY ports and are not modems to be managed by ModemManager.
- Blacklisting ports: There are cases where we don't want the automatic logic selection to grab and use some specific modem ports, so the package also provides a much shorter list of ports blacklisted from actual modem devices. E.g. the QMI implementation in some ZTE devices is so poor that we decided to completely skip it and fallback to AT+PPP.
- Greylisting USB serial adapters: The TTY ports exposed by USB serial adapters aren't probed automatically, as we don't know what's connected on the serial side. If we want to have a serial modem, though, the mmcli --scan-modems operation may be executed, which will include the probing of these greylisted devices.
- Specifying port type hints: Some devices expose multiple AT ports, but with different purposes. E.g. a modem may expose a port for AT control and another port for the actual PPP session, and choosing the wrong one will not work. ModemManager includes a list of port type hints so that the automatic selection of which port is for what purpose is done transparently.
As we're not using udev when running in OpenWRT, ModemManager now includes a custom generic udev rules parser that uses sysfs properties to process and apply the rules.
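To make the blacklist and greylist ideas above a bit more concrete, here is a toy sketch of the decision they lead to. It is not the ModemManager rules parser: the tables, the vid:pid values and the classify name are all made up for illustration. On a real system the vendor and product IDs would typically be read from the idVendor and idProduct attributes of the USB device in sysfs.

```python
# Hypothetical rule tables keyed by USB "vid:pid"; the IDs are placeholders,
# not entries copied from the ModemManager package.
BLACKLIST = {"1111:0001"}   # non-modem devices that expose TTYs
GREYLIST = {"2222:0002"}    # USB serial adapters, probed only on request


def classify(vid, pid, manual_scan=False):
    """Return 'ignore' or 'probe' for a device identified by vid and pid."""
    key = f"{vid.lower()}:{pid.lower()}"
    if key in BLACKLIST:
        return "ignore"
    if key in GREYLIST:
        # Greylisted adapters are skipped unless the user asked for a scan.
        return "probe" if manual_scan else "ignore"
    return "probe"


if __name__ == "__main__":
    print(classify("2222", "0002"))
    print(classify("2222", "0002", manual_scan=True))
```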
procd based startup
The ModemManager daemon is set up to be started and controlled via procd. The init script controlling the startup also takes care of scheduling the replay of the hotplug events that had earlier triggered --report-kernel-event actions (they're cached in /tmp), e.g. to cope with events arriving before the daemon started or to handle daemon restarts gracefully.
Well, no, I didn't port ModemManager to use ubus. If you want to run ModemManager under OpenWRT you'll also need to have the DBus daemon running.
netifd protocol handler
When using ModemManager, the user shouldn't need to know the peculiarities of the modem being used: all modems and protocols (QMI, MBIM, Generic AT, vendor-specific AT…) are managed via the same DBus interfaces. All the modem control commands are internal to ModemManager, and the only additional considerations are related to how to set up the network interface once the modem is connected, e.g.:
- PPP: some modems require a PPP session over a serial port.
- Static: some modems require static IP configuration on a network interface.
- DHCP: some modems require dynamic IP configuration on a network interface.
The OpenWRT package for ModemManager includes a custom protocol handler that enables the modemmanager protocol to be used when configuring network interfaces. This new protocol handler takes care of configuring and bringing up the interfaces as required when the modem gets into connected state.
The following snippet shows an example interface configuration:
config interface 'broadband'
    option device '/sys/devices/platform/soc/20980000.usb/usb1/1-1/1-1.2/1-1.2.1'
    option proto 'modemmanager'
    option apn 'ac.vodafone.es'
    option username 'vodafone'
    option password 'vodafone'
    option pincode '7423'
    option lowpower '1'
The settings currently supported are the following ones:
- device: The full sysfs path of the broadband modem device needs to be configured. Relying on the interface names exposed by the kernel is never a good idea, as these may change e.g. across reboots or when more than one modem device is available in the system.
- proto: As said earlier, the new modemmanager protocol needs to be configured.
- apn: If the connection requires an APN, the APN to use.
- username: If the access point requires authentication, the username to use.
- password: If the access point requires authentication, the password to use.
- pincode: If the SIM card requires a PIN, the code to use to unlock it.
- lowpower: If enabled, this setting will request the modem to go into low-power state (i.e. IMSI detach and RF off) when the interface is disconnected.
As you can see, the configuration can be used for any kind of modem device, regardless of what control protocol it uses, which interfaces are exposed, or how the connection is established. The settings are currently IPv4 only, but adding IPv6 support shouldn't be a big issue, patches welcome.
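If you generate such configurations from a script, the interface block is easy to render. The following is only a sketch: render_interface is a hypothetical helper, it knows nothing about which options the protocol handler accepts, and it only reproduces the config/option layout used in the example above.

```python
def render_interface(name, options):
    """Render one 'config interface' block in the layout shown above."""
    lines = [f"config interface '{name}'"]
    for key, value in options.items():
        if "'" in str(value):
            # Single quotes delimit values here, so refuse to emit a broken line.
            raise ValueError(f"option {key} contains a single quote")
        lines.append(f"\toption {key} '{value}'")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_interface("broadband", {
        "device": "/sys/devices/platform/soc/20980000.usb/usb1/1-1/1-1.2/1-1.2.1",
        "proto": "modemmanager",
        "apn": "ac.vodafone.es",
        "lowpower": "1",
    }))
```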
SMS, USSD, GPS…
The main purpose of using a mobile broadband modem is of course the connectivity itself, but it may also provide many more features. ModemManager provides specific interfaces and mmcli actions for the secondary features, which are also available in the OpenWRT integration, including:
- SMS messaging (both 3GPP and 3GPP2).
- Location information (3GPP LAC/CID, CDMA Base station, GPS…).
- Time information (as reported by the operator).
- 3GPP USSD operations (e.g. to query prepaid balance to the operator).
- Extended signal quality information (RSSI, Ec/Io, LTE RSRQ and RSRP…).
- OMA device management operations (e.g. to activate CDMA devices).
- Voice call control.
It is worth noting that not all of these features are available for all modem types (e.g. SMS messaging is available for most devices, but OMA DM is only supported in QMI based modems).
You can now have your 2G/3G/4G mobile broadband modems managed with ModemManager and netifd in your OpenWRT based system.
07 Jan 2017 11:34am GMT
06 Jan 2017
Iago Toral: GL_ARB_gpu_shader_fp64 / OpenGL 4.0 lands for Intel/Haswell. More gen7 support coming soon!
2017 starts with good news for Intel Haswell users: it has been a long time coming, but we have finally landed GL_ARB_gpu_shader_fp64 for this platform. Thanks to Matt Turner for reviewing the huge patch series!
Maybe you are not particularly excited about GL_ARB_gpu_shader_fp64, but that does not mean this is not an exciting milestone for you if you have a Haswell GPU (or even IvyBridge, read below): this extension was the last piece missing to finally bring Haswell to expose OpenGL 4.0!
If you want to give it a try but you don't want to build the driver from the latest Mesa sources, don't worry: the feature freeze for the Mesa 13.1 release is planned to happen in just a few days and the current plan is to have the release in early February, so if things go according to plan you won't have to wait too long for an official release.
But that is not all, now that we have landed Fp64 we can also send for review the implementation of GL_ARB_vertex_attrib_64bit. This could be a very exciting milestone, since I believe this is the only thing missing for Haswell to have all the extensions required for OpenGL 4.5!
You might be wondering about IvyBridge too, and 2017 also starts with good news for IvyBridge users. Landing Fp64 for Haswell allowed us to send for review the IvyBridge patches we had queued up for GL_ARB_gpu_shader_fp64, which will bring IvyBridge up to OpenGL 4.0. But again, that is not all: once we land Fp64 we should also be able to send the patches for GL_ARB_vertex_attrib_64bit and get IvyBridge up to OpenGL 4.2, so look forward to this in the near future!
We have been working hard on Fp64 and Va64 during a good part of 2016, first for Broadwell and later platforms, and then for Haswell and IvyBridge; it has been a lot of work, so it is exciting to see all this getting to the last stages and on its way to the hands of the final users.
All this has only been possible thanks to Intel's sponsoring and the great support and insight that our friends there have provided throughout the development and review processes, so big thanks to all of them and also to the team at Igalia that has been involved in the development with me.
06 Jan 2017 7:49am GMT
03 Jan 2017
This post describes the synclient tool, part of the xf86-input-synaptics package. It does not describe the various options, that's what the synclient(1) and synaptics(4) man pages are for. This post describes what synclient is, where it came from and how it works on a high level. Think of it as an anti-bus-factor post.
The most important thing first: synclient is part of the synaptics X.Org driver which is in maintenance mode, and superseded by libinput and the xf86-input-libinput driver. In general, you should not be using synaptics anymore anyway, switch to libinput instead (and report bugs where the behaviour is not correct). It is unlikely that significant additional features will be added to synclient or synaptics and bugfixes are rare too.
synclient's interface is extremely simple: it's a list of key/value pairs that would all be set at the same time. For example, the following command sets two options, TapButton1 and TapButton2:
synclient TapButton1=1 TapButton2=2
The -l switch lists the current values in one big list:
$ synclient -l
LeftEdge = 1310
RightEdge = 4826
TopEdge = 2220
BottomEdge = 4636
FingerLow = 25
FingerHigh = 30
MaxTapTime = 180
The commandline interface is effectively a mapping of the various xorg.conf options. As said above, look at the synaptics(4) man page for details on each option.
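Since the listing is just "Name = value" lines, it is trivial to consume from a script. A minimal sketch, assuming the output looks like the excerpt above (parse_synclient_list is a made-up helper name, and values are deliberately kept as strings because not every option is an integer):

```python
def parse_synclient_list(text):
    """Parse 'Name = value' lines from synclient -l output into a dict."""
    options = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines and anything that is not a key/value pair
        options[name.strip()] = value.strip()
    return options


SAMPLE = """\
LeftEdge = 1310
RightEdge = 4826
TopEdge = 2220
"""

if __name__ == "__main__":
    print(parse_synclient_list(SAMPLE))
```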
A decade ago, the X server had no capabilities to change driver settings at runtime. Changing a device's configuration required rewriting an xorg.conf file and restarting the server. To avoid this, the synaptics X.Org touchpad driver exposed a shared memory (SHM) segment. Anyone with knowledge of the memory layout (an internal struct) and permission to write to that segment could change driver options at runtime. This is how synclient came to be, it was the tool that knew that memory layout. A synclient command would thus set the correct bits in the SHM segment and the driver would use the newly updated options. For obvious reasons, synclient and synaptics had to be the same version to work.
8 or so years ago, the X server got support for input device properties, a generic key/value store attached to each input device. The keys are the properties, identified by an "Atom" (see box on the side). The values are driver-specific. All drivers make use of this now: being able to change a setting at runtime is the result of changing a property that the driver knows of.
synclient was converted to use properties instead of the SHM segment and eventually the SHM support was removed from both synclient and the driver itself. The backend to synclient is thus identical to the one used by the xinput tool or tools used by other drivers (e.g. the xsetwacom tool). synclient's killer feature was that it was the only tool that knew how to configure the driver; these days it's merely a commandline-argument-to-property mapping tool. xinput, GNOME, KDE, they all do the same thing in the backend.
How synclient works
The driver has properties of a specific name, format and value range. For example, the "Synaptics Tap Action" property contains 7 8-bit values, each representing a button mapping for a specific tap action. If you change the fifth value of that property, you change the button mapping for a single-finger tap. Another property "Synaptics Off" is a single 8-bit value with an allowed range of 0, 1 or 2. The properties are described in the synaptics(4) man page. There is no functional difference between this synclient command:
and this xinput command
xinput set-prop "SynPS/2 Synaptics TouchPad" "Synaptics Off" 1
Both set the same property with the same calls. synclient uses XI 1.x's XChangeDeviceProperty() and xinput uses XI 2.x's XIChangeProperty() if available but that doesn't really matter. They both fetch the property, overwrite the respective value and send it back to the server.
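The fetch, overwrite, send back cycle can be modelled without touching X at all. The sketch below is purely illustrative: it treats a property as a plain Python list of 8-bit values, overwrite_value is a hypothetical name, and the array contents are made up. It makes no X11 calls.

```python
def overwrite_value(current, index, new_value):
    """Return a copy of an 8-bit property array with one element replaced."""
    if not 0 <= new_value <= 255:
        raise ValueError("property elements are 8-bit values")
    if not 0 <= index < len(current):
        raise IndexError("index outside the property array")
    updated = list(current)      # the fetched property is left untouched
    updated[index] = new_value   # overwrite only the requested element
    return updated


if __name__ == "__main__":
    tap_action = [0, 0, 0, 0, 1, 2, 3]   # made-up contents for a 7-value property
    # Index 4 is the fifth value, the single-finger tap mapping described above.
    print(overwrite_value(tap_action, 4, 3))
```

The point of the model is that both tools have to write back the whole array, even when only one element changes.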
Pitfalls and quirks
synclient is a simple tool. If multiple touchpads are present it will simply pick the first one. This is a common issue for users with an i2c touchpad and will be even more common once the RMI4/SMBus support is in a released kernel. In both cases, the kernel creates the i2c/SMBus device and an additional PS/2 touchpad device that never sends events. So if synclient picks that device, all the settings are changed on a device that doesn't actually send events. This depends on the order the devices were added to the X server and can vary between reboots. You can work around that by disabling or ignoring the PS/2 device.
synclient is a one-shot tool, it does not monitor devices. If a device is added at runtime, the user must run the command to change settings. If a device is disabled and re-enabled (VT-switch, suspend/resume, ...), the user must run synclient to change settings. This is a major reason we recommend against using synclient, the desktop environment should take care of this. synclient will also conflict with the desktop environment in that it isn't aware when something else changes things. If synclient runs before the DE's init scripts (e.g. through xinitrc), its settings may be overwritten by the DE. If it runs later, it overwrites the DE's settings.
synclient exclusively supports synaptics driver properties. It cannot change any other driver's properties and it cannot change the properties created by the X server on each device. That's another reason we recommend against it, because you have to mix multiple tools to configure all devices instead of using e.g. the xinput tool for all property changes. Or, as above, letting the desktop environment take care of it.
The interface of synclient is IMO not significantly more obvious than setting the input properties directly. One has to look up what TapButton1 does anyway, so looking up how to set the property with the more generic xinput is the same amount of effort. A wrong value won't give the user anything more useful than the equivalent of a "this didn't work".
If you're TL;DR'ing an article labelled "the definitive guide to" you're kinda missing the point...
03 Jan 2017 5:45am GMT
02 Jan 2017
The 3DMMES test has been thoroughly instruction count limited, so any wins we can get on code generation translate pretty directly into performance gains. Last week I decided to work on fixing up Jonas's patch to schedule instructions in the delay slots of thread switching, which can save us 3 instructions per texture sample.
Thread switching, you may recall, is a trick in the fragment shader to hide texture fetch latency by cooperatively switching to another fragment shader instance after you request a texture fetch, so that it can make some progress when you'd probably be blocked anyway. This lets us better occupy the ALUs, at the cost of each shader needing to fit into half of the physical register file.
However, VC4 doesn't cancel instructions in the pipeline when we request a switch (same as for within-shader branching), so 3 more instructions from the current shader get executed. For my first implementation of thread switching, I just dumped in 3 NOPs after each THRSW to fill the delay slots. This was still a massive win over not threading, so it was good enough.
Jonas's patch tries to fill the delay slots after we schedule in the thread switch by trying to extract the THRSW signal off of the instruction we scheduled and move it up to 2 instructions before that, and then only add enough NOPs to get us 3 slots filled. There was a little bug (it re-scheduled the thrsw instruction instead of a NOP in trying to pad out to delay slots), but it basically worked and got us a 1.55% instruction count win on shader-db.
The problem was that he was scheduling ALU operations along with the thrsw, and if the thrsw signal was alone without an ALU operation in it, after moving the thrsw up we'd have a NOP left in the old location. I wrote a follow-on patch to fix that: We now only schedule thrsws on their own without ALU operations, insert the THRSW as early as we can, and then add NOPs as necessary to fill remaining delay slots. This was another 0.41% instruction count win.
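For readers who want to see the padding idea in isolation, here is a toy model. It is not the Mesa scheduler: instructions are opaque strings, place_thrsw and request_index are invented names, and the model ignores every hardware constraint except the 3 delay slots mentioned above. It places a standalone THRSW as early as it can after the texture request while keeping all later instructions inside the delay slots, then pads with NOPs.

```python
DELAY_SLOTS = 3  # instructions that still execute after a THRSW, per the text


def place_thrsw(instructions, request_index):
    """Insert a standalone THRSW after the texture request and pad with NOPs.

    `instructions` is a list of opaque instruction names and `request_index`
    is the position of the texture request that must precede the switch.
    """
    if not 0 <= request_index < len(instructions):
        raise IndexError("request_index outside the instruction list")
    # Earliest legal position is right after the request; never leave more
    # than DELAY_SLOTS existing instructions behind the THRSW.
    position = max(request_index + 1, len(instructions) - DELAY_SLOTS)
    trailing = len(instructions) - position
    padding = ["NOP"] * (DELAY_SLOTS - trailing)
    return instructions[:position] + ["THRSW"] + instructions[position:] + padding


if __name__ == "__main__":
    print(place_thrsw(["a", "tex", "b", "c"], 1))
```

With two instructions already scheduled after the request, only one NOP is needed instead of three, which is the kind of saving the patches above are after.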
This isn't as good as it could be. Maybe we don't fill the slots as well as we could before choosing to schedule thrsw, but instructions we choose to schedule after that could be fit in. Those would be tricky because we have to check that they don't write flags or the accumulators (which wouldn't be preserved across the thrsw) or new texture coordinates. We also don't put the THRSW at any particular point in the timeline between sampler request and results collection. We might be able to get wins by trying to put thrsw at the lowest-register-pressure point between them, so that fewer things need to be forced into physical regs instead of accumulators.
Those will be projects for later. It's probably much more important that we figure out how to schedule 2-4 samples with a single thrsw, instead of doing a thrsw per sample like we do today.
The other project last week was starting to build up a plan for vc4 CI. We've had a few regressions to vc4 in Mesa master because developers (including me!) don't do testing on vc4 hardware on every commit. The Intel folks have a lovely CI system that does piglit, DEQP, and performance regression testing for them, both tracking Mesa master and doing pre-commit tests of voluntarily submitted branches. I despaired of the work needed to build something that good, but apparently they've open sourced their configuration, so I'll be trying to replicate that in the next few weeks. This week I worked on getting the hardware and a plan for where to host it. I'm looking forward to a bright future of 30-minute test runs on the actual hardware and long-term tracking of performance metrics. Unfortunately, there are no docs for it and I've never worked with Jenkins before, so this is going to be a challenge.
Other things: I submitted a patch to mm/ to shut up the CMA warning we've been suffering from (and patching away downstream) for years, got exactly the expected response ("this dmesg spew in a common path might be useful for somebody debugging something some day, so it should stay there"), so hopefully we can just get Fedora and Debian to patch it out instead. This is yet another data point in favor of Matthew Garrett's plan of "write kernel patches you need, don't bother submitting them upstream". Started testing a new patch to reduce the error return rate on vc4 memory allocations on the upstream kernel, haven't confirmed that it's working yet. I also spent a lot of time reviewing tarceri's patches to Mesa preparing for the on-disk shader cache, which should help improve app startup times and reduce memory consumption.
02 Jan 2017 9:01pm GMT
Packaging Python has long been a painful experience. The history of the various distribution tools that Python has offered over the years is really bumpy, and both the user and developer experience have been pretty bad.
Fortunately, things have improved a lot in recent years, with the reconciliation of setuptools and distribute.
In the context of the OpenStack project, though, a solution on top of setuptools was already started a while back. Its usage has now spread across a whole range of software and libraries.
This project is called pbr, for Python Build Reasonableness. Don't be put off by the OpenStack-themed documentation - it is a bad habit of OpenStack folks to not advertise their tooling in an agnostic fashion. The tool has no dependency on the cloud platform, and can be used painlessly with any package.
How it works
pbr takes inspiration from distutils2 (a now abandoned project) and uses a setup.cfg file to describe the packager's intents. This is what a setup.py using pbr looks like:
Two lines of code - it's that simple. The actual metadata that the setup requires is stored in setup.cfg:
[metadata]
name = foobar
author = Dave Null
author-email = firstname.lastname@example.org
summary = Package doing nifty stuff
license = MIT
home-page = http://pypi.python.org/pypi/foobar
requires-python = >=2.6
classifier =
    Development Status :: 4 - Beta
    Environment :: Console
    Intended Audience :: Developers
    Intended Audience :: Information Technology
    License :: OSI Approved :: Apache Software License
    Operating System :: OS Independent
    Programming Language :: Python
This syntax is way easier to write and read than the standard setup.py.
pbr also offers other features such as:
- automatic dependency installation based on requirements.txt
- automatic documentation building and generation using Sphinx
- automatic generation of ChangeLog files based on git history
- automatic creation of the list of files to include using git
- version management based on git tags
All of this comes with little to no effort on your part.
One of the features that I use a lot is the definition of flavors. It's not tied particularly to pbr - it's actually provided by setuptools and pip themselves - but pbr's setup.cfg file makes it easy to use.
When distributing software, it's common to have different drivers for it. For example, your project could support either PostgreSQL or MySQL - but nobody is going to use both at the same time. The usual trick to make it work is to add the needed libraries to the requirements list (e.g. requirements.txt). The upside is that the software will work directly with either RDBMS, but the downside is that this will install both libraries, whereas only one is needed. Using flavors, you can specify different scenarios:
When installing your package, the user can then just pick the right flavor by using pip to install the package:
$ pip install foobar[postgresql]
This will install foobar, all its dependencies listed in requirements.txt, plus whatever dependencies are listed in the [extras] section of setup.cfg matching the flavor. You can also combine several flavors, e.g.:
$ pip install foobar[postgresql,mysql]
would install both flavors.
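To get a feel for how flavors map to requirements, here is a small sketch that reads an [extras] section with Python's configparser. This is not how pbr or pip resolve extras internally. The setup.cfg fragment is hypothetical, the driver libraries in it (psycopg2, PyMySQL) are just plausible picks for the two databases mentioned above, and read_extras and requirements_for are made-up helper names.

```python
import configparser

# Hypothetical setup.cfg fragment; the driver libraries named here are
# illustrative choices, not taken from the original post.
SETUP_CFG = """
[extras]
postgresql =
    psycopg2
mysql =
    PyMySQL
"""


def read_extras(cfg_text):
    """Map each flavor name to its list of extra requirements."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("extras"):
        return {}
    return {
        flavor: [req.strip() for req in value.splitlines() if req.strip()]
        for flavor, value in parser.items("extras")
    }


def requirements_for(cfg_text, flavors):
    """Collect the extra requirements for the flavors a user asked for."""
    extras = read_extras(cfg_text)
    wanted = []
    for flavor in flavors:
        wanted.extend(extras[flavor])  # KeyError for an unknown flavor
    return wanted


if __name__ == "__main__":
    print(requirements_for(SETUP_CFG, ["postgresql"]))
```

Asking for both flavors simply concatenates the two lists, which mirrors the combined pip invocation above.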
pbr is well-maintained and in very active development, so if you have any plans to distribute your software, you should seriously consider including pbr in those plans.
02 Jan 2017 4:00pm GMT
23 Dec 2016
Yesterday Valve gave me a copy of DOOM for Christmas (not really for Christmas), and I got the wine bits in place from Fedora, then I spent today trying to get DOOM to render on radv.
Thanks to ParkerR on #radeon for taking the picture from his machine, I'm too lazy.
So it runs kinda, it hangs the GPU a fair bit, it misrenders some colors in some scenes, but you can see most of it. I'm not sure if I'll get back to this before next year (I'll try), but I'm pretty happy to have gotten it this far in a day, though I'm sure the next few things will be much more difficult to debug.
The branch is here:
23 Dec 2016 7:32am GMT
22 Dec 2016
If you are interested in how this board can help you automate the testing of your display (and not only!) code and hardware, a new mailing list has been created to discuss its uses. We at Collabora will be happy to help you integrate this board in your CI lab as well.
Thanks go to Intel for sponsoring the preparation of these slides and for allowing me to share them under an open license.
And of course, thanks to Google's ChromeOS team for releasing the hardware design with an open hardware license along with the code they are running on it and with it.
22 Dec 2016 8:50am GMT
The fancy new Sphinx-based documentation landed upstream a while ago. Jani Nikula has written a nice overview on LWN (part 2). And it is getting used a lot. But judging by how often I type it in replies on the mailing list, what's missing is a super-short howto. To build the documentation, run:
$ make DOCBOOKS="" htmldocs
The output can then be found in Documentation/output/. When typing documentation please always check that your new text does get rendered. The output also contains documentation about kernel-doc and the toolchain itself. Since the build is incremental it is recommended that you first run it before touching anything. That way you'll only see warnings in areas you've touched, not all of them - the build is unfortunately somewhat noisy.
22 Dec 2016 6:00am GMT
20 Dec 2016
Two big successes last week.
One is that Dave Airlie has pulled Boris's VEC (SDTV) code for 4.10. We didn't quite make it for getting the DT changes in time to have full support in 4.10, but the DT changes will be a lot easier to backport than the driver code.
The other is that I finally got the DSI panel working, after 9 months of frustrating development. It turns out that my DSI transactions to the Toshiba chip aren't working, but if I use I2C to the undocumented Atmel microcontroller to relay to the Toshiba in one of the 4 possible orderings of the register sequence, the panel comes up. The Toshiba docs say I need to do I2C writes at the beginning of the poweron sequence, but the closed firmware uses DSI transactions for the whole sequence, making me suspicious that the Atmel has already done some of the Toshiba register setup.
I've now submitted the DSI driver, panel, and clock patches after some massive cleanup. It even comes with proper runtime power management for both the panel and the VC4 DSI module.
The next steps in modesetting land will be to write an input driver for the panel, do any reworks we need from review, and get those chunks merged in time for 4.11. While I'm working on this, Boris is now looking at my HDMI audio code so that hopefully we can get that working as well. Eben is going to poke Dom about fixing the VC4 driver interop with media decode. I feel like we're now making rapid progress toward feature parity on the modesetting side of the VC4 driver.
20 Dec 2016 1:00am GMT