28 Jun 2017

feedLXer Linux News

4 easy ways to work toward a zero trust security model

There has been a lot of talk about zero trust networks lately, but little consensus about what they actually are. As with DevOps or software-defined networking, it is becoming clear that zero trust means something a little different to everyone. That said, there is one thing we can all agree on: the network cannot be trusted.

28 Jun 2017 3:34pm GMT

An introduction to functional programming in JavaScript

When Brendan Eich created JavaScript in 1995, he intended to do Scheme in the browser. Scheme, being a dialect of Lisp, is a functional programming language. Things changed when Eich was told that the new language should be the scripting language companion to Java. Eich eventually settled on a language that has a C-style syntax (as does Java), yet has first-class functions. Java technically did not have first-class functions until version 8; however, you could simulate first-class functions using anonymous classes.

28 Jun 2017 2:08pm GMT

feedLinuxtoday.com

Diskio Pi Wants to Be the Ultimate Open Source Tablet Powered by Raspberry Pi

Diskio Pi is the result of 18 months of development, and, in fact, it seems to be some sort of versatile device built on top of a single-board computer.

28 Jun 2017 1:00pm GMT

feedLXer Linux News

How to Install The Latest Mesa Version On Debian 9 Stretch Linux

Mesa is a big deal if you're running open source graphics drivers. It can be the difference between a smooth experience and an awful one. Mesa is under active development, and it sees constant noticeable performance improvements. That means it's really worthwhile to stay on top of the latest releases.

28 Jun 2017 12:42pm GMT

feedLinuxtoday.com

An introduction to functional programming in JavaScript

Explore functional programming and how using it can make your programs easier to read and debug.

28 Jun 2017 12:00pm GMT

feedLXer Linux News

How to Set Up an Email Server in Ubuntu

There are plenty of reasons to run your own email server. In the past it's been a real pain, but now using Ubuntu, Docker, and Mailcow, it's much easier.

28 Jun 2017 11:16am GMT

feedLinuxtoday.com

How to turn a Raspberry Pi into an eBook server

The Calibre eBook management software makes it easy to set up an eBook server on a Raspberry Pi 3, even in low-connectivity areas.

28 Jun 2017 11:00am GMT

feedLXer Linux News

Open Tools Help Streamline Kubernetes and Application Development

New tools are emerging that help streamline Kubernetes and make building container-based applications easier. Here, we will consider several open source options worth noting.

28 Jun 2017 9:51am GMT

Type 7 module runs Yocto Linux on 16-core Atom C3000 SoC

DFI's rugged, Linux-ready "DV970" COM Express Basic Type 7 module debuts the server-class, 16-core Atom C3000, and supports 4x 10GbE-KR and 16x PCIe 3.0. DFI promotes the DV970 as the first COM Express Basic Type 7 module based on the Intel Atom C3000 "Denverton" SoC, but it's the first product of any kind that we've seen that uses the SoC.

28 Jun 2017 8:25am GMT

CrossOver for Android Lets You Run Windows Apps on Intel-Based Chromebooks

CodeWeavers, the commercial company behind the well-known CrossOver for Linux and Mac application that lets users install and run Windows apps and games, is still working to release an Android version.

28 Jun 2017 6:59am GMT

feedLinuxtoday.com

Install Nginx with ngx_pagespeed on CentOS 7

HowToForge: Ngx-pagespeed is a free and open source Nginx module that can be used to speed up your site and reduce page load time.

28 Jun 2017 6:00am GMT

feedLXer Linux News

Rugged marine computer runs Linux on Skylake-U

Avalue's "EMS-SKLU-Marine" is an IEC EN60945 certified computer with 6th Gen Core CPUs, -20 to 60°C support, plus 2x GbE, 4x USB 3.0, M2, and mini-PCIe. The EMS-SKLU-Marine is designed for maritime applications such as control room or engine room, integrated bridge systems, propulsion control or safety systems, and boat entertainment systems. Avalue touts the […]

28 Jun 2017 5:33am GMT

Extreme Tux Racer 0.7.4

One of the major games for Linux is 'Extreme Tux Racer', available for Android, Linux, Microsoft Windows, Macintosh operating systems and Ubuntu Touch. The goal of the game is to control Tux, or another chosen character, to get to the bottom of the hill. The character will slide down the hill of snow and ice on his belly. Along the way you can pick up herring.

28 Jun 2017 4:08am GMT

feedLinuxtoday.com

SUSE Expands Container Management and Deployment Capabilities

ServerWatch: Like most Linux vendors today, SUSE is keeping busy updating its portfolio to support the growing demand for container management and services.

28 Jun 2017 2:45am GMT

feedLXer Linux News

Ubuntu Budgie 17.04 – new kid on the block

Ubuntu Budgie is the newest addition to the officially supported Ubuntu flavours. It is quite interesting how these two parts can play together. The first time they married was the Ubuntu Budgie 16.04 remix. And since 17.04 Ubuntu Budgie is officially supported by Canonical.

28 Jun 2017 2:42am GMT

HMS Windows XP: Britain's newest warship runs Swiss Cheese OS

Spotted on carrier control room screens - reports

The Royal Navy's brand new £3.5bn aircraft carrier HMS Queen Elizabeth is running Windows XP in her flying control room, according to reports.…

28 Jun 2017 1:16am GMT

27 Jun 2017

feedLXer Linux News

Install Nginx with ngx_pagespeed on CentOS 7

Ngx-pagespeed is a free and open source Nginx module that can be used to speed up your site and reduce page load time. It works by automatically applying web performance best practices to pages and associated assets without requiring you to modify your existing content or workflow. You can easily optimize various files such as CSS, HTML, png, and jpg using the Ngx-pagespeed module.

27 Jun 2017 11:50pm GMT

openSUSE Leap Is Now 99.9% Enterprise Distribution

Two years ago when openSUSE decided to move the base of openSUSE Leap to SUSE Linux Enterprise (SLE), they were entering uncharted territory. SLE is a tightly controlled enterprise ship that runs on mission critical systems. On the other hand openSUSE has been a community-driven project that, despite sponsorship from SUSE, is relatively independent.

27 Jun 2017 10:25pm GMT

feedLinuxtoday.com

SunTrust CIO's formula for speed relies on cloud, DevOps

EnterprisersProject: The 5-layer business acceleration plan that is working for SunTrust IT

27 Jun 2017 9:00pm GMT

feedLXer Linux News

Ubuntu Core Now Officially Supported for Raspberry Pi Compute Module 3 (CM3)

Canonical, the company behind the popular Ubuntu Linux operating system, today announced that their Snappy-enabled Ubuntu Core OS is now available for Raspberry Pi Compute Module 3 (CM3).

27 Jun 2017 8:59pm GMT

feedLinuxtoday.com

GitHub launches Open Source Friday

Developers from 24 Pull Requests and GitHub created Open Source Friday as a weekly reminder to give back to the open source projects that power our daily work.

27 Jun 2017 8:27pm GMT

feedLXer Linux News

GitHub launches Open Source Friday 

Open source software is developed by hobbyists and professionals alike. In fact, 65% of respondents to this year's GitHub open source survey who make contributions to open source projects do so as part of their job. However, the survey indicates that employers often lack a clear policy on employee contributions. A new project from GitHub aims to increase contributions to open source projects and to educate employers on why it's important.

27 Jun 2017 7:33pm GMT

feedLinuxtoday.com

Ubuntu Budgie 17.04 – new kid on the block

Ubuntu Budgie is the newest addition to the officially supported Ubuntu flavours.

27 Jun 2017 7:15pm GMT

Lumina Desktop 1.3 Released with Own, Default Material Design Icon Themes

The biggest change of the Lumina Desktop 1.3 release appears to be the replacement of KDE Project's Oxygen icon theme with its own Material Design icon themes

27 Jun 2017 6:00pm GMT

feedLXer Linux News

How to Open multiple files and switch between them in vi editor

In this article you will learn how to open multiple files and switch between them in the vi editor.

27 Jun 2017 5:42pm GMT

20 Jun 2017

feedKernel Planet

Arnaldo Carvalho de Melo: Pahole in the news

Found another interesting article, this time mentioning a tool I wrote long ago and that, at least for kernel object files, has been working for a long time without much care on my part: pahole. Go read a bit about it in Will Cohen's "How to avoid wasting megabytes of memory a few bytes at a time" article.

Guess I should try running a companion script that tries to process all .o files in debuginfo packages to see how bad it is for non-kernel files, with all the DWARF changes over these years…


20 Jun 2017 3:49pm GMT

15 Jun 2017

feedKernel Planet

Linux Plumbers Conference: Early Bird Rate Registration Ending Soon

A reminder that our Early Bird registration rate is ending soon. The last day at the Early Bird rate of $400 is Sunday, June 18th. We are also almost sold out of Early Bird slots (15% left of our quota). Get yours soon!
Starting June 19th, registration will be at the regular rate of $550.
Please see the Attend page for info.

15 Jun 2017 11:20pm GMT

14 Jun 2017

feedKernel Planet

Paul E. McKenney: Stupid RCU Tricks: Simplifying Linux-kernel RCU

The last month or two has seen a lot of work simplifying the Linux-kernel RCU implementation, with more than 2700 net lines of code removed. The remainder of this post lists the user-visible changes, along with alternative ways to get the corresponding job done.


  1. The infamous CONFIG_RCU_KTHREAD_PRIO Kconfig parameter is now defunct, but the rcutree.kthread_prio kernel boot parameter gets the job done.
  2. The CONFIG_NO_HZ_FULL_SYSIDLE Kconfig parameter has kicked the bucket. There is no replacement because no one was using it. If you need it, revert the -rcu commit tagged by sysidle.2017.05.11a.
  3. The CONFIG_PROVE_RCU_REPEATEDLY Kconfig parameter is no more. There is no replacement because as far as I know, no one has used it for many years. It was a great help in tracking down lockdep-RCU warnings back in the day, but these warnings are now sufficiently rare that finding them one boot at a time is no longer a problem. If you need it, do the obvious hacking on Kconfig and lockdep.c.
  4. The CONFIG_SPARSE_RCU_POINTER Kconfig parameter now rests in peace. There is no replacement because there doesn't seem to be any reason for RCU's sparse checking to be the only such checking that is optional. If you really need to disable RCU's sparse checking, hand-edit the definition as needed.
  5. The CONFIG_CLASSIC_SRCU Kconfig parameter bought the farm. This was only present to handle massive failures of the new Tree/Tiny SRCU implementations, but these appear to be quite reliable and should be used instead of Classic SRCU.
  6. RCU's debugfs tracing is done for. As far as I know, I was the only real user, and I haven't used it in years. If you need it, revert the -rcu commit tagged by debugfs.2017.05.15a.
  7. The CONFIG_RCU_NOCB_CPU_NONE, CONFIG_RCU_NOCB_CPU_ZERO, and CONFIG_RCU_NOCB_CPU_ALL Kconfig parameters have departed. Use the rcu_nocbs kernel boot parameter instead, which can do quite a bit more than those Kconfig parameters ever could.
  8. Tiny RCU's event tracing and RCU CPU stall warnings are now pushing up daisies. The point of Tiny RCU is to be tiny and educational, and these added features were not helping reach either of these two goals. The replacement is to reproduce the problem with Tree RCU.
  9. These changes should matter only to people running rcutorture:

    1. The CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT and CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT_DELAY Kconfig parameters have been entombed: Use the rcutree.gp_preinit_delay kernel boot parameter instead.
    2. The CONFIG_RCU_TORTURE_TEST_SLOW_INIT and CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY Kconfig parameters have given up the ghost: Use the rcutree.gp_init_delay kernel boot parameter instead.
    3. The CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP and CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY Kconfig parameters have passed on: Use the rcutree.gp_cleanup_delay kernel boot parameter instead.
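
For anyone auditing an old .config against the list above, the mapping can be written down as a small lookup table. This is only a sketch: the table restates the replacements named in the post, while the helper name check_config and the sample input are hypothetical, and None simply means the post names no boot-parameter replacement.

```python
# Removed Kconfig options mapped to the boot parameter named in the list above.
# None means no boot-parameter replacement is given there.
REPLACEMENTS = {
    "CONFIG_RCU_KTHREAD_PRIO": "rcutree.kthread_prio",
    "CONFIG_NO_HZ_FULL_SYSIDLE": None,
    "CONFIG_PROVE_RCU_REPEATEDLY": None,
    "CONFIG_SPARSE_RCU_POINTER": None,
    "CONFIG_CLASSIC_SRCU": None,  # use Tree/Tiny SRCU instead
    "CONFIG_RCU_NOCB_CPU_NONE": "rcu_nocbs",
    "CONFIG_RCU_NOCB_CPU_ZERO": "rcu_nocbs",
    "CONFIG_RCU_NOCB_CPU_ALL": "rcu_nocbs",
    "CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT": "rcutree.gp_preinit_delay",
    "CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT_DELAY": "rcutree.gp_preinit_delay",
    "CONFIG_RCU_TORTURE_TEST_SLOW_INIT": "rcutree.gp_init_delay",
    "CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY": "rcutree.gp_init_delay",
    "CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP": "rcutree.gp_cleanup_delay",
    "CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY": "rcutree.gp_cleanup_delay",
}


def check_config(config_text):
    """Return {option: replacement} for removed options set in a .config (hypothetical helper)."""
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name in REPLACEMENTS and value != "n":
            found[name] = REPLACEMENTS[name]
    return found
```

Feeding the text of a .config to check_config() yields the options that need attention, each paired with the boot parameter to use instead (or None where the list above gives none).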

There will probably be a few more simplifications in the near future, but this should be at least enough for one merge window!

14 Jun 2017 9:03pm GMT

12 Jun 2017

feedKernel Planet

Linux Plumbers Conference: RDMA Microconference Accepted into the Linux Plumbers Conference

Following on from the successful RDMA Microconference last year, which resulted in a lot of fruitful discussions, we're pleased to announce there will be a follow-on at Plumbers in Los Angeles this year.

In addition to looking at the usual kernel core gaps and ABI issues, documentation, and testing, we'll also be looking at new fabrics (including NVME), the challenges of implementing virtual RDMA devices, and integration possibilities with netdev.

For more details on this, please see this microconference's wiki page.

12 Jun 2017 9:53pm GMT

Arnaldo Carvalho de Melo: Article about ‘perf annotate’

Just found out about Ravi's article about 'perf annotate'. It is concise yet covers most features, including cross-annotation. Go read it!


12 Jun 2017 7:21pm GMT

09 Jun 2017

feedKernel Planet

Paul E. McKenney: Stupid RCU Tricks: rcutorture Accidentally Catches an RCU Bug

With the Linux-kernel v4.13 merge window coming up, it is time to do at least a little heavy-duty testing of the patches destined for v4.14, which had been but lightly tested on my laptop. An overnight run on a larger test machine looked very good, with the exception of scenario TREE01 (defined by tools/testing/selftests/rcutorture/configs/rcu/TREE01{.boot,} in the Linux-kernel source tree), which got no fewer than 190 failures in a half-hour run. In other words, rcutorture saw 190 too-short grace periods in 30 minutes, for about one every 10 seconds.

This is not just bad. This is RCU completely and utterly failing to be RCU.

My first action was to re-run the tests on the commits slated for v4.13. You can imagine my relief to see them pass on all scenarios, including TREE01.

Then it was time for bisection. I have been burned many times by false bisections due to RCU's probabilistic failure modes, so I ran 24 30-minute tests on each commit. Fortunately, I could run six in parallel, so that each commit only consumed about two hours of test time. The bisection converged on a commit that adds a --kconfig argument to the rcutorture scripts, which allows me to do things like force lockdep to run in all scenarios. However, this commit should have absolutely no effect on the inner workings of RCU.

OK, perhaps this commit managed to fatally mess up the .config file. But no, the .config files from this commit compare equal to those from the preceding commit. Some additional poking gives me confidence that the kernels being built are also identical. Still, the one fails and the other does not.

The next step is to look very carefully at the console output from the failing runs, most of which contain many complaints about RCU grace periods being too short. Except that one of them also contains RCU CPU stall warnings. In fact, one of the stall warnings lists no fewer than 26 CPUs as stalling the current RCU grace period.

This came as a bit of a surprise, partly because I don't recall ever seeing that many CPUs stalling a single grace period, but mostly because the test was only supposed to use eight CPUs.

A look at the beginning of the console output showed that RCU was inexplicably prepared to deal with 43 CPUs instead of the expected eight. A bit more digging showed that the qemu command used to run the failing test had "-smp 43", while the qemu command for the successful test instead had "-smp 8". In both cases, the qemu command also included the kernel boot parameter "maxcpus=8". And a very stupid bug in the --kconfig change to the scripts turned out to be responsible for the bogus -smp argument.

The next step is to swap the values of qemu's -smp argument. And the failure follows the "-smp 43" setting. This means that it is possible that the RCU failures are due to a latent timing bug in RCU. After all, the test system has only 64 CPUs, and I was running 43*6=258 CPUs worth of tests on it. But running six concurrent rcutorture tests with both -smp and maxcpus set to 43 passes with flying colors. So RCU must be suffering from some other problem.

The next question is exactly what is supposed to happen when qemu and the kernel have very different ideas of how many CPUs there are. The ever-helpful Documentation/admin-guide/kernel-parameters.txt file states that maxcpus= limits not the overall number of CPUs, but rather the number that are brought up at boot time. Another look at the console output confirms that in the failing case, eight CPUs are brought up at boot time. However, the other 35 come online some time after boot, sometimes taking a few minutes to come up. Which explains another anomaly I noticed while bisecting, namely that about half the tests ran 30 minutes without failure, but the ones that failed did so within the first five minutes of the run. Apparently the RCU failures are connected somehow to the late arrival of the extra 35 CPUs.
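
The numbers quoted so far can be checked with a few lines of arithmetic. This is a sketch only: the function names are made up for illustration, and the 0.5 per-run failure probability is an assumption taken loosely from the "about half the tests" observation above, not a measured figure.

```python
import math


def seconds_between_failures(failures, minutes):
    """Average spacing of failures over a run of the given length."""
    return minutes * 60 / failures


def hours_per_commit(runs=24, parallel=6, run_minutes=30):
    """Wall-clock test time per bisection step when runs go in parallel batches."""
    batches = math.ceil(runs / parallel)
    return batches * run_minutes / 60


def prob_false_pass(p_fail, runs):
    """Chance that a bad commit passes every run, assuming independent runs."""
    return (1 - p_fail) ** runs


rate = seconds_between_failures(190, 30)   # 190 failures in a 30-minute run
budget = hours_per_commit()                # 24 runs, six at a time
risk = prob_false_pass(0.5, 24)            # assumed 50% per-run failure rate
oversubscription = 43 * 6                  # CPUs requested by six "-smp 43" guests
```

Under these assumptions the failure spacing comes out a little under ten seconds, each bisection step costs two hours, and a bad commit has well under a one-in-a-million chance of slipping through all 24 runs, which is why so many runs per commit are worth the time when failures are probabilistic.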

Except that RCU configured itself for the full 43 CPUs, and RCU is supposed to be able to handle CPUs coming and going. In fact, RCU has repeatedly demonstrated its ability to handle CPUs coming and going for more than a decade. So it is time to enable event tracing on a failure scenario (thank you, Steve!). One of the traces shows that there is no RCU callback connected with the first failure, which points the finger of suspicion at RCU expedited grace periods.

A quick inspection of the expedited code shows missing synchronization for the case where a CPU makes its very first appearance just as an expedited grace period starts. Oh, the leaf rcu_node structure's ->lock is held both when updating the number of CPUs that have ever been seen (which is the rcu_state structure's ->ncpus field) and when updating the bitmasks indicating exactly which CPUs have ever been seen (which is the leaf rcu_node structure's ->expmaskinitnext field), but it drops that lock between those two updates.

This means that the expedited grace period might sample the ->ncpus field, notice the change, and therefore check all the ->expmaskinitnext fields, but before those fields had been updated. Not a problem for this grace period, since the new CPUs haven't yet started and thus cannot yet be running any RCU read-side critical sections, which means that there is no reason whatsoever for this grace period to pay any attention to them. However, the next expedited grace period would again sample the ->ncpus field, see no change, and thus not bother checking the ->expmaskinitnext fields. Thus, this grace period would also ignore the new CPUs, which by this time could be very much alive and running RCU read-side critical sections. Hence the too-short grace periods, and hence them showing up within the first few minutes of the run, during the time that the extra 35 CPUs are in the process of coming online.

The fix is easy: Just move the update of ->ncpus to the same critical section as the update of ->expmaskinitnext. With this fix, rcutorture passes the TREE01 scenario even with bogus -smp arguments to qemu. There is therefore once again a bug in rcutorture: There are still bugs in RCU somewhere, and rcutorture is failing to find them!
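
The race and its fix can be replayed in a toy model. This is a sketch and not kernel code: ncpus and expmaskinitnext mirror the fields named above, while ncpus_snap, expmaskinit, the class, and the function names are assumptions made for illustration. The "between" callback stands in for whatever runs at the point where the lock is released.

```python
class RcuModel:
    """Toy bookkeeping for which CPUs an expedited grace period waits on."""

    def __init__(self):
        self.ncpus = 0            # number of CPUs ever seen
        self.expmaskinitnext = 0  # bitmask of CPUs ever seen
        self.ncpus_snap = 0       # assumed: ncpus value at the last rescan
        self.expmaskinit = 0      # assumed: mask the grace period actually uses

    def expedited_gp(self):
        """Rescan the mask only when ncpus changed; return the CPUs waited on."""
        if self.ncpus != self.ncpus_snap:
            self.ncpus_snap = self.ncpus
            self.expmaskinit = self.expmaskinitnext
        return self.expmaskinit


def online_buggy(model, cpu, between):
    """Two separate critical sections: a grace period can run in between."""
    model.ncpus += 1
    between()                          # lock dropped here
    model.expmaskinitnext |= 1 << cpu


def online_fixed(model, cpu, between):
    """Both updates in one critical section: a grace period sees both or neither."""
    model.expmaskinitnext |= 1 << cpu
    model.ncpus += 1
    between()                          # lock released only after both updates
```

Calling online_buggy(m, 3, m.expedited_gp) and then m.expedited_gp() returns a mask that omits CPU 3, because the first grace period consumed the ncpus change before the mask was updated. The same sequence with online_fixed() returns a mask that includes it.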

Strangely enough, I might never have noticed the bug in expedited grace periods had I not made a stupid mistake in the scripting. Sometimes it takes a bug to locate a bug!

09 Jun 2017 8:49pm GMT

07 Jun 2017

feedKernel Planet

Paul E. McKenney: Verification Challenge 6: Linux-Kernel Tree RCU

It has been more than two years since I posted my last verification challenge, so it is only natural to ask what, if anything, has happened in the meantime. The answer is "Quite a bit!"

I had the privilege of attending The Royal Society Verified Trustworthy Software Systems Meeting, where I was on a panel on "Verification in Industry". I also attended the follow-on Verified Trustworthy Software Systems Specialist Meeting, where I presented on formal verification and RCU. There were many interesting presentations (see slides 9-12 of this presentation), the most memorable being a grand challenge to apply formal verification to machine-learning systems. If you think that challenge is not all that grand, you should watch this video, which provides an entertaining demonstration of a few of the difficulties.

Closer to home, in the past year there have been three successful applications of automated formal-verification tools to Linux-kernel RCU:


  1. Lihao Liang applied the C Bounded Model Checker (CBMC) to Tree RCU (draft paper). This was a bit of a tour de force, converting Linux-kernel C code (with a bit of manual preprocessing) to a logic expression, then invoking a SAT solver on that expression. The expression's variables correspond to the inputs to the code, the possible multiprocessor scheduling decisions, and the possible memory-model reorderings. The expression evaluates to true if there exists an execution that triggers an assertion. The largest expression had 90 million boolean variables, 450 million clauses, occupied some tens of gigabytes of memory, and, stunningly, was solved with less than 80 hours of CPU time. Pretty amazing considering that SAT is NP-complete and that two to the ninety millionth power is an excessively large number!!!
  2. Michalis Kokologiannakis applied Nidhugg to Tree RCU (draft paper). Nidhugg might not make quite as macho an attack on an NP-complete problem as does CBMC, but there is some reason to believe that it can handle larger chunks of the Linux-kernel RCU code.
  3. Lance Roy applied CBMC to Classic SRCU. Interestingly enough, this verification could be carried out completely automatically, without manual preprocessing. This approach is therefore available in my -rcu tree (git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git) on the branch formal.2017.06.07a.
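
The question these tools answer, "does there exist an execution that triggers an assertion?", can be illustrated on a tiny scale. The sketch below is not how CBMC or Nidhugg work internally: it enumerates every schedule of a hypothetical two-thread program by brute force rather than encoding it for a SAT solver, and the program, function names, and assertion are all made up for illustration.

```python
from itertools import product


def run(schedule):
    """Run two threads that each do 'tmp = x; x = tmp + 1' in the given order.

    schedule is a sequence of thread ids (0 or 1), one per executed step.
    """
    x = 0
    pc = [0, 0]    # next step for each thread: 0 = load, 1 = store
    tmp = [0, 0]   # per-thread private copy of x
    for t in schedule:
        if pc[t] == 0:
            tmp[t] = x
        else:
            x = tmp[t] + 1
        pc[t] += 1
    return x


def find_violations():
    """Return every schedule in which the assertion 'x == 2' fails."""
    bad = []
    for schedule in product((0, 1), repeat=4):   # four scheduling decisions
        if schedule.count(0) != 2:               # each thread runs exactly two steps
            continue
        if run(schedule) != 2:
            bad.append(schedule)
    return bad
```

With four boolean scheduling decisions there are only sixteen assignments to try. The same exhaustive approach is hopeless at 90 million variables, which is what makes the SAT-solver result above so striking.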



In all three cases, the tools verified portions of RCU and SRCU as correct, and in all three cases, the tools successfully located injected bugs. (Hey, any of us could write a program that consisted of "printf("Validated\n");", so you do have to validate the verifier!) And yes, I have given these guys trouble about the fact that their tools didn't find any bugs that I didn't already know about, but these results are nevertheless extremely impressive. Had you told me ten years ago that this would happen, I have no idea how I would have responded, but I most certainly would not have believed you.

In theory, Nidhugg is more scalable but less thorough than CBMC. In practice, it is too early to tell.

So what is the status of the first five verification challenges?


  1. rcu_preempt_offline_tasks(): Still open. That said, Michalis found the infamous RCU bug at Linux-kernel commit 281d150c5f88 and further showed that my analysis of the bug was incorrect, though my fixes did actually fix the bug. So this challenge is still open, but the tools have proven their ability to diagnose rather ornate concurrency bugs.
  2. RCU NO_HZ_FULL_SYSIDLE: Still open. Perhaps less pressing given that it will soon be removed from the kernel, but the challenge still stands!
  3. Apply CBMC to something: This is an ongoing challenge to developers to give CBMC a try on concurrent code. And why not also try Nidhugg?
  4. Tiny RCU: This was a self-directed challenge that was "born surmounted".
  5. Uses of RCU: Lihao, Michalis, and Lance verified some simple RCU uses as part of their work, but this is an ongoing challenge. If you are a formal-verification researcher and really want to prove your tool's mettle, take on the Linux kernel's dcache subsystem!



But enough about the past! Given the progress over the past two years, a new verification challenge is clearly needed!

And this sixth challenge is available on 20 branches whose names start with "Mutation" at https://github.com/paulmckrcu/linux.git. Some of these branches are harmless transformations, but others inject bugs. These bugs range from deterministic failures to concurrent data races to forward-progress failures. Can your tool tell which is which?

If you give any of these a try, please let me know how it goes!

07 Jun 2017 8:41pm GMT

04 Jun 2017

feedKernel Planet

Kernel Podcast: Catching up on podcasts…new one drops Monday!

Sorry for the delay with getting podcasts out. I'm working on a new one! Coming Monday!

04 Jun 2017 6:44am GMT

31 May 2017

feedKernel Planet

Eric Sandeen: 2012 Nissan LEAF battery deathwatch

First of all - I think EVs are great. They are the future of personal transportation. But this is the story of a first-gen EV battery with some … issues.

I bought a used 2012 Nissan LEAF with about 38k miles for a great price - in part because it started life as a leased car in Texas, and the early LEAF batteries didn't much like the heat. As a result, the battery is not super healthy, with only about 60 miles of range on a full charge on a balmy day. While this is enough to get me around on most days, there are times when a bit more range would be nice. Thankfully, Nissan has retroactively warrantied LEAF batteries to retain 70% of their capacity (really, closer to 66%) for the first 5 years or 60,000 miles.

The LEAF dash shows remaining battery capacity (as opposed to current charge) on a 12-bar scale; when new, it showed 12 bars, and Nissan will warranty the battery if it gets to 8 bars or less. My car currently has 9 bars. 1 to go.

So this was a gamble. I'd actually like my battery to lose enough capacity before January 2018 to get a warranty replacement.

Thanks to a cool app called LeafSpy, I can monitor battery health, and correlate it to what others have said about when they dropped that 9th bar. I'll try to remember to update this periodically, but here are the readings so far, with trend lines and "target" values based on when The Internet said they lost their 9th bar, on average. The aHr metric seems most relevant. With luck, it looks like I may make it, though I can't explain the recent plateau after the initial steady decline…

I'll try to remember to update this occasionally as time goes by.
Update: Here's a constantly updated version of my stats:

31 May 2017 1:41am GMT

24 May 2017

feedKernel Planet

Pete Zaitcev: Community Meeting

<notmyname> first, the idea of having a regular meeting in addition to this one for people in different timezones
<cschwede_> +2!
<notmyname> specifically, mahatic and pavel/onovy/seznam. but of course we've all seen various chinese contributors too
<notmyname> but the point is that it's a place to bring up stuff that those in the other time zones are working on
<mattoliverau> Cool
<notmyname> I think it's a terrific idea
<tdasilva> i bet the guys working on tape would like that too
<notmyname> my goal is to find a time for it that is so horrible for US timezones that it will be obvious that not everyone needs to be there
<zaitcev> Yeah, if only there was a way to send a message... like a mail... to a list of people. And then it could be stored on a computer somewhere, ready to be read in any timezone recepient is in.
<notmyname> zaitcev: crazytown!
<mattoliverau> zaitcev: now your just talkin crazy

24 May 2017 9:52pm GMT

23 May 2017

feedKernel Planet

Michael Kerrisk (manpages): Linux Shared Libraries course, Munich, Germany, 20 July 2017

I've scheduled a public instance of my "Building and Using Shared Libraries on Linux" course to take place in Munich, Germany on 20 July 2017. This one-day course provides a thorough introduction to building and using shared libraries, covering topics such as: the basics of creating, installing, and using shared libraries; shared library versioning and naming conventions; the role of the dynamic linker; run-time symbol resolution; controlling symbol visibility; symbol versioning; preloading shared libraries; and dynamically loaded libraries (dlopen). The course format is a mixture of theory and practical.

The course is aimed at programmers who create and use shared libraries. Systems administrators who are managing and troubleshooting applications that use shared libraries will also find the course useful.

You can find out more about the course (such as expected background and course pricing) at http://man7.org/training/shlib/ and see a detailed course outline at
http://man7.org/training/shlib/shlib_course_outline.html.

23 May 2017 2:14pm GMT

Michael Kerrisk (manpages): Cgroups/namespaces/seccomp/capabilities course

There are still some places available on my "Linux Security and Isolation APIs" course that will take place in Munich, Germany on 17-19 July 2017. This three-day course provides a deep understanding of the low-level Linux features (set-UID/set-GID programs, capabilities, namespaces, cgroups, and seccomp) used to implement privileged applications and build container, virtualization, and sandboxing technologies. The course format is a mixture of theory and practical.

The course is aimed at designers and programmers building privileged applications, container applications, and sandboxing applications. Systems administrators who are managing such applications are also likely to find the course of benefit.

You can find out more about the course (such as expected background and course pricing) at
http://man7.org/training/sec_isol_apis/
and see a detailed course outline at
http://man7.org/training/sec_isol_apis/sec_isol_apis_course_outline.html

23 May 2017 2:01pm GMT

18 May 2017

feedKernel Planet

Linux Plumbers Conference: Linux Kernel Memory Model Workshop Accepted into Linux Plumbers Conference

A good understanding of the Linux kernel memory model is essential for a great many kernel-hacking and code-review tasks. Unfortunately, the current documentation (memory-barriers.txt) has been said to frighten small children, so this workshop's goal is to demystify this memory model, including hands-on demos of the tools, help installing/running the tools, and help constructing appropriate litmus tests. These tools should go a long way toward the ultimate goal of automating the process of using memory models to frighten small children.

For more information, please see this microconference's wiki page. For those who like getting a head start, this page also includes information on downloading and installing the tools, the memory model, and thousands of pre-existing litmus tests. (Collect the whole set!!!) We also welcome experience reports from early adopters of these tools.

We hope to see you there!

18 May 2017 5:04pm GMT

15 May 2017

feedKernel Planet

Kernel Podcast: Linux Kernel Podcast for 2017/05/14

Audio: http://traffic.libsyn.com/jcm/20170514.mp3

In this week's catchup mega-issue: Linux 4.12-rc1 (including a full summary of the 4.12 merge window), Linux 4.11 final is released, saving TLB flushes, various ongoing development, and a bunch of announcements.

Editorial Note

This podcast is a free service that I provide to the community in my spare time. It takes many, many hours to prepare and produce a single episode, much more during the merge window. This means that when I have major events (such as Red Hat Summit followed by OpenStack Summit) it will be delayed, as was the case this last week. Over the coming months, I hope to automate the production in order to reduce the overhead, but there will be some weeks where I need to skip a show. I am however covering the whole 4.12 merge window regardless. So while I would usually have just moved on, the circumstance warrants a mega-length catchup episode. I hope you're still awake by the end.

Linux 4.12-rc1

Linus Torvalds announced Linux 4.12-rc1, "one day early, because I don't like last-minute pull requests during the merge window anyway, and tomorrow is mother's day [in the US], so I may end up being roped into various happenings". He also noted "Besides, this has actually been a pretty large merge window, so despite there technically being time for one more day of pulls, I actually do have enough changes already. So there." In his announcement, he says things look smooth so far, but calls those also "Famous last words". Finally, he calls out the "odd" diffstat which is dominated by the AMD Vega10 headers. As was noted in the pull requests, unlike certain other graphics companies, AMD actually provides nice automatically generated headers and other information about their graphics chipsets, which is why the Vega10 update is plentiful.

Later in the day yesterday, following the 4.12-rc1 announcement, Guenter Roeck posted "watchdog updates for v4.12", and Jon Mason posted "NTB bug fixes for v4.12", along with apologies for tardiness.

Linux 4.11

Linus Torvalds announced Linux 4.11 noting that the extra week due to a (rare-ish) "rc8" (Release Candidate 8) meant that he had felt "much happier releasing a final 4.11 now". As usual, Linux Kernel Newbies has a writeup of 4.11, here: https://kernelnewbies.org/Linux_4.11

Announcements

Greg K-H (Kroah-Hartman) announced Linux 4.4.68, 4.9.28, 4.10.16, and 4.11.1. He later sent "Bad signatures on recent stable updates" in which he noted that "The stable kernels I just released have had signatures due to a mixup using pixz in the new kernel.org backend. It will be fixed soon…", which were later corrected. He would like to hear from anyone still seeing problems.

Greg also announced (separately) Linux 3.18.52, while Jiri Slaby announced Linux 3.12.74.

Stephen Hemminger announced iproute2 4.11 matching the new kernel release.

Michael Kerrisk announced man-pages-4.11.

Steven Rostedt announced trace-cmd 2.6.1.

Steven also announced Linux 4.4.66-rt79, 3.18.51-rt57, and 3.12.73-rt98 (preempt-rt) kernels.

Con Kolivas posted an updated version of his (renamed) "MuQSS CPU scheduler" [renamed from the BFS - Brain F*** Scheduler] in Linux 4.11-ck1.

Karel Zak announced util-linux v2.30-rc1, which includes a fix to libblkid that "has been fixed to extract LABEL= and UUID= from UDF rather than ISO9660 header on hybrid CDROM/DVD media. This change[] makes UDF media on Linux user-space more compatible with another operation systems." but he calls it out since it could also introduce regressions for some other users.

Junio C Hamano announced Git version 2.13.0. Separately, he released maintenance versions of "Git v2.12.3 and others" which include fixes for "a recently disclosed problem with "git shell", which may allow a user who comes over SSH to run an interactive pager by causing it to spawn "git upload-pack --help" (CVE-2017-8386)."

Jan Kiszka announced version 0.7 of the Jailhouse hypervisor, which includes various debug and regular console driver updates and gcov debug statistics.

Bartosz Golaszewski announced libgpiod v0.2: "The most prominent new feature is the test suite working together with the gpio-mockup module".

Christoph Hellwig notes that the Open OSD [an in-kernel OSD - Object-Based Storage Device] SCSI initiator library for Linux seems to be dead. He does this by posting a patch to the MAINTAINERS file "update OSD entries" in which he removes the (now defunct) open-osd.org website, and the bouncing email address for Benny Halevy. Benny appeared and ACKed.

In a similar vein, Ben Hutchings pondered aloud about the "Future of liblockdep", which apparently "hasn't been buildable since (I think) Linux 4.6". Sasha Levin said things would be cleaned up promptly. And they were, with a pull request soon following with fixes for Linux 4.12.

Masahiro Yamada posted an RFC patch entitled "Increase Minimal GNU Make version for Linux Kernel from 3.80 to 3.81" in which he essentially noted that the kernel hadn't actually worked with 3.80 (which is 15 years old!) in a very long time, but instead actually really needs 3.81 (which was itself released in 2006). It was apparently "broken" 3 years ago, but nobody noticed. Neither Greg K-H (Kroah-Hartman) nor Linus seemed to lose any sleep over this, with Linus saying "you make a strong case of "it hasn't worked for a while already and nobody even noticed"".

Paolo Bonzini posted "CFP: KVM Forum 2017" announcing that the KVM Forum will be held October 25-27 at the Hilton in Prague, CZ, and that all submissions for proposed topics must be made by midnight June 15.

Thomas Gleixner announced "[CFP] RT-Summit - Call for Presentations" noting that the Real-Time Summit 2017 is being organized by the Linux Foundation Real-Time Linux (RTL) collaborative project in cooperation with OSADL/RTLWS and will also be held in Prague, on October 21st. The cutoff for submissions is July 14th via rt-cfp@linutronix.de.

4.12 Merge Window

In his 4.11 announcement, Linus reminded us that the release of 4.11 meant that "the merge window [for kernel 4.12] is obviously open. I already have two pull request[s] for 4.12 in my inbox, I expect that overnight I'll get a lot more." He wasn't disappointed. The flood gates well and truly opened. And they continued going for the entire two week (less one day) period. Let's dive into what has been posted so far for 4.12 during the (now closed) merge window.

Stephen Rothwell [linux-next pre-merge development kernel tree maintainer] noted in a heads-up that Linus was going to see a "Large new drm driver" [drm - Direct Rendering Manager, not the "digital rights" technology]. Dave Airlie (the drm maintainer) had a reply but Stephen said everything was just fine and he was simply seeking to avoid surprising Linus (again). Once the pull came in, and Linus had pulled it, he quickly followed up to note that he was getting a lot of warnings about "Atomic update on pipe (A) took". Daniel Vetter followed up to say that "We [Intel] did improve evasion a lot to the point that it didn't show up in our CI machines anymore, so we felt we could risk enabling this everywhere. But of course it pops up all over the place as soon as drm-next hits mainline".

4.12 git Pulls for existing subsystems

Hans-Christian Noren Egtvedt posted "AVR32 change for 4.12 - architecture removal" in which he removes AVR32 and "clean away the most obvious architecture related parts". He posted followups to pick off more leftovers.

Ingo Molnar posted "RCU changes for 4.12" which includes "Parallelize SRCU callback handling", performance improvements, documentation updates, and various other fixes. Linus pulled it. But then "after looking at it, ended up un-pulling it again". He posted a rant about a new header file (linux/rcu_segcblist.h) which was a "header file from hell", saying "I see absolutely no point in taking a header file of several hundred lines of code", along with more venting about the use of too much inline code (code that is always expanded in-place rather than called as a function - leading to a larger footprint sometimes). Finally, Linus said "The RCU code needs to start showing some good taste". Sir Paul McKenney, the one and only author of RCU, followed up swiftly, apologizing for the transgression in attempting to model "the various *list*.h header files", proposing a fix, which Linus liked. Ingo Molnar implemented the suggestions, in "srcu: Debloat the <linux/rcu_segcblist.h> header", against which Paul provided a minor fix for the case of !SMP (non-multi-processor kernel) builds.

Ingo Molnar also posted "EFI changes for 4.12" including fixes to the BGRT ACPI table (used for boottime graphics information) to allow it to be shared between x86 and ARM systems, an update to the arm64 boot protocol, improvements to the EFI stub's command line parsing, and support for randomizing the virtual mapping of UEFI runtime services on arm64. The latter means that the function pointers for UEFI Runtime Services callbacks will be placed into random virtual address locations during the call to ExitBootServices which sets up the new mappings - it's a good way to look for problems with platforms containing broken firmware that doesn't correctly handle the change in location of runtime service calls.

Ingo Molnar also posted "x86/process changes for 4.12" which includes a new ARCH_[GET|SET]_CPUID prctl (process control) ABI extension that a running process can use in order to determine whether it has access to call the CPUID instruction directly. This is to support a userspace debugger known as "rr" that would like to trap and emulate calls to "CPUID" which are otherwise normally unprivileged on x86 systems.

Separately, Ingo posted "x86 fixes", which includes "mostly misc fixes" for such things as "two boot crash fixes", etc.

Ingo Molnar also posted "perf changes for 4.12" which includes updates to kprobes and uprobes, making their trampolines (the codepaths jumped through when executing the probe sequence) read-only while they are used, changing UPROBES_EVENTS to be default yes in the Kconfig (since distros do this), and various other fixes. He also includes support for AMD IOMMU events, and new events for Intel Goldmont CPUs. The perf tooling itself gets many more updates, including PERF_RECORD_NAMESPACES, which allows the kernel to record information "required to associate samples to namespaces".

Separately, Ingo posted "perf fixes", which includes "mostly tooling updates".

Ingo Molnar also posted "RAS changes for v4.12" which includes a "Correctable Errors Collector" kernel feature that will gather statistics about correctable errors affecting physical memory pages. Once a certain watermark is reached, pages generating many correctable errors will be permanently offlined [this is useful both for DDR and NV-DIMMs]. Finally, he deprecates the existing /dev/mcelog driver and includes cleanups for MCE (Machine Check Exception) errors during kexec on x86 (which we covered in previous editions of this podcast).
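
The watermark idea described above can be sketched as a toy user-space model. This is not the kernel's implementation: the class name, the threshold value, and the in-memory "offlined" set are all assumptions made purely for illustration.

```python
from collections import Counter


class CorrectableErrorCollector:
    """Toy model: count correctable errors per page, retire pages past a watermark."""

    def __init__(self, threshold=3):
        # Hypothetical watermark; the value the kernel uses is not stated here.
        self.threshold = threshold
        self.counts = Counter()   # page frame number -> errors seen so far
        self.offlined = set()     # page frame numbers that have been retired

    def report(self, pfn):
        """Record one correctable error for page frame `pfn`.

        Returns True only when this report pushes the page over the
        watermark and causes it to be offlined.
        """
        if pfn in self.offlined:
            return False          # already retired, nothing more to do
        self.counts[pfn] += 1
        if self.counts[pfn] >= self.threshold:
            self.offlined.add(pfn)
            del self.counts[pfn]  # no need to keep tracking a retired page
            return True
        return False
```

With a threshold of 3, the third report against the same page frame returns True and the page lands in the offlined set, while reports against other pages are counted independently.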

Ingo Molnar also posted "x86/asm changes for v4.12", which includes various fixes, among which are cleanups to stack trace unwinding.

Ingo Molnar also posted "x86/cpu changes for v4.12", which includes support for "an extension of the Intel RDT code to extend it with Intel Memory Bandwidth Allocation CPU support: MBA allows bandwidth allocation between cores, while CBM (already upstream) allows CPU cache partitioning". Effectively, Intel incorporates changes to their memory controller's hardware scheduling algorithms as part of RDT. These allow the DDR interface to manage bandwidth for specific cores, which will almost certainly include both explicit data operations, as well as separate algorithms for prefetching and speculative fetching of instructions and data. [This author has spent many hours reading about memory controller scheduling over the past year]

Ingo Molnar also posted "x86/debug changes for v4.12", which includes support for the USB3 "debug port" based early console. As we have mentioned previously, USB3 includes a built-in "debug port" which no longer requires a special dongle to connect a remote machine for debug. It's common in Windows kernel development to use a debug port, and since USB3 includes baseline support without the need for additional hardware, serial over USB3 is likely to become more common when developing for Linux - especially with the demise of DB9 headers on systems or even IDC10 headers on motherboards internally (to say nothing of laptop systems). As a reminder, with debug ports, usually only one USB port will support debug mode. I guess my old USB debug port dongle can go in the pile of obsolete gear.

Ingo Molnar also posted "x86/platform changes for v4.12" which includes "continued SGI UV4 hardware-enablement changes, plus there's also new Bluetooth support for the Intel Edison [a low cost IoT board] platform".

Ingo Molnar also posted "x86/vdso changes for v4.12" which includes support for a "hyper-V TSC page" which is what it sounds like - a special shared page made available to guests under Microsoft's Hyper-V hypervisor and providing a fast means to enumerate the current time. This is plumbed into the kernel's vDSO mechanism (Virtual Dynamic Shared Objects look a bit like software libraries that are automatically linked against every running program when it launches) to allow fast clock reading.

Ingo Molnar also posted "x86/mm changes for v4.12", which includes yet more work toward Intel 5-level paging among many other updates.

Separately Ingo posted a single "core kernel fix" to "increase stackprotector canary randomness on 64-bit kernels with very little cost".

Thomas Gleixner posted "irq updates for 4.12", which include a new driver for a MediaTek SoC, ACPI support for ITS (Interrupt Translation Services) when using a GICv3 on ARM systems, support for shared nested interrupts, and "the usual pile of fixes and updates all over t[h]e place".

Thomas Gleixner also posted "timer updates for 4.12" that include more reworking of year 2038 support (the infamous wrap of the Unix epoch), a "massive rework of the arm architected timer", and various other work.
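
For anyone who wants to see where the 2038 date comes from, the arithmetic can be sketched as follows. This is only an illustration of a signed 32-bit seconds counter; the helper name wrap_int32 is made up for this example and has nothing to do with the kernel patches themselves.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit time_t can hold


def wrap_int32(value):
    """Reduce an integer to the range of a signed 32-bit counter (hypothetical helper)."""
    return (value + 2**31) % 2**32 - 2**31


# Last second representable by a signed 32-bit seconds-since-epoch counter.
last_second = EPOCH + timedelta(seconds=INT32_MAX)
print(last_second.isoformat())  # 2038-01-19T03:14:07+00:00

# One second later the counter wraps to the most negative value,
# which corresponds to a date in December 1901.
wrapped = wrap_int32(INT32_MAX + 1)
print((EPOCH + timedelta(seconds=wrapped)).isoformat())
```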

Separately, Ingo Molnar followed up with "timer fix" including "A single ARM Juno clocksource driver fix".

Corey Minyard posted "4.12 for IPMI" including a watchdog fix. He "switched over to github at Stephen Rothwell's [linux-next maintainer] request".

Jonathan Corbet posted "Docs for 4.12" which includes "a new guide for user-space API documents" along with many other updates. Anil Nair noted "Missing File REPORTING-BUGS in Linux Kernel" which suggests that the Debian kernel package tools need to be taught about the recent changes in the kernel's documentation layout. Separately, Jonathan replied to a thread entitled "Find more sane first words we have to say about Linux" noting that the kernel's documentation files might not be the first place that someone completely new to Linux is going to go looking for information: "So I don't doubt we could put something better there, but can we think for a moment about who the audience is here? If you're "completely new to Linux", will you really start by jumping into the kernel source tree?" The guy should do kernel standup in addition to LWN. It'd be hilarious.

Later, Jon posted "A few small documentation updates" which "Connect the newly RST-formatted documentation to the rest; this had to wait until the input pull was done. There's also a few small fixes that wandered in".

Tejun Heo posted "libata changes for 4.12-rc1" which includes "removal of SCT WRITE SAME support, which never worked properly". SCT stands for "SMART [Self-Monitoring, Analysis and Reporting Technology - an error management mechanism common in contemporary disks] Command Transport". The "write same" part means to set the drive content to a specific pattern (e.g. to zero it out) in cases that TRIM is not available. One wonders if that is also a feature used during destruction, though apparently the only (NSA) trusted way to destroy disks today is shredding and burning after zeroing.

Tejun Heo also posted "workqueue changes for v4.12-rc1", which includes "One trivial patch to use setup_deferrable_timer() instead of open-coding the initialization".

Tejun Heo also posted "cgroup changes for v4.12-rc1", which includes a "second stab at fixing the long-standing race condition in the mount path and suppression of spurious warning from cgroup_get".

Rafael J. Wysocki posted "Power management updates for v4.12-rc1, part 1" which includes many updates to the cpufreq subsystem and "to the intel_pstate driver in particular". Its sysfs interface has apparently also been reworked to be more consistent with general expectations. He adds "Apart from that, the AnalyzeSuspend utility for system suspend profiling gets a companion called AnalyzeBoot for the analogous profiling of system boot and they both go into one place".

Separately, he posted "Power management updates for v4.12-rc1, part 2", which "add new CPU IDs [Intel Gemini Lake] to a couple of drivers [intel_idle and intel_rapl - Running Average Power Limit], fix a possible NULL pointer dereference in the cpuidle core, update DT [DeviceTree]-related things in the generic power domains framework and finally update the suspend/resume infrastructure to improve the handling of wakeups from suspend-to-idle".

Rafael J. Wysocki also posted "ACPI updates for v4.12-rc1, part 1", which includes a new Operation Region driver for the Intel CHT [Cherry Trail] Whiskey Cove PMIC [Power Management Integrated Circuit], and new sysfs entries for CPPC [Collaborative Processor Performance Control], which is a much more fine grained means for OS and firmware to coordinate on power management and CPU frequency/performance state transitions.

Separately, he posted "ACPI updates for v4.12-rc1, part 2", which "update the ACPICA [ACPI - Advanced Configuration and Power Interface - Component Architecture, the cross-Operating System reference code]" to "add a few minor fixes and improvements", and also "update ACPI SoC drivers with new device IDs, platform-related information and similar, fix the register information in xpower PMIC [Power Management IC] driver, introduce a concept of "always present" devices to the ACPI device enumeration code and use it to fix a problem with one platform [INT0002, Intel Cherry Trail], and fix a system resume issue related to power resources".

Separately, Benjamin Tissoires posted a patch reverting some ACPI laptop lid logic that had been introduced in Linux 4.10 but was preventing laptops from booting with the lid closed (a feature folks especially in QE use).

Rafael J. Wysocki also posted "Generic device properties framework updates for v4.12-rc1", which includes various updates to the ACPI _DSD [Device Properties] method call to recognize "ports and endpoints".

Shaohua Li posted "MD update for 4.12" which includes support for the "Partial Parity Log" feature present on the Intel IMSM RAID array, and a rewrite of the underlying MD bio (the basic storage IO concept used in Linux) handling. He notes "Now MD doesn't directly access bio bvec, bi_phys_segments and uses modern bio API for bio split".

Ulf Hansson posted "MMC for v[.]4.12" which includes many driver updates as well as refactoring of the code to "prepare for eMMC CMDQ and blkmq". This is the planned transition to blkmq (block-multiqueue) for such storage devices. Previously it had stalled due to the performance hit when trying to use a multi-queue approach on legacy and contemporary non-mq devices.

Linus Walleij posted "pin control bulk changes for v4.12" in which he notes that "The extra week before the merge window actually resulted in some of the type of fixes that usually arrive after the merge window already starting to trickle in from eager developers using -next, I'm impressed". He's also impressed with the new "Samsung subsystem maintainer" (Krzysztof). Of the many changes, he says "The most pleasing to see is Julia Cartwright[']s work to audit the irqchip-providing drivers for realtime locking compliance. It's one of those "I should really get around to looking into that" things that have been on my TODO list since forever".

Linus Walleij also posted "Bulk GPIO changes for v4.12", which has "Nothing really exciting goes on here this time, the most exciting for me is the same as for pin control: realtime is advancing thanks [t]o Julia Cartwright".

Petr Mladek posted "printk for 4.12" which includes a fix for the "situation when early console is not deregistered because the preferred one matches a wrong entry. It caused messages to appear twice".

Jiri Kosina posted "HID for 4.12" which includes various fixes, amongst them being an inversion of the HID_QUIRK_NO_INIT_REPORTS to the opposite due to the fact that it is apparently easier to whitelist working devices.

Jiri Kosina also posted "livepatching for 4.12" which includes a new "per-task consistency model" that is "being added for architectures that support reliable stack dumping", which apparently "extends the nature of the types of patches that can be applied by live patching".

Lee Jones posted "Backlight for 4.12" which includes various fixes.

Lee Jones also posted "MFD for v4.12" which includes some new drivers, new device support, and various new functionality and fixes.

Juergen Gross posted "xen: fixes and features for 4.12" which includes support for building the kernel with Xen enabled but without enabling paravirtualization, a new 9pfs xen frontend driver(!), and support for EFI "reset_system" (needed for ARMv8 Dom0 host to reboot), among various other fixes and cleanups.

Alex Williamson posted "VFIO updates for v4.12-rc1".

Joerg Roedel posted "IOMMU Updates for Linux v4.12", which includes "Some code optimizations for the Intel VT-d driver", code to "switch off a previously enabled Intel IOMMU" (presumably in order to place it into bypass mode for performance or other reasons?), and "ACPI/IORT updates and fixes" (which enables full support for the ACPI IORT on 64-bit ARM).

Dmitry Torokhov posted "Input updates for v.4.11-rc0" which includes a documentation conversion to ReST (RST, the new kernel doc format), an update to the venerable Synaptics "PS/2" driver to be aware of companion "SMBus" devices and various other miscellaneous fixes.

Darren Hart posted "platform-drivers-x86 for 4.12-1" which includes "a significantly larger and more complex set of changes than those of prior merge windows". These include "several changes with dependencies on other subsystems which we felt were best managed through merges of immutable branches".

James Bottomley posted "first round of SCSI updates for the 4.11+ merge window", which includes many driver updates, but also comes with a warning to Linus that "The major thing you should be aware of is that there's a clash between a char dev change in the char-misc tree (adding the new cdev_device_add) and the make checking the return value of scsi_device_get() mandatory". Linus and Greg would later clarify what cdev_device_add does in response to Greg's request to pull "Char/Misc driver patches for 4.12-rc1".

David Miller posted "Networking" which includes many fixes.

David also posted "Sparc", which includes a "bug fix for handling exceptions during bzero on some sparc64 cpus".

David also posted "IDE", which includes "two small cleanups".

Greg K-H (Kroah-Hartman) posted "USB driver patches for 4.12-rc1", which includes "Lots of good stuff here, after many many many attempts, the kernel finally has a working typeC interface, many thanks to Heikki and Guenter and others who have taken the time to get this merged. It wasn't an easy path for them at all." It will be interesting to test that out!

Greg K-H also posted "Driver core patches for 4.12-rc1", which is "very tiny" this time around and consists mostly of documentation fixes, etc.

Greg K-H also posted "Char/Misc driver patches for 4.12-rc1" which features "lots of new drivers" including Google firmware drivers, FPGA drivers, etc. This led to a reaction from Linus about how the tree conflicted with James Bottomley's tree (which he had already pulled, "as per James' suggestion"), and a back and forth between James and Greg about how to better handle such a conflict next time, and Linus noting that he prefers to fix merge conflicts himself but "*also* really really prefer the two sides of the conflict having been more aware of the clash" and providing him with a heads-up in the pull.

Greg K-H also posted "Staging/IIO driver fixes for 4.12-rc1", which adds "about 350k new lines of crap^Wcode, mostly all in a big dump of media drivers from Intel". He notes that the Android low memory killer driver has finally been deleted "much to the celebration of the -mm developers".

Greg K-H also posted "TTY patches for 4.12-rc1", which wasn't big.

Dan Williams posted "libnvdimm for 4.12" which includes "Region media error reporting [a generic interface more friendly to use with multiple namespaces]", a new "struct dax_device" to allow drivers to have their own custom direct access operations, and various other updates. Dan also posted "libnvdimm: band aid btt vs clear poison locking", a patch which "continues the 4.11 status quo of disabling of error clearing from the BTT [Block Translation Table] I/O path" and notes that "A solution for tracking and handling media errors natively in the BTT is needed". The BTT or Block Translation Table is a mechanism used by NV-DIMMs to handle "torn sectors" (partially complete writes) in hardware during error or power failure. As the "btt.txt" in the kernel documentation notes, NV-DIMMs do not have the same atomicity guarantees as regular flash drives do. Flash drives have internal logic and store enough energy in capacitors to complete outstanding writes during a power failure (rotational drives have something similar for flushing their memory based caches and syncing remap block state) but NV-DIMMs are designed differently. Thus the BTT provides a level of indirection that is used to provide for atomic sector semantics.
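
To make the "level of indirection" idea a little more concrete, here is a deliberately simplified sketch. It is not the on-media BTT layout described in btt.txt: the ToyBTT class, the single spare block, and the fail_before_commit flag are all invented for illustration, and the map update is simply assumed to be atomic.

```python
class ToyBTT:
    """Toy block translation table: logical sector -> physical block."""

    def __init__(self, nsectors):
        # One spare physical block beyond the logical capacity.
        self.media = [b""] * (nsectors + 1)
        self.map = list(range(nsectors))  # logical sector -> physical block
        self.free = nsectors              # index of the currently free block

    def read(self, lba):
        return self.media[self.map[lba]]

    def write(self, lba, data, fail_before_commit=False):
        # Step 1: write the new data into the free block. The block that
        # currently backs this sector is left untouched.
        self.media[self.free] = data
        if fail_before_commit:
            return  # simulated power loss: the map still points at the old data
        # Step 2: commit by swapping a single map entry (assumed atomic here).
        # The old block becomes the new free block.
        self.map[lba], self.free = self.free, self.map[lba]
```

Because a reader only ever follows the map, an interrupted write leaves the old sector contents fully visible rather than a half-written mix of old and new data, which is the property the "torn sector" discussion is about.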

Separately, Dan posted "libnvdimm fixes for 4.12-rc1" which includes "incremental fixes and a small feature addition relative to the libnvdimm 4.12 pull request". Gert had "noticed that tinyconfig was bloated by BLOCK selecting DAX [Direct Access]", while "Vishal adds a feature that missed the initial pull due to pending review feedback. It allows the kernel to clear media errors when initializing a BTT (atomic sector update driver) instance on a pmem namespace".

Dave Airlie posted "drm tegra for 4.12-rc1" containing additional updates because he had missed a pull from Thierry Reding for NVidia Tegra patches. He also followed up with a "drm document code of conduct" patch that describes a code of conduct for graphics written by freedesktop.org.

Stafford Horne posted "Initramfs fix for 4.12-rc1" containing a fix "for an issue that has caused 4.11 to not boot on OpenRISC".

Catalin Marinas posted "arm64 updates for 4.12" including kdump support, "ARMv8.3 HWCAP bits for JavaScript conversion instructions, complex numbers and weaker release consistency [memory ordering]", and platform (non-enumerated bus) MSI support when using ACPI, among other patches. He also removes support for ASID-tagged VIVT [Virtually Indexed, Virtually Tagged] caches since "no ARMv8 implementation using it and deprecated in the architecture" [caches are PIPT - Physically Indexed, Physically Tagged - except that an implementation might do VIPT or otherwise internally using various CAM optimizations].

Catalin later posted "arm64 2nd set of updates for 4.12", which include "Silence module allocation failures when CONFIG_ARM*_MODULE_PLTS is enabled".

Olof Johansson posted "ARM: SoC contents for 4.12 merge window". In his pull request, Olof notes that "It's been a relatively quiet release cycle here. Patch count is about the usual (818 commits, which includes merges)." He goes on to add, "Besides dts [DeviceTree files], the mach-gemini cleanup by Linus Walleij is the only platform that pops up on its own". He called out the separate post for the TEE [Trusted Execution Environment] subsystem. Olof also removed Alexandre Courbot and Stephen Warren from NVidia Tegra maintainership, and added Jon Hunter in their place.

Rob Herring posted "DeviceTree for 4.12", which includes updates to the Device Tree Compiler (dtc), and more DeviceTree overlay unit tests, among various other changes.

Darrick J. Wong posted "xfs: updates for 4.12", which includes the "big new feature for this release" of a "new space mapping ioctl that we've been discussing since LSF2016 [Linux Storage and Filesystem conference]".

Max Filippov posted "Xtensa improvements for 4.12".

Ted Ts'o posted "ext4 updates for 4.12", which adds "GETFSMAP support" (discussed previously in this podcast) among other new features.

Ted also posted "fscrypt updates for 4.12" which has "only bug fixes".

Paul Moore posted "Audit patches for 4.12" which includes 14 patches that "span the full range of fixes, new features, and internal cleanups". These include a move to 64-bit timestamps, converting refcounts to the new refcount_t type from atomic_t, and so on.

Wolfram Sang posted "i2c for 4.12".

Mark Brown posted "regulator updates for 4.12", which includes "Quite a lot going on with the regulator API for this release, much more in the core than in the drivers for a change". This includes "Fixes for voltage change propagation through dumb power switches, a notification when regulators are enabled, a new settling time property for regulators where the time taken to move to a new voltage is not related to the size of the change", etc.

Mark also posted "SPI updates for 4.12", which includes "quite a lot of small driver specific fixes and enhancements".

Jessica Yu posted "module updates for 4.12", containing minor fixes.

Mauro Carvalho Chehab posted "media updates" including mostly driver updates and the removal of "two staging LIRC drivers for obscure hardware". He also posted a 5 part patch series entitled "Convert more books to ReST", which converted three kernel DocBook format documentation file sets to RST, the new format being used for kernel documentation (on the kernel-doc mailing list, and maintained by Jonathan Corbet of LWN): librs, mtdnand, and sh. He noted that "After this series, there will be just one DocBook pending conversion: lsm (Linux Security Modules)". He also notes that the existing LSM documentation is very out of date and no longer describes the current API.

Michael Ellerman posted "Please pull powerpc/linux.git powerpc-4.12-1 tag", which includes support for "Larger virtual address space on 64-bit server CPUs. By default we use a 128TB virtual address space, but a process can request access to the full 512TB by passing a hint to mmap() [this seems very similar to the 57-bit la57 feature from Intel]". It also includes "TLB flushing optimisations for the radix MMU on Power9" and "Support for CAPI cards on Power9, using the "Coherent Accelerator Interface Architecture 2.0" [which definitely sounds like juicy reading]".
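
As a quick sanity check on those sizes, the sketch below converts address widths to sizes in TiB. The 47-bit and 49-bit widths are inferred here from the quoted 128TB and 512TB figures rather than taken from the pull request, and the function name is made up for the example.

```python
TIB = 2**40  # bytes in one tebibyte


def address_space_tib(bits):
    """Size in TiB of a flat address space that is `bits` wide."""
    return 2**bits // TIB


print(address_space_tib(47))  # 128
print(address_space_tib(49))  # 512
```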

Separately Michael Ellerman posted "Please pull powerpc-linux.git powerpc-4.12-2 tag" which includes "rework the Linux page table geometry to lower memory usage on 64-bit Book3S (IBM chips) using the Hash MMU [IBM uses a special inverted page table "reverse lookup" hashing format]".

Eric W. Biederman posted "namespace related changes for v4.12-rc1", which includes a "set of small fixes that were mostly stumbled over during more significant development. This proc fix and the fix to posix-timers are the most significant of the lot. There is a lot of good development going on but unfortunately it didn't quite make the merge window".

Takashi Iwai posted "sound updates for 4.12-rc1", noting that it was "a relatively calm development cycle, no scaring changes are seen".

Steven Rostedt posted "tracing: Updates for v4.12" which includes "Pretty much a full rewrite of the process of function probes". He followed up with "Three more updates for 4.12" that contained "three simple changes".

Martin Schwidefsky posted "s390 patches for 4.12 merge window" which includes improvements to VFIO support on mainframe(!) [this author was recently amazed to see there are also DPDK ports for s390x], a new true random number generator, perf counters for the new z13 CPU, and many others besides.

Geert Uytterhoeven posted "m68k updates for 4.12" with a couple fixes.

Jacek Anaszewski posted "LED updates for 4.12" with various fixes.

Kees Cook posted "usercopy updates for v4.12-rc1" with a couple fixes.

Kees also posted "pstore updates for v4.12-rc1", which included "large internal refactoring along with several smaller fixes".

James Morris posted "Security subsystem updates for v4.12".

Sebastian Reichel posted "hsi changes for hsi-4.12".

Sebastian also posted "power-supply changes for 4.12", which includes a couple of new drivers and various fixes.

Separately, Sebastian posted "power-supply changes for 4.12 (part 2)", which includes some new drivers and some fixes.

Paolo Bonzini posted "First batch of KVM changes for 4.12 merge window" which includes kexec/kdump support on 32-bit ARM, support for a userspace virtual interrupt controller to handle the "weird" Raspberry Pi 3, in-kernel acceleration for VFIO on POWER, nested EPT support for accessed and dirty bits on x86, and many other fixes and improvements besides.

Separately Paolo posted "Second round of KVM changes for 4.12", which include various ARM (32 and 64-bit) cleanups, support for PPC [POWER] XIVE (eXternal Interrupt Virtualization Engine), and "x86: nVMX improvements, including emulated page modification logging (PML) which brings nice performance improvements [under nested virtualization] on some workloads".

Ilya Dryomov posted "Ceph updates for 4.12-rc1", which include "support for disabling automatic rbd [RADOS block device] exclusive lock transfers" and "the long awaited -ENOSPC [no space] handling series". The latter finally handles out of space situations by aborting with -ENOSPC rather than "having them [writers] block indefinitely".

Miklos Szeredi posted "fuse updates for 4.12", which "contains support for pid namespaces from Seth and refcount_t work from Elena".

Miklos also posted "overlayfs update for 4.12", which includes "making st_dev/st_ino on the overlay behave like a normal filesystem". "Currently this only works if all layers are on the same filesystem, but future work will move the general case towards more sane behavior".

Bjorn Helgaas posted "PCI changes for v4.12" which includes a framework for supporting PCIe devices in Endpoint mode from Kishon Vijay Abraham, fixes for using non-posted PCI config space on ARM from Lorenzo Pieralisi, allowing slots below PCI-to-PCIe "reverse bridges", a bunch of quirks, and many other fixes and enhancements.

Jaegeuk Kim posted "f2fs for 4.12-rc1", which "focused on enhancing performance with regards to block allocation, GC [Garbage Collection], and discard/in-place-update IO controls".

Shuah Khan posted "Kselftest update for 4.12-rc1" with a few fixes.

Richard Weinberger posted "UML changes for v4.12-rc1" which includes "No new stuff, just fixes" to the "User Mode Linux" architecture in 4.12. Separately, Masami Hiramatsu posted an RFC patch entitled "Output messages to stderr and support quiet option" intended to "fix[] some boot time printf output to stderr by adding os_info() and os_warn(). The information-level messages via os_info() are suppressed when "quiet" kernel option is specified".

Richard also posted "UBI/UBIFS updates for 4.12-rc1", which "contains updates for both UBI and UBIFS". It has a new CONFIG_UBIFS_FS_SECURITY option, among "minor improvements" and "random fixes".

Thierry Reding posted "pwm: Changes for v4.12-rc1", which amongst other things includes "a new driver for the PWM controller found on MediaTek SoCs".

Vinod Koul posted "dmaengine updates" which includes "a smaller update consisting of support for TI DA8xx dma controller" among others.

Chris Mason posted "Btrfs" which "Has fixes and cleanups" as well as "The biggest functional fixes [being] between btrfs raid5/6 and scrub".

Trond Myklebust posted "Please pull NFS client fixes for 4.12", which includes various fixes, and new features (such as "Remove the v3-only data server limitation on pNFS/flexfiles").

J. Bruce Fields posted "nfsd changes for 4.12", which includes various RDMA updates from Chuck Lever.

Stephen Boyd posted "clk changes for v4.12". Of the changes, the "biggest things are the TI clk driver rework to lay the groundwork for clkctrl support in the next merge window and the AmLogic audio/graphics clk support".

Alexandre Belloni posted "RTC [Real Time Clock] for 4.12", which uses a new GPG subkey that he also let Linus know about at the same time.

Nicholas A. Bellinger posted "target updates for v4.12-rc1", which was "a lot more calm than previously expected. It's primarily fixes in various areas, with most of the new functionality centering around TCMU [TCM - Linux iSCSI Target Support in Userspace] backend work with Xiubo Li has been driving".

Zhang Rui posted "Thermal management updates for v4.12-rc1", which includes a number of fixes, as well as some new drivers, and a new interface in "thermal devfreq_cooling code so that the driver can provide more precise data regarding actual power to the thermal governor every time the power budget is calculated".

4.12 git pulls for new subsystems and features

David Howells posted "Hardware module parameter annotation for secure boot" in which he requested that Linus pull in new "kmod" macros (the same name is used for the userspace module tooling, but in this case refers to the in-kernel kernel module infrastructure of the same name). The new macros add annotations to "module_param" of the new form "module_param_hw" with a "hwtype" such as "ioport" or "iomem", and so forth. These are used by the kernel to prevent those parameters from being used under a UEFI Secure Boot situation in which the kernel is "locked down" (to prevent someone from loading a signed kernel image and then compromising it to circumvent the secure boot mechanism).

Arnd Bergmann sent a special pull request to Linus Torvalds for "TEE driver infrastructure and OP-TEE drivers", which "introduces a generic TEE [Trusted Execution Environment] framework in the kernel, to handle trusted environ[ments] (security coprocessor or software implementations such as OP-TEE/TrustZone)". He sent the pull separately from the other arm-soc pull specifically to call it out, and to make sure everyone knew that this was finally headed upstream, but he noted it would probably be maintained through the arm-soc kernel tree. He included a lengthy defense of why now was the right time to merge TEE support into upstream Linux.

Saving TLB flushes on Intel x86 Architecture

Andy Lutomirski posted an RFC patch series entitled "x86 TLB flush cleanups, moving toward PCID support". Modern (non-legacy) architectures implement a per-process context identifier that can be used to tag VMA (Virtual Memory Area) translations that end up in the TLB (Translation Lookaside Buffer) caches within the microprocessor core. The processor's hardware page table "walkers" navigate the page tables for a process and populate the TLBs (except in some mostly embedded cases, such as on certain PowerPC and MIPS processors, in which the kernel contains special assembly routines to perform this in software). On legacy architectures, the TLB is fairly simple, containing a plain mapping from virtual address to physical (or intermediate, in the case of virtualization) address. But on more sophisticated architectures, the TLB includes address space identification information that allows the TLB to distinguish between hits to the same virtual address that are from two different processes (known as tasks from within the kernel). Using additional tagging in the TLB avoids the traditional need to invalidate the entire TLB on process context switch.

Modern architectures, such as AArch64, have implemented context tagging support in their architecture code for some time, and now x86 is finally set to follow, enabling a feature that has actually been present in x86 for some time (but was not wired up), thanks to Andy's work on PCID (Process Context IDentifier) support. In his patch series, Andy notes that as he has been "polishing [his] PCID code, a major problem [he's] encountered is that there are too many x86 TLB flushing code paths and that they have too many inconsequential differences". This patch series aims to "clean up the mess". Now if x86 finally gains hardware broadcast TLB invalidations it will also be able to remove the wasted IPIs (Inter-Processor-Interrupts) that it implements to cause remote processors to invalidate TLB entries, too. Linus liked Andy's initial work, but said he is "always a bit nervous about TLB changes like this just because any potential bugs tend to be really really hard to see and catch". Those of us who have debugged nasty TLB issues on other architectures would be inclined to agree with him.
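
To make the benefit of tagging concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the ToyTLB class, its methods, and the page numbers are invented for illustration and do not correspond to any kernel code or real hardware layout. It compares an untagged TLB, which must be emptied on every context switch, with a tagged one that keeps entries for several address spaces side by side.

```python
class ToyTLB:
    """Toy TLB model (hypothetical; not how any real CPU is organised)."""

    def __init__(self, tagged):
        self.tagged = tagged
        self.entries = {}      # (tag, virtual page) -> physical page
        self.current = None    # address-space ID of the running task
        self.full_flushes = 0

    def switch_to(self, asid):
        if not self.tagged and asid != self.current:
            # Without tags, entries left by the old task would alias the
            # new task's virtual addresses, so everything has to go.
            self.entries.clear()
            self.full_flushes += 1
        self.current = asid

    def _tag(self):
        return self.current if self.tagged else None

    def lookup(self, vpage):
        return self.entries.get((self._tag(), vpage))

    def fill(self, vpage, ppage):
        self.entries[(self._tag(), vpage)] = ppage


def run(tagged, rounds=3):
    """Alternate between two tasks that both touch virtual page 0x10."""
    tlb = ToyTLB(tagged)
    backing = {1: 0xA0, 2: 0xB0}   # same virtual page, different frames
    misses = 0
    for _ in range(rounds):
        for asid in (1, 2):
            tlb.switch_to(asid)
            if tlb.lookup(0x10) is None:
                misses += 1                    # a page table walk would happen here
                tlb.fill(0x10, backing[asid])
            assert tlb.lookup(0x10) == backing[asid]
    return misses, tlb.full_flushes
```

In this model the untagged TLB misses after every switch, while the tagged one misses only the first time each task touches the page.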

Ongoing Development

Laurent Dufour posted version 3 of a patch series entitled "Speculative page faults". This is a contemporary development inspired by Peter Zijlstra's earlier work, which was based upon ideas of still others. The whole concept dates back to at least 2009 and generally involves removing the traditional locking constraints of updates to VMAs (Virtual Memory Areas) used by Linux tasks (processes) to represent the memory of running programs. Essentially, a "speculative fault" means "not holding mmap_sem" (a semaphore guarding a task's current memory map). Laurent (and Peter) make VMA lookups lockless, and perform updates speculatively, using a seqlock to detect a change to the underlying VMA during the fault. "Once we've obtained the page and are ready to update the PTE, we validate if the state we started the fault with is still valid, if not, we'll fail the fault with VM_FAULT_RETRY, otherwise we update the PTE and we're done". Earlier testing showed very significant performance upside to this work due to the reduced lock contention.
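
The snapshot, work, validate pattern described above can be sketched in a few lines of Python. This is a single-threaded illustration only: SeqCount, ToyVMA, speculative_fault and the interfere callback (which stands in for a concurrent writer) are hypothetical names, and the real patches deal with memory ordering and page table locking that this sketch ignores.

```python
VM_FAULT_RETRY = "retry"
VM_FAULT_DONE = "done"


class SeqCount:
    """Minimal sequence counter: odd while a writer is mid-update."""

    def __init__(self):
        self.sequence = 0

    def write_begin(self):
        self.sequence += 1

    def write_end(self):
        self.sequence += 1

    def read_begin(self):
        return self.sequence

    def read_retry(self, start):
        # Retry if a write was in progress at the start or happened since.
        return (start & 1) != 0 or self.sequence != start


class ToyVMA:
    def __init__(self, start, end):
        self.start, self.end = start, end
        self.seq = SeqCount()


def speculative_fault(vma, address, page_table, interfere=None):
    snapshot = vma.seq.read_begin()
    if (snapshot & 1) or not (vma.start <= address < vma.end):
        return VM_FAULT_RETRY
    page = ("page-for", address)   # stands in for allocating a page
    if interfere is not None:
        interfere(vma)             # simulate a writer changing the VMA
    if vma.seq.read_retry(snapshot):
        return VM_FAULT_RETRY      # state changed: do not touch the PTE
    page_table[address] = page
    return VM_FAULT_DONE


def shrink(vma):
    """A writer that truncates the VMA under the sequence counter."""
    vma.seq.write_begin()
    vma.end = 0x1400
    vma.seq.write_end()
```

A caller that receives VM_FAULT_RETRY would fall back to the ordinary, locked fault path.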

Aaron Lu posted "smp: do not send IPI if call_single_queue not empty". The Linux kernel (and most others) uses a construct known as an IPI - or Inter-Processor-Interrupt - a form of software generated interrupt that a processor will send to one or more others when it needs them to perform some housekeeping work on the kernel's behalf. Usually, this is to handle such things as a TLB shootdown (invalidating a virtual address translation in a remote processor due to a virtual address space being removed), especially on less sophisticated legacy architectures that do not feature invalidation of TLBs through hardware broadcast, though there are many other uses for IPIs. Aaron's patch realizes, effectively, that if a remote processor is already going to process a queue of CSD (call_single_data) function calls it has been asked to via IPI then there is no need to send another IPI and generate additional interrupts - the queue will be drained of this entry as well as existing entries by the IPI management code.
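
A rough model of that observation follows, assuming a per-CPU list of pending calls. ToyCPU and queue_call are made-up names for illustration; the real kernel uses a lockless list and architecture-specific IPI delivery, neither of which is modelled here.

```python
class ToyCPU:
    """Hypothetical stand-in for a remote processor's per-CPU state."""

    def __init__(self):
        self.call_single_queue = []
        self.ipis_received = 0

    def handle_ipi(self):
        # Drain everything queued so far in one pass.
        pending, self.call_single_queue = self.call_single_queue, []
        for func in pending:
            func()


def queue_call(cpu, func):
    """Queue func for cpu; interrupt it only if nothing was pending."""
    was_empty = not cpu.call_single_queue
    cpu.call_single_queue.append(func)
    if was_empty:
        # A non-empty queue means an IPI is already on its way and its
        # handler will pick up this entry too.
        cpu.ipis_received += 1
    return was_empty
```

Queuing three calls back to back therefore costs one interrupt rather than three, and a single handler pass runs all of them.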

Romain Perier posted version 8 of "Replace PCI pool by DMA pool API" which realizes that the current PCI pool API uses "simple macro functions direct expanded to the appropriate dma pool functions", so it simply replaces them with a direct use of the corresponding DMA pool API instead.

Sandhya Bankar posted "vfs: Convert file allocation code to use the IDR". This replaces existing filesystem code that allocates file descriptors using a custom allocator with Matthew (Willy) Wilcox's idr (ID Radix) tree allocator.
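
For readers unfamiliar with what an ID allocator of this kind provides, the sketch below shows the interface in miniature: hand out the lowest free integer, map it to an object, and recycle it on removal. It is an assumption-laden illustration, not the kernel's implementation: ToyIdAllocator is a hypothetical name, and it uses a dict plus a min-heap where the real IDR uses a radix tree.

```python
import heapq


class ToyIdAllocator:
    """Lowest-free-ID allocator (hypothetical; dict + heap, not a radix tree)."""

    def __init__(self):
        self.objects = {}      # id -> object
        self.freed = []        # min-heap of released IDs
        self.next_unused = 0   # every ID >= this has never been handed out

    def alloc(self, obj):
        if self.freed:
            # Released IDs are always below next_unused, so the heap
            # minimum is the lowest free ID overall.
            new_id = heapq.heappop(self.freed)
        else:
            new_id = self.next_unused
            self.next_unused += 1
        self.objects[new_id] = obj
        return new_id

    def find(self, id_):
        return self.objects.get(id_)

    def remove(self, id_):
        obj = self.objects.pop(id_)
        heapq.heappush(self.freed, id_)
        return obj
```

Reusing the lowest free number matches the behaviour programs expect from file descriptors.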

Serge E. Hallyn posted a resend of version 2 of a patch series entitled "Introduce v3 namespaced file capabilities". We covered this last time.

Heinrich Schuchardt posted "arm64: Always provide "model name" in /proc/cpuinfo", which was quickly shot down (for the moment).

Christian König posted version 5 of his "Resizeable PCI BAR support" patch series. We have featured this in a previous episode of the podcast.

Prakash Sangappa posted "hugetlbfs 'noautofill' mount option" which aims to allow (optionally) for hugetlbfs pseudo-filesystems to be mounted with an option which will not automatically populate holes in files with zeros during a page fault when the file is accessed through the mapped address. This is intended to benefit applications such as Oracle databases, which make heavy use of such mechanisms but don't take kindly to the kernel having side effects that change on-disk files, even if only by zero fill. Dave Hansen pushed back against this change saying that it was "further specializing hugetlbfs" and that Oracle should be using userfaultfd or "an madvise() option that disallows backing allocations". Prakash replied that they had considered those but with a database there are such a large number of single threaded processes that "The concern with using userfaultfs is the overhead of setup and having an additional thread per process".

Sameer Goel posted "arm64: Add translation functions for /dev/mem read/write" which "Port architecture specific xlate [translate] and unxlate [untranslate] functions for /dev/mem read/write. This sets up the mapping for a valid physical address if a kernel direct mapping is not already present". Depending upon the ARM platform, access to a bad address in /dev/mem could result in a synchronous exception in the core, or a System Error (SError) generated by a system memory controller interface. In either case it is handled as a fatal error, which is not true on x86. While access to /dev/mem is restricted, increasingly being deprecated, and has other semantics to prevent its use on 64-bit ARM systems, it still exists and is used, in this case to read the ACPI FPDT table, which provides performance pointer records. Nevertheless, both Will Deacon and Leif Lindholm objected to the reasoning given here, saying that the kernel should instead be taught how to parse this table and expose its information via /sys rather than having userspace tools go poking in /dev/mem to try to read from the table directly.

Minchan Kim posted "vmscan: scan pages until it f[inds] eligible pages" in which he notes that "There are premature OOM [Out Of Memory killer invocations] happening. Although there are ton of free swap and anonymous LRU list of eligible zones, OOM happened. With investigation, skipping page of isolate_lru_pages makes reclaim void because it returns zero nr_taken easily so LRU shrinking is effectively nothing and just increases priority aggressively. Finally, OOM happens".

Julius Werner posted version 3 of his "Memconsole changes for new coreboot format" which teaches the Google firmware driver for their memconsole to deal with the newer type of persistent ring buffer console they introduced.

Olliver Schinagl and Jamie Iles had a back and forth about the latter's work on "glue-code" (generic handling code) for the DW (DesignWare) 8250 (a type of serial port interface made popular by PC) IP block as used in many different designs. Depending upon how the block is configured, it can behave differently, and there was some discussion about how to handle that. In particular the location of the UART_USR register.

Xiao Guangrong posted "KVM: MMU: fast write protect" which "introduces a[n] extremely fast way to write protect all the guest memory. Comparing with the ordinary algorithm which write protects last level sptes [the page table entries used by the guest] based on the rmap [the "reverse" map, the means that Linux uses to encode page table information within the kernel] one by one, it just simply updates the generation number to ask all vCPUs to reload its root page table, particularly it can be done out of mmu-lock". The idea was apparently originally proposed by Avi (Kivity). Paolo Bonzini thought "This is clever" and wondered "how the alternative write protection mechanism would affect performance of the dirty page ring buffer patches". Xiao thought it could be used to speed up those patches after merging, too [Paolo noted that he aims to merge these early in 4.13 development].
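
The generation-number trick can be pictured with a small model. Everything here is an assumption for illustration: ToyMMU, ToyVCPU and their fields are invented, and the sketch says nothing about how KVM actually rebuilds shadow page tables. The point it tries to show is only that the "write protect everything" request becomes a single counter bump, with the per-vCPU work deferred until each vCPU next notices the mismatch.

```python
class ToyMMU:
    """Hypothetical shared state: one generation number for all vCPUs."""

    def __init__(self):
        self.generation = 0

    def write_protect_all(self):
        # Constant-time request: no walk over individual mappings.
        self.generation += 1


class ToyVCPU:
    def __init__(self, mmu):
        self.mmu = mmu
        self.root_generation = mmu.generation
        self.root_reloads = 0
        self.guest_memory_writable = True

    def enter_guest(self):
        # A stale generation means the root page table must be reloaded,
        # and the freshly loaded tables start out write-protected.
        if self.root_generation != self.mmu.generation:
            self.root_generation = self.mmu.generation
            self.root_reloads += 1
            self.guest_memory_writable = False
```

Each vCPU pays the reload cost once per generation change, when it next enters the guest.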

Bogdan Mirea posted version 2 of "Add "Preserve Boot Time Support"", which follows up on a previous discussion about retaining "Boot Time Preservation between Bootloader and Linux Kernel. It is based on the idea that the Bootloader (or any other early firmware) will start the HW Timer and Linux Kernel will count the time starting with the cycles elapsed since timer start". By "Bootloader" he means "firmware" to those who live in x86-land.

Igor Stoppa posted "post-init-read-only protection for data allocated dynamically" which aims to provide a mechanism for dynamically allocated data which is similar to the "__read_only" special linker section that certain annotated (using special GCC directives) code will be placed into. That works great for read-only data (which is protected by the MMU locking down the corresponding region early in boot). His "wish" is to start with the "policy DB of SE Linux and the LSM Hooks, but eventually I would like to extend the protection also to other subsystems, in a way that can merged into mainline." His patch includes an analysis of how he feels he can be as "little invasive as possible", noting that "In most, if not all, the cases that could be enhanced, the code will be calling kmalloc/vmalloc, including GFP_KERNEL [Get Free Pages of Kernel Type Memory] as the desired type of memory". Consequently, he says, "I suspect/hope that the various maintainer[s] won't object too much if my changes are limited to replacing GFP_KERNEL with some other macro, for example what I previously called GFP_LOCKABLE". Michal Hocko had some feedback, largely along the lines that a "master toggle" (that would allow protection to be disabled for small periods in order to make changes to "read only" data) was pointless, since it re-exposes the data. Instead, he wanted to see the protection being done at the kmem_cache_create time by adding a "SLAB_SEAL" parameter that would later be enabled on a per kmem_cache basis using "kmem_cache_seal(cache)" or a similar mechanism.

Bharat Bhushan posted "ARM64/PCI: Allow userspace to mmap PCI resources", which Lorenzo Pieralisi noted was already implemented by another patch.

A lengthy, and "spirited" discussion took place between Timur Tabi and the various maintainers of the 64-bit ARM Architecture and SoC platform trees over the desire for the maintainers to have changes to "defconfigs" for the architecture go through a special "arm@kernel.org" alias. Except that after they had told Timur to use that, they objected to him posting a patch informing others of this alias in the kernel documentation. Instead, as Timur put it "without a MAINTAINERS entry, how would anyone know to CC: that address? I posted 3 versions of my defconfig patchset before someone told me that I had to send it to arm@kernel.org." The discussion thread is entitled "MAINTAINERS: add arm@kernel.org as the list for arm64 defconfig changes".

Xunlei Pang posted version 3 of his "x86/mm/ident_map: Add PUD level 1GB page support" which helps "kernel_ident_mapping_init" to create a single and very large identity page mapping in order to reduce TLB (Translation Lookaside Buffer - the caches that store virtual to physical memory lookups performed by hardware) pressure on an architecture that is currently using many 2MB (PMD - Page Middle Directory) level pages for this process.
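
The scale of the saving is simple arithmetic. The snippet below counts how many mappings cover a region at each page size; the 4GB region is an arbitrary example chosen for illustration, not a figure from the patch, and mappings_needed is a hypothetical helper.

```python
MB = 1 << 20
GB = 1 << 30


def mappings_needed(region_bytes, page_bytes):
    """Number of pages of size page_bytes needed to cover region_bytes."""
    return -(-region_bytes // page_bytes)   # ceiling division


# Covering an example 4GB identity-mapped region:
pmd_mappings = mappings_needed(4 * GB, 2 * MB)   # 2MB (PMD level) pages
pud_mappings = mappings_needed(4 * GB, 1 * GB)   # 1GB (PUD level) pages
```

Fewer, larger mappings mean fewer TLB entries competing for the same region.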

Anju T Sudhakar posted version 8 of "IMC Instrumentation Support", which provides support for POWER9's "In-Memory-Collection" or IMC infrastructure, which "contains various Performance Monitoring Units (PMUs) at Nest level (these are on-chip but off-core), Core level and Thread level."

Greg K-H (Kroah-Hartman) posted an RFC patch entitled "add more new kernel pointer filter options" which "implement[s] some new restrictions when printing out kernel pointers, as well as the ability to whitelist kernel pointers where needed."

Kees Cook posted "x86/refcount: Implement fast refcount overflow protection", which seeks to upstream a "modified version of the x86 PAX_REFCOUNT defense from PaX/grsecurity. This speeds up the refcount_t API by duplicating the existing atomic_t implementation with a single instruction added to detect if the refcount has wrapped past INT_MAX (or below 0) resulting in a negative value, where the handler then restores the refcount_t to INT_MAX".
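
The saturation behaviour in that description can be modelled as follows. This is a sketch only: ToyRefcount and wrap32 are hypothetical, Python integers are forced to wrap like a signed 32-bit value by hand, and the real implementation is a single x86 instruction after the atomic operation rather than anything resembling this code.

```python
INT_MAX = 2**31 - 1


def wrap32(value):
    """Emulate signed 32-bit wraparound on a Python integer."""
    return (value + 2**31) % 2**32 - 2**31


class ToyRefcount:
    """Hypothetical counter that saturates instead of wrapping."""

    def __init__(self, value=1):
        self.value = value
        self.saturated = False

    def _check(self):
        # A negative result means the counter wrapped past INT_MAX or
        # dropped below zero; pin it at INT_MAX so it can never reach 0.
        if self.value < 0:
            self.value = INT_MAX
            self.saturated = True
            return True
        return False

    def inc(self):
        self.value = wrap32(self.value + 1)
        self._check()

    def dec_and_test(self):
        """Return True only when the last reference was legitimately dropped."""
        self.value = wrap32(self.value - 1)
        if self._check():
            return False
        return self.value == 0
```

Pinning the counter leaks the object, which is the lesser evil compared with freeing memory that is still in use.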

David Howells posted an RFC patch entitled "VFS: Introduce superblock configuration context" which is a "set of patches to create a superblock configuration context prior to setting up a new mount, populating it with the parsed options/binary data, creating the superblock and then effecting the mount. This allows namespaces and other information to be conveyed through the mount procedure. It also allows extra error information".

The Google Chromebook team let folks know that they were (rarely, like one in a million) seeing "Threads stuck in zap_pid_ns_processes()". Guenter Roeck noted that the "Problem is that if the main task [which has children that are being ptraced] doesn't exit, it [the child] hangs forever. Chrome OS (where we see the problem in the field, and the application is chrome) is configured to reboot on hung tasks - if a task is hung for 120 seconds on those systems, it tends to be in a bad shape. This makes it a quite severe problem for us". He asked "Are there other conditions besides ptrace where a task isn't reaped?". Reaping refers to the behavior in which tasks, when they exit, are reparented to the init task, which "reaps" them (cleans up and makes sure the state they exit with is seen), except under ptrace in this case where the parent task spawning the children "was outside of the pid namespace and was choosing not to reap the child". Various proposals as to how to deal with this in the namespace code were discussed.

Mahesh Bandewar posted "kmod: don't load module unless req process has CAP_SYS_MODULE" which notes that "A process inside random user-ns [a user namespace] should not load a module, which is currently possible". He shows how a user namespace can be created that causes the kernel to load a module upon access to a file node indirectly. This could be a security risk if this approach were used to cause a host kernel to load a vulnerable but otherwise not loaded kernel driver through the privileged permissions in the namespace.

Marc Zyngier posted "irqdomain: Improve irq_domain_mapping facility" in which he "Update[s] IRQ-domain.txt to document irq_domain_mapping" among otherwise seeking to make it easier to access and understand this kernel feature.

Jens Axboe accepted a patch from Ulf Hansson adding Paolo Valente as a MAINTAINER of the BFQ I/O scheduler.

Cyrille Pitchen updated the git repos for the SPI NOR subsystem, which is "now hosted on MTD repos, spi-nor/next is on l2-mtd and spi-nor/fixes will be on linux-mtd".

Alexandre Courbot posted "MAINTAINERS: remove self from GPIO maintainers".

The folks at Codeaurora posted a lengthy analysis of the Linux kernel scheduler and specific problems with load_balance that will be covered next time around, along with work by Peter Zijlstra on the "cgroup/PELT overhaul (again)".

Finally, Paul McKenney previously posted "Make SRCU be once again optional", after having noted that the need to build it in by default (caused by other recent changes in header files) increased the kernel by 2K. Nico(las) Pitre was happy to hear this, saying "If every maintainer finds a way to (optionally) reduce the size of the code they maintain by 2K then we'll get a much smaller kernel pretty soon".

15 May 2017 4:07am GMT

13 May 2017

feedKernel Planet

Linux Plumbers Conference: Today is the very last day for Plumbers refereed track submissions

The submission site

https://linuxplumbersconf.org/2017/ocw/events/LPC2017TALKS/proposals

Will close at midnight pacific tonight

13 May 2017 3:48pm GMT

09 May 2017

feedKernel Planet

Matthew Garrett: Intel AMT on wireless networks

More details about Intel's AMT vulnerability have been released - it's about the worst case scenario, in that it's a total authentication bypass that appears to exist independent of whether the AMT is being used in Small Business or Enterprise modes (more background in my previous post here). One thing I claimed was that even though this was pretty bad it probably wasn't super bad, since Shodan indicated that there were only a small number of thousand machines on the public internet and accessible via AMT. Most deployments were probably behind corporate firewalls, which meant that it was plausibly a vector for spreading within a company but probably wasn't a likely initial vector.

I've since done some more playing and come to the conclusion that it's rather worse than that. AMT actually supports being accessed over wireless networks. Enabling this is a separate option - if you simply provision AMT it won't be accessible over wireless by default, you need to perform additional configuration (although this is as simple as logging into the web UI and turning on the option). Once enabled, there are two cases:

  1. The system is not running an operating system, or the operating system has not taken control of the wireless hardware. In this case AMT will attempt to join any network that it's been explicitly told about. Note that in default configuration, joining a wireless network from the OS is not sufficient for AMT to know about it - there needs to be explicit synchronisation of the network credentials to AMT. Intel provide a wireless manager that does this, but the stock behaviour in Windows (even after you've installed the AMT support drivers) is not to do this.
  2. The system is running an operating system that has taken control of the wireless hardware. In this state, AMT is no longer able to drive the wireless hardware directly and counts on OS support to pass packets on. Under Linux, Intel's wireless drivers do not appear to implement this feature. Under Windows, they do. This does not require any application level support, and uninstalling LMS will not disable this functionality. This also appears to happen at the driver level, which means it bypasses the Windows firewall.

Case 2 is the scary one. If you have a laptop that supports AMT, and if AMT has been provisioned, and if AMT has had wireless support turned on, and if you're running Windows, then connecting your laptop to a public wireless network means that AMT is accessible to anyone else on that network[1]. If it hasn't received a firmware update, they'll be able to do so without needing any valid credentials.

If you're a corporate IT department, and if you have AMT enabled over wifi, turn it off. Now.

[1] Assuming that the network doesn't block client to client traffic, of course

09 May 2017 8:18pm GMT