29 Sep 2016

Planet Debian

Dirk Eddelbuettel: gcbd 0.2.6

A pure maintenance release 0.2.6 of the gcbd package is now on CRAN. The gcbd package provides a benchmarking framework for LAPACK and BLAS operations (as the library can be exchanged in a plug-and-play sense on suitable OSs) and records the results in a local database. Recent / upcoming changes to DBI and RSQLite led me to update the package; there are no actual functionality changes in this release.

CRANberries also provides a diffstat report for the latest release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

29 Sep 2016 3:39am GMT

Dirk Eddelbuettel: RcppCNPy 0.2.6

A new version of the RcppCNPy package arrived on CRAN a few days ago.

RcppCNPy provides R with read and write access to NumPy files thanks to the cnpy library by Carl Rogers.

This new release reflects all the suggestions and comments I received during the review process for the Journal of Open Source Software submission. I am happy to say that about twenty-nine days after I submitted, the paper was accepted and is now published.

Changes in version 0.2.6 (2016-09-25)

  • Expanded documentation in README.md

  • Added examples to help page

  • Added CITATION file for JOSS paper

CRANberries also provides a diffstat report for the latest release. As always, feedback is welcome and the best place to start a discussion may be the GitHub issue tickets page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

29 Sep 2016 3:34am GMT

28 Sep 2016

Planet Debian

Kees Cook: security things in Linux v4.5

Some things I found interesting in the Linux kernel v4.5:

ptrace fsuid checking

Jann Horn fixed some corner-cases in how ptrace access checks were handled on special files in /proc. For example, prior to this fix, if a setuid process temporarily dropped privileges to perform actions as a regular user, the ptrace checks would not notice the reduced privilege, possibly allowing a regular user to trick a privileged process into disclosing things out of /proc (ASLR offsets, restricted directories, etc) that they normally would be restricted from seeing.

ASLR entropy sysctl

Daniel Cashman standardized the way architectures declare their maximum user-space ASLR entropy (CONFIG_ARCH_MMAP_RND_BITS_MAX) and then created a sysctl (/proc/sys/vm/mmap_rnd_bits) so that system owners could crank up entropy. For example, the default entropy on 32-bit ARM was 8 bits, but the maximum could be as much as 16. If your 64-bit kernel is built with CONFIG_COMPAT, there's a compat version of the sysctl as well, for controlling the ASLR entropy of 32-bit processes: /proc/sys/vm/mmap_rnd_compat_bits.

Here's how to crank your entropy to the max, without regard to what architecture you're on:

for i in "" "compat_"; do
    f=/proc/sys/vm/mmap_rnd_${i}bits
    n=$(cat $f)
    while echo $n > $f; do
        n=$(( n + 1 ))
    done
done

strict sysctl writes

Two years ago I added a sysctl for treating sysctl writes more like regular files (i.e. what's written first is what appears at the start), rather than like a ring-buffer (what's written last is what appears first). At the time it wasn't clear what might break if this was enabled, so a WARN was added to the kernel. Since only one instance of that WARN showed up in searches over the intervening two years, the strict writing mode was made the default. The setting remains available as /proc/sys/kernel/sysctl_writes_strict.

seccomp NNP vs TSYNC fix

Jann Horn noticed and fixed a problem where if a seccomp filter was already in place on a process (after being installed by a privileged process like systemd, a container launcher, etc) then the setting of the "no new privs" flag could be bypassed when adding filters with the SECCOMP_FILTER_FLAG_TSYNC flag set. Bypassing NNP meant it might be possible to trick a buggy setuid program into doing things as root after a seccomp filter forced a privilege drop to fail (generally referred to as the "sendmail setuid flaw"). With NNP set, a setuid program doesn't gain its elevated privileges across exec in the first place.
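
Roughly, the ordering the fix preserves looks like this from user-space (a minimal sketch, not the kernel self-tests; it assumes headers recent enough to provide SYS_seccomp and SECCOMP_FILTER_FLAG_TSYNC, and the allow-everything filter is purely illustrative):

/* Sketch: set no_new_privs first, then install a filter across all
 * threads with SECCOMP_FILTER_FLAG_TSYNC. Without NNP (or CAP_SYS_ADMIN),
 * the TSYNC install must be refused; that is the check the fix restores. */
#include <stdio.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

int main(void)
{
    struct sock_filter allow[] = {
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(allow) / sizeof(allow[0]),
        .filter = allow,
    };

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
        perror("prctl(PR_SET_NO_NEW_PRIVS)");

    /* seccomp(2) has no libc wrapper on many systems; call it directly. */
    if (syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER,
                SECCOMP_FILTER_FLAG_TSYNC, &prog))
        perror("seccomp(SECCOMP_SET_MODE_FILTER, TSYNC)");

    return 0;
}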

That's it! Tomorrow I'll cover v4.6…

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

28 Sep 2016 9:58pm GMT

Alessio Treglia: Emptiness and Form


In the perennial search for the meaning of life and the fundamental laws that govern nature, man has for millennia been faced with the mysterious concept of emptiness. What is emptiness? Does it really exist in nature? Is emptiness the non-being, as theorized by Parmenides?

Until the early years of the last century, technology had not yet equipped scientists with the tools needed to investigate the innermost structure of matter, so the concept of emptiness was approached through insights and metaphors that led, over the centuries, to a broad philosophical debate.

For the ancient Greek atomist philosophers, the existence of emptiness was not only possible but necessary, the very ontological principle for the existence of being: for them, in fact, the emptiness that permeates the atoms is what allows movement.

Read More… [by Fabio Marzocca]


28 Sep 2016 8:33pm GMT

Sean Whitton: 'Do you really need to do that?'

A new postdoc arrived at our department this semester, and after learning that he uses GNU/Linux for all his computing, I invited him along to TFUG. During some of our meetings people asked "how could I do X on my GNU/Linux desktop?" and, jokingly, the postdoc would respond "the answer to your question is 'do you really need to do that?'" Sometimes the more experienced GNU/Linux users at the table would respond to questions by suggesting that the user should simply give up on doing X, and the postdoc would slap his thigh and laugh and say "see? I told you that's the answer!"

The phenomenon here is that people who have at some point made a commitment to at least try to use GNU/Linux for all their computing quickly find that they have come to value using GNU/Linux more than they value engaging in certain activities that only work well/at all under a proprietary operating system. I think that this is because they get used to being treated with respect by their computer. And indeed, one of the reasons I've almost entirely given up on computer gaming is that computer games are non-free software. "Are you sure you need to do that?" starts sounding like a genuine question rather than simply a polite way of saying that what someone wants to do can't be achieved.

I suggest that this is a blessing in disguise. The majority of the things that you can only do under a proprietary operating system are things you would be better off switching out for other activities. I'm not suggesting that switching to GNU/Linux is a good way to give up on the entertainment industry. It's a good way of moderating your engagement with the entertainment industry. Rather than logging onto Netflix, you might instead pop in a DVD of a movie. You can still engage with contemporary popular culture, but the technical barriers give you an opportunity to moderate your consumption: once you've finished watching the movie, the software won't try to get you to watch something else by making a calculation as to what you're most likely to assent to watching next based on what you've watched before. For this behaviour of the Netflix software is just another example of non-free software working against its user's interests: watching a movie is good for you, but binge-watching a TV series probably isn't. In cases like this, living in the world of Free Software makes it easier to engage with media healthily.

28 Sep 2016 3:25pm GMT

Chris Lamb: Diffoscope progress bar

Diffoscope is a diff utility which recursively unpacks archives, ISOs, etc., transforming a wide variety of files into human-readable forms before comparison instead of simply showing the raw difference in hexadecimal.

I recently added a progress bar when diffoscope is run on a terminal:

https://i.imgur.com/LDe4FNS.gif

Note that as diffoscope can, at any point, encounter an archive or format that requires unpacking, the progress will always be approximate and may even appear to go "backwards".

The implementation, available in version 61, is simple (see #1, #2, #3 & #4) but takes a number of subtleties into account by using context managers to correctly track the state throughout.

28 Sep 2016 11:45am GMT

27 Sep 2016

Planet Debian

Kees Cook: security things in Linux v4.4

Continuing with interesting security things in the Linux kernel, here's v4.4. As before, if you think there's stuff I missed that should get some attention, please let me know.

CONFIG_IO_STRICT_DEVMEM

The CONFIG_STRICT_DEVMEM setting that has existed for a long time already protects system RAM from being accessible to root in user-space through the /dev/mem device node. Dan Williams added CONFIG_IO_STRICT_DEVMEM to extend this so that if a kernel driver has reserved a device memory region for use, it will become unavailable through /dev/mem as well. The reservation in the kernel was to keep other kernel things from using the memory, so it is just common sense to make sure user-space can't stomp on it either. Everyone should have this enabled.

If you're looking to create a very bright line between user-space and device memory, it's worth noting that if a device driver is a module, a malicious root user can just unload the module (freeing the kernel memory reservation), fiddle with the device memory, and then reload the driver module. So either just leave out /dev/mem entirely (not currently possible with upstream), build a monolithic kernel (no modules), or otherwise block the (un)loading of modules (/proc/sys/kernel/modules_disabled).

seccomp UM

Mickaël Salaün added seccomp support (and selftests) for user-mode Linux. Moar architectures!

seccomp Checkpoint/Restore-In-Userspace

Tycho Andersen added a way to extract and restore seccomp filters from running processes via PTRACE_SECCOMP_GET_FILTER under CONFIG_CHECKPOINT_RESTORE. This is a continuation of his work (that I failed to mention in my prior post) from v4.3, which introduced a way to suspend and resume seccomp filters. As I mentioned at the time (and for which he continues to quote me) "this feature gives me the creeps." :)
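
For the curious, the extraction side looks roughly like the sketch below (not the actual CRIU code; it assumes a tracee that is already ptrace-attached and stopped, and a kernel built with CONFIG_CHECKPOINT_RESTORE):

/* Sketch: dump a tracee's first seccomp filter (index 0) via
 * PTRACE_SECCOMP_GET_FILTER; error handling mostly omitted. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <linux/filter.h>

#ifndef PTRACE_SECCOMP_GET_FILTER
#define PTRACE_SECCOMP_GET_FILTER 0x420c   /* new in v4.4 */
#endif

static void dump_first_filter(pid_t pid)
{
    /* With a NULL buffer the request returns the instruction count. */
    long count = ptrace(PTRACE_SECCOMP_GET_FILTER, pid, 0, NULL);
    if (count < 0) {
        perror("PTRACE_SECCOMP_GET_FILTER");
        return;
    }

    struct sock_filter *insns = calloc(count, sizeof(*insns));
    if (!insns)
        return;

    if (ptrace(PTRACE_SECCOMP_GET_FILTER, pid, 0, insns) == count) {
        for (long i = 0; i < count; i++)
            printf("%4ld: code=0x%04x jt=%u jf=%u k=0x%08x\n",
                   i, insns[i].code, insns[i].jt, insns[i].jf, insns[i].k);
    }
    free(insns);
}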

x86 W^X corrections

Stephen Smalley noticed that there was still a range of kernel memory (just past the end of the kernel code itself) that was incorrectly marked writable and executable, defeating the point of CONFIG_DEBUG_RODATA which seeks to eliminate these kinds of memory ranges. He corrected this and added CONFIG_DEBUG_WX which performs a scan of memory at boot time and yells loudly if unexpected memory protections are found. To nobody's delight, it was shortly discovered that UEFI leaves chunks of memory in this state too, which posed an ugly-to-solve problem (which Matt Fleming addressed in v4.6).

x86_64 vsyscall CONFIG

I introduced a way to control the mode of the x86_64 vsyscall with a build-time CONFIG selection, though the choice I really care about is CONFIG_LEGACY_VSYSCALL_NONE, to force the vsyscall memory region off by default. The vsyscall memory region was always mapped into process memory at a fixed location, and it originally posed a security risk as a ROP gadget execution target. The vsyscall emulation mode was added to mitigate the problem, but it still left fixed-position static memory content in all processes, which could still pose a security risk. The good news is that glibc since version 2.15 doesn't need vsyscall at all, so it can just be removed entirely. Anyone running a kernel built this way who discovered they needed to support a pre-2.15 glibc could still re-enable it at the kernel command line with "vsyscall=emulate".
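
A quick way to see which mode a process ended up with is to look for a [vsyscall] entry in /proc/self/maps; here is a small sketch (an illustration, not part of the kernel change):

/* Sketch: report whether the legacy vsyscall page is mapped into this
 * process by scanning /proc/self/maps for a "[vsyscall]" entry. With
 * CONFIG_LEGACY_VSYSCALL_NONE (or vsyscall=none) no entry should appear. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[512];
    int found = 0;

    if (!maps) {
        perror("/proc/self/maps");
        return 1;
    }
    while (fgets(line, sizeof(line), maps))
        if (strstr(line, "[vsyscall]"))
            found = 1;
    fclose(maps);

    printf("vsyscall page %s mapped\n", found ? "is" : "is not");
    return 0;
}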

That's it for v4.4. Tune in tomorrow for v4.5!

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

27 Sep 2016 10:47pm GMT

C.J. Adams-Collier: OpenDaylight Symposium 2016

I'll write this later. Keywords and notes inline

Cloud

5G

AT&T

Ericsson

SDN

FD.io

Cisco

Intel

Linux

Debian

Ubuntu

Open Source

Hydrogen

Helium

Lithium

OSI stack

Usability

Developers and users working together.

Fast Dev/Test cycles

Sessions start at 10:30

Message bus

Minimal install needs to be smaller

Most functionalities must be implemented in modules

Upgrade path complexity

Out of band control channel.
OPNFV customers don't run stock releases, make customisations and plugins easier.

Message bus

Distribution vendors should not be expected to perform package maintenance. Get the distribution ready for fast pathing in to backports, security, testing and unstable.


27 Sep 2016 5:17pm GMT

Ben Hutchings: Debian LTS work, August 2016

I was assigned 14.75 hours of work by Freexian's Debian LTS initiative and carried over 0.7 from last month. I worked a total of 14 hours, carrying over 1.45 hours.

I finished preparing and finally uploaded an update for linux (3.2.81-2). This took longer than expected due to the difficulty of reproducing CVE-2016-5696 and verifying the backported fix. I also released an upstream stable update (3.2.82) which will go into the next update in wheezy LTS.

I discussed a few other security updates and issues on the debian-lts mailing list.

27 Sep 2016 10:41am GMT

26 Sep 2016

Planet Debian

Kees Cook: security things in Linux v4.3

When I gave my State of the Kernel Self-Protection Project presentation at the 2016 Linux Security Summit, I included some slides covering some quick bullet points on things I found of interest in recent Linux kernel releases. Since there wasn't a lot of time to talk about them all, I figured I'd make some short blog posts here about the stuff I was paying attention to, along with links to more information. This certainly isn't everything security-related or generally of interest, but they're the things I thought needed to be pointed out. If there's something security-related you think I should cover from v4.3, please mention it in the comments. I'm sure I haven't caught everything. :)

A note on timing and context: the momentum for starting the Kernel Self Protection Project got rolling well before it was officially announced on November 5th last year. To that end, I included stuff from v4.3 (which was developed in the months leading up to November) under the umbrella of the project, since the goals of KSPP aren't unique to the project nor must the goals be met by people that are explicitly participating in it. Additionally, not everything I think worth mentioning here technically falls under the "kernel self-protection" ideal anyway - some things are just really interesting userspace-facing features.

So, to that end, here are things I found interesting in v4.3:

CONFIG_CPU_SW_DOMAIN_PAN

Russell King implemented this feature for ARM which provides emulated segregation of user-space memory when running in kernel mode, by using the ARM Domain access control feature. This is similar to a combination of Privileged eXecute Never (PXN, in later ARMv7 CPUs) and Privileged Access Never (PAN, coming in future ARMv8.1 CPUs): the kernel cannot execute user-space memory, and cannot read/write user-space memory unless it was explicitly prepared to do so. This stops a huge set of common kernel exploitation methods, where either a malicious executable payload has been built in user-space memory and the kernel was redirected to run it, or where malicious data structures have been built in user-space memory and the kernel was tricked into dereferencing the memory, ultimately leading to a redirection of execution flow.

This raises the bar for attackers since they can no longer trivially build code or structures in user-space where they control the memory layout, locations, etc. Instead, an attacker must find areas in kernel memory that are writable (and in the case of code, executable), where they can discover the location as well. For an attacker, there are vastly fewer places where this is possible in kernel memory as opposed to user-space memory. And as we continue to reduce the attack surface of the kernel, these opportunities will continue to shrink.

While hardware support for this kind of segregation exists in s390 (natively separate memory spaces), ARM (PXN and PAN as mentioned above), and very recent x86 (SMEP since Ivy-Bridge, SMAP since Skylake), ARM is the first upstream architecture to provide this emulation for existing hardware. Everyone running ARMv7 CPUs with this kernel feature enabled suddenly gains the protection. Similar emulation protections (PAX_MEMORY_UDEREF) have been available in PaX/Grsecurity for a while, and I'm delighted to see a form of this land in upstream finally.

To test this kernel protection, the ACCESS_USERSPACE and EXEC_USERSPACE triggers for lkdtm have existed since Linux v3.13, when they were introduced in anticipation of the x86 SMEP and SMAP features.
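
Triggering them is just a write to the lkdtm debugfs interface; a sketch like the following (an illustration; it needs root, debugfs mounted at /sys/kernel/debug and the lkdtm module available, and expect an oops or a killed process if the protection is missing) exercises the read/write check:

/* Sketch: fire an lkdtm crash trigger to test user-space access protection.
 * ACCESS_USERSPACE tests the read/write side; EXEC_USERSPACE the execute side. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/sys/kernel/debug/provoke-crash/DIRECT", "w");
    if (!f) {
        perror("open lkdtm DIRECT trigger");
        return 1;
    }
    fputs("ACCESS_USERSPACE", f);
    fclose(f);
    return 0;
}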

Ambient Capabilities

Andy Lutomirski (with Christoph Lameter and Serge Hallyn) implemented a way for processes to pass capabilities across exec() in a sensible manner. Until Ambient Capabilities, any capabilities available to a process would only be passed to a child process if the new executable was correctly marked with filesystem capability bits. This turns out to be a real headache for anyone trying to build an even marginally complex "least privilege" execution environment. The case that Chrome OS ran into was having a network service daemon responsible for calling out to helper tools that would perform various networking operations. Keeping the daemon not running as root and retaining the needed capabilities in children required conflicting or crazy filesystem capabilities organized across all the binaries in the expected tree of privileged processes. (For example you may need to set filesystem capabilities on bash!) By being able to explicitly pass capabilities at runtime (instead of based on filesystem markings), this becomes much easier.

For more details, the commit message is well-written, almost twice as long as the code changes, and contains a test case. If that isn't enough, there is a self-test available in tools/testing/selftests/capabilities/ too.
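
As a rough sketch of the runtime interface (using the prctl(2) names; this is an illustration rather than the Chrome OS code, ping is used purely as an example helper, and the capability must already be in the process's permitted and inheritable sets for the raise to succeed):

/* Sketch: raise CAP_NET_RAW into the ambient set so it survives exec() of
 * an unprivileged helper, without filesystem capability bits on the binary.
 * PR_CAP_AMBIENT_RAISE fails unless the capability is already in both the
 * permitted and inheritable sets (arranged elsewhere, e.g. via libcap). */
#include <stdio.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <linux/capability.h>

int main(void)
{
    if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_NET_RAW, 0, 0)) {
        perror("PR_CAP_AMBIENT_RAISE");
        return 1;
    }

    /* The exec'd helper keeps CAP_NET_RAW across this ordinary exec. */
    execlp("ping", "ping", "-c", "1", "127.0.0.1", (char *)NULL);
    perror("execlp");
    return 1;
}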

PowerPC and Tile support for seccomp filter

Michael Ellerman added support for seccomp to PowerPC, and Chris Metcalf added support to Tile. As the seccomp maintainer, I get excited when an architecture adds support, so here we are with two. Also included were updates to the seccomp self-tests (in tools/testing/selftests/seccomp), to help make sure everything continues working correctly.

That's it for v4.3. If I missed stuff you found interesting, please let me know! I'm going to try to get more per-version posts out in time to catch up to v4.8, which appears to be tentatively scheduled for release this coming weekend.

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

26 Sep 2016 10:54pm GMT

Reproducible builds folks: Reproducible Builds: week 74 in Stretch cycle

Here is what happened in the Reproducible Builds effort between Sunday September 18 and Saturday September 24 2016:

Outreachy

We intend to participate in Outreachy Round 13 and look forward to enthusiastic new applicants who want to contribute to reproducible builds. We're offering four different areas to work on:

Reproducible Builds World summit #2

We are planning an event similar to our Athens 2015 summit and expect to reveal more information soon. If you haven't been contacted yet but would like to attend, please contact holger.

Toolchain development and fixes

Mattia uploaded dpkg/1.18.10.0~reproducible1 to our experimental repository and covered the details of the upload in a mailing list post.

The most important change is the incorporation of improvements made by Guillem Jover (dpkg maintainer) to the .buildinfo generator. This is also in the hope that it will speed up the merge upstream.

One of the other relevant changes from before is that .buildinfo files generated from binary-only builds will no longer include the hash of the .dsc file in Checksums-Sha256 as documented in the specification.

Even if it was considered important to include a checksum of the source package in .buildinfo, storing it that way breaks other assumptions (e.g. that Checksums-Sha256 contains only files that are part of a single upload, whereas the .dsc might not be part of that upload), so we look forward to another solution for storing the source checksum in .buildinfo.

Bugs filed

Reviews of unreproducible packages

250 package reviews have been added, 4 have been updated and 4 have been removed this week, adding to our knowledge about identified issues.

4 issue types have been added:

3 issue types have been updated:

Weekly QA work

FTBFS bugs have been reported by:

Documentation updates

h01ger created a new Jenkins job so that every commit pushed to the master branch for the website will update reproducible-builds.org.

diffoscope development

strip-nondeterminism development

reprotest development

tests.reproducible-builds.org

Misc.

This week's edition was written by Chris Lamb, Holger Levsen and Mattia Rizzolo and reviewed by a bunch of Reproducible Builds folks on IRC.

26 Sep 2016 9:25pm GMT

Rhonda D'Vine: LP

I guess you know by now that I simply love music. It is powerful, it can move you, change your mood in a lot of directions, make you wanna move your body to it, even unknowingly have this happen, and remind you of situations you want to keep in mind. The singer I present to you was introduced to me by a dear friend with the following words: So this hasn't happened to me in a looooong time: I hear a voice and can't stop crying. I can't decide which song I should send to you, thus I send three, of which the last one let me think of you.

And I have to agree, that voice is really great. Thanks a lot for sharing LP with me, dear! And given that I got sent three songs and I am not good at holding excitement back, I want to share them with you, so here are the songs:

Like always, enjoy!


26 Sep 2016 10:00am GMT

25 Sep 2016

Planet Debian

Clint Adams: Collect the towers

Why is openbmap's North American coverage so sad? Is there a reason that RadioBeacon doesn't also submit to OpenCellID? Is there a free software Android app that submits data to OpenCellID?

25 Sep 2016 11:57pm GMT

Vincent Sanders: I'll huff, and I'll puff, and I'll blow your house in

Sometimes it really helps to have a different view on a problem and after my recent writings on my Public Suffix List (PSL) library I was fortunate to receive a suggestion from my friend Enrico Zini.

I had asked for suggestions on reducing the size of the library further and Enrico simply suggested Huffman coding. This was a technique I had learned about long ago in connection with data compression and the intervening years had made all the details fuzzy which explains why it had not immediately sprung to mind.

A small subset of the Public Suffix List as stored within libnspsl

Huffman coding, named for David A. Huffman, is an algorithm that enables a very efficient representation of data. In a normal array of characters every character takes the same eight bits to represent, which is the best we can do when any of the 256 possible values is equally likely. If your data is not evenly distributed this is not the case: for example, if the data were English text then a character is fifteen times more likely to be an e than a k.

Every step of the Huffman encoding tree build for the example string table

So if we have some data with a non-uniform distribution of probabilities, we need a way for the data to be encoded with fewer bits for the common values and more bits for the rarer values. To be efficient we would need some way of having variable-length representations without storing the length separately. The term for this kind of data representation is a prefix code, and there are several ways to generate them.

Such is the influence of Huffman on the area of prefix codes that they are often called Huffman codes even if they were not created using his algorithm. One can dream of becoming immortalised like this; to join the ranks of those whose names are given to units or whole ideas in a field must be immensely rewarding. However, given that Huffman invented his algorithm and proved it to be optimal to answer a question on a term paper in his early twenties, I fear I may already be a bit too late.

The algorithm itself is relatively straightforward. First a frequency analysis is performed, a fancy way of saying we count how many times each character appears in the input data. Next a binary tree is created by using a priority queue initialised with the nodes sorted by frequency.

The resulting huffman tree and the binary representation of the input symbols

The counts of the two least frequent items are summed together and a node is placed in the tree with the two original entries as child nodes. This step is repeated until a single node exists with a count value equal to the length of the input.

To encode data one simply walks the tree, outputting a 0 for a left node or a 1 for a right node until reaching the original value. This generates a mapping of values to bit outputs; the input is then simply converted value by value to the bit output. To decode, the encoded data is used bit by bit to walk the tree and arrive back at the values.
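
As a small illustration of that decode walk (a sketch of the technique only, not the actual libnspsl code, which packs its decode table into 32-bit words):

/* Illustrative decoder: walk a Huffman tree bit by bit, emitting a symbol
 * each time a leaf is reached. */
#include <stddef.h>

struct hnode {
    struct hnode *left;   /* followed on a 0 bit */
    struct hnode *right;  /* followed on a 1 bit */
    int symbol;           /* valid only at a leaf (left == right == NULL) */
};

/* Decode nbits bits from 'bits' (one bit per byte, for clarity) into 'out',
 * returning the number of symbols produced. */
size_t huffman_decode(const struct hnode *root, const unsigned char *bits,
                      size_t nbits, int *out, size_t outlen)
{
    const struct hnode *node = root;
    size_t produced = 0;

    for (size_t i = 0; i < nbits && produced < outlen; i++) {
        node = bits[i] ? node->right : node->left;
        if (node->left == NULL && node->right == NULL) {
            out[produced++] = node->symbol;  /* leaf: emit and restart */
            node = root;
        }
    }
    return produced;
}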

If we perform this algorithm on the example string table *!asiabvcomcoopitamazonawsarsaves-the-whalescomputebasilicata we can reduce the 488 bits (61 * 8 bit characters) to 282 bits, a reduction of roughly 40%. Obviously in a real application the Huffman tree would need to be stored, which would probably exceed this saving, but for larger data sets it is probable this technique would yield excellent results on this kind of data.

Once I proved this to myself I implemented the encoder within the existing conversion program. Although my Perl encoder is not very efficient, it can process the entire PSL string table (around six thousand labels using 40KB or so) in less than a second, so unless the table grows massively an inelegant approach will suffice.

The resulting bits were packed into 32-bit values to improve decode performance (most systems prefer to deal with larger memory fetches less frequently) and resulted in 18KB of output, or 47% of the original size. This is a great improvement in size and means the statically linked test program is now 59KB and is actually smaller than the gzipped source data.

$ ls -alh test_nspsl
-rwxr-xr-x 1 vince vince 59K Sep 25 23:58 test_nspsl
$ ls -al public_suffix_list.dat.gz
-rw-r--r-- 1 vince vince 62K Sep 1 08:52 public_suffix_list.dat.gz


To be clear, the statically linked program can determine if a domain is in the PSL with no additional heap allocations, and includes the entire PSL ordered tree, the domain label string table and the Huffman decode table needed to read it.

An unexpected side effect is that because the decode loop is small it sits in the processor cache. This appears to bring the performance of the string comparison function huffcasecmp() (which is not locale dependent because we know the data will be limited to ASCII) close to that of strcasecmp(); indeed, on ARM32 systems there is a very modest improvement in performance.

I think this is as much work as I am willing to put into this library but I am pleased to have achieved a result which is on par with the best of breed (libpsl still has a data representation 20KB smaller than libnspsl but requires additional libraries for additional functionality) and I got to (re)learn an important algorithm too.

25 Sep 2016 11:23pm GMT

Julian Andres Klode: Introducing TrieHash, an order-preserving minimal perfect hash function generator for C(++)

Abstract

I introduce TrieHash, an algorithm for constructing perfect hash functions from tries. The generated hash functions are pure C code, minimal, order-preserving and outperform existing alternatives. Together with the generated header files, they can also be used as a generic string to enumeration mapper (enums are created by the tool).

Introduction

APT (and dpkg) spend a lot of time in parsing various files, especially Packages files. APT currently uses a function called AlphaHash which hashes the last 8 bytes of a word in a case-insensitive manner to hash fields in those files (dpkg just compares strings in an array of structs).

There is one obvious drawback to using a normal hash function: When we want to access the data in the hash table, we have to hash the key again, causing us to hash every accessed key at least twice. It turned out that this affects something like 5 to 10% of the cache generation performance.

Enter perfect hash functions: A perfect hash function matches a set of words to constant values without collisions. You can thus just use the index to index into your hash table directly, and do not have to hash again (if you generate the function at compile time and store key constants) or handle collision resolution.

As #debian-apt people know, I happened to play around a bit with tries this week before guillem suggested perfect hashing. Let me tell you one thing: my trie implementation was very naive and did not really improve things a lot…

Enter TrieHash

Now, how is this related to hashing? The answer is simple: I wrote a perfect hash function generator that is based on tries. You give it a list of words, it puts them in a trie, and generates C code out of it, using recursive switch statements (see code generation below). The function achieves competitive performance with other hash functions, it even usually outperforms them.

Given a dictionary, it generates an enumeration (a C enum or C++ enum class) of all words in the dictionary, with the values corresponding to the order in the dictionary (the order-preserving property), and a function mapping strings to members of that enumeration.

By default, the first word is considered to be 0 and each word increases a counter by one (that is, it generates a minimal hash function). You can tweak that however:

= 0
WordLabel ~ Word
OtherWord = 9

will return 0 for an unknown value, map "Word" to the enum member WordLabel and map OtherWord to 9. That is, the input list functions like the body of a C enumeration. If no label is specified for a word, it will be generated from the word. For more details see the documentation.

C code generation

switch(string[0] | 32) {
case 't':
    switch(string[1] | 32) {
    case 'a':
        switch(string[2] | 32) {
        case 'g':
            return Tag;
        }
    }
}
return Unknown;

Yes, really recursive switches - they directly represent the trie. Now, we did not do a completely straightforward translation; there are some optimisations to make the whole thing faster and easier to look at:

First of all, the 32 you see is used to make the check case-insensitive in case all cases of the switch body are alphabetical characters. If there are non-alphabetical characters, it will generate two cases per character, one uppercase and one lowercase (with one break in it). I did not know before that lowercase and uppercase characters differ by only one bit; thanks to the clang compiler for pointing that out in its generated assembler code!

Secondly, we insert breaks only between cases. Initially, each case ended with a return Unknown, but guillem (the dpkg developer) suggested it might be faster to let them fall through where possible. It turns out it was not faster on a good compiler, but it's still more readable anyway.

Finally, we build one trie per word length, and switch by the word length first. Like the 32 trick, this gives a huge improvement in performance.
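
To give an idea of how the generated code is consumed, a caller might look like the following sketch (the header name, the PerfectHash function name and the field enum members are hypothetical stand-ins; the real names depend on the generator's options, and only Unknown is taken from the example above):

/* Hypothetical usage of a TrieHash-generated header: the enum members and
 * the PerfectHash() name stand in for whatever the generator was told to
 * emit. Unrecognised field names all map to the fallback value. */
#include <stdio.h>
#include <string.h>
#include "fieldhash.h"   /* hypothetical generated header */

static void handle_field(const char *name, const char *value)
{
    switch (PerfectHash(name, strlen(name))) {
    case Package:
        printf("package: %s\n", value);
        break;
    case Version:
        printf("version: %s\n", value);
        break;
    case Unknown:
    default:
        break;
    }
}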

Digging into the assembler code

The whole code translates to roughly 4 instructions per byte:

  1. A memory load,
  2. an or with 32
  3. a comparison, and
  4. a conditional jump.

(On x86, the case sensitive version actually only has a cmp-with-memory and a conditional jump).

Due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 this may be one instruction more: On some architectures an unneeded zero-extend-byte instruction is inserted - this causes a 20% performance loss.

Performance evaluation

I ran the hash against all 82 words understood by APT in Packages and Sources files, 1,000,000 times for each word, and summed up the average run-time:

host arch Trie TrieCase GPerfCase GPerf DJB
plummer ppc64el 540 601 1914 2000 1345
eller mipsel 4728 5255 12018 7837 4087
asachi arm64 1000 1603 4333 2401 1625
asachi armhf 1230 1350 5593 5002 1784
barriere amd64 689 950 3218 1982 1776
x230 amd64 465 504 1200 837 693

Suffice to say, GPerf does not really come close.

All hosts except the x230 are Debian porterboxes. The x230 is my laptop with a Core i5-3320M; barriere has an Opteron 23xx. I included the DJB hash function as another reference.

Source code

The generator is written in Perl, licensed under the MIT license and available from https://github.com/julian-klode/triehash - I initially prototyped it in Python, but guillem complained that this would add new build dependencies to dpkg, so I rewrote it in Perl.

Benchmark is available from https://github.com/julian-klode/hashbench

Usage

See the script for POD documentation.


Filed under: General

25 Sep 2016 6:44pm GMT

Steinar H. Gunderson: Nageru @ Fyrrom

When Samfundet wanted to make their own Boiler Room spinoff (called "Fyrrom" - more or less a direct translation), it was a great opportunity to try out the new multitrack code in Nageru. After all, what can go wrong with a pretty much untested and unfinished git branch, right?

So we cobbled together a bunch of random equipment from here and there:

Video equipment

Hooked it up to Nageru:

Nageru screenshot

and together with some great work from the people actually pulling together the event, this was the result. Lots of fun.

And yes, some bugs were discovered - of course, field testing without followup patches is meaningless (that would either mean you're not actually taking your test experience into account, or that your testing gave no actionable feedback and thus was useless), so they will be fixed in due time for the 1.4.0 release.

Edit: Fixed a screenshot link.

25 Sep 2016 2:01pm GMT