24 Jan 2020

Planet Debian

Steve Kemp: procmail for gmail?

After 10+ years I'm in the process of retiring my mail-host. In the future I'll no longer be running exim4/dovecot/similar, and handling my own mail. Instead it'll all go to a (paid) Google account.

It feels like the end of an era, as it means a lot of my daily life will not be spent inside a single host; no longer will I run:

ssh steve@mail.steve.org.uk

I'm still within my Gsuite trial, but I've mostly finished importing my vast mail archive, via mbsync.

The only outstanding thing I need is some scripting for the mail. Since my mail has been self-hosted I've evolved a large and complex procmail configuration file which sorted incoming messages into Maildir folders.

Having a quick look around last night I couldn't find anything similar for the brave new world of Google Mail. So I hacked up a quick script which will automatically add labels to new messages that don't have any.

Finding messages which are new/unread and which don't have labels is a matter of searching for:

is:unread -has:userlabels

From there adding labels is pretty simple, if you decide what you want. For the moment I'm keeping it simple:

Both labels will be created if they don't already exist, and the actual coding part was pretty simple. To be more complex/flexible I would probably need to integrate a scripting language (oh, I have one of those), and let the user decide what to do for each message.
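
Steve doesn't show the script itself, so here is a minimal illustrative sketch of the same flow against the Gmail API, written in Go. It is not his code: it assumes you have already completed the OAuth dance and hold an authenticated HTTP client plus the ID of the label you want to apply.

    // Illustrative sketch only: add a label to unread messages that have no user
    // label yet. Assumes an authenticated *http.Client from the usual OAuth flow
    // and an existing label ID.
    package main

    import (
        "context"
        "log"
        "net/http"

        gmail "google.golang.org/api/gmail/v1"
        "google.golang.org/api/option"
    )

    func labelNewMail(ctx context.Context, client *http.Client, labelID string) error {
        srv, err := gmail.NewService(ctx, option.WithHTTPClient(client))
        if err != nil {
            return err
        }
        // The same search as above: unread messages without any user label.
        res, err := srv.Users.Messages.List("me").Q("is:unread -has:userlabels").Do()
        if err != nil {
            return err
        }
        for _, m := range res.Messages {
            _, err := srv.Users.Messages.Modify("me", m.Id,
                &gmail.ModifyMessageRequest{AddLabelIds: []string{labelID}}).Do()
            if err != nil {
                return err
            }
        }
        return nil
    }

    func main() {
        // Wiring up the OAuth client and label creation is left out of this sketch.
        log.Println("see labelNewMail")
    }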

The biggest annoyance is setting up the Google project, and all the OAuth magic. I've documented briefly what I did but I don't actually know if anybody else could run the damn thing - there's just too much "magic" involved in these APIs.

Anyway procmail-lite for gmail. Job done.

24 Jan 2020 10:30am GMT

Bdale Garbee: Digital Photo Creation Dates

I learned something new yesterday, that probably shouldn't have shocked me as much as it did. For legacy reasons, the "creation time" in the Exif metadata attached to digital camera pictures is not expressed in absolute time, but rather in some arbitrary expression of "local" time! This caused me to spend a long evening learning how to twiddle Exif data, and then how to convince Piwigo to use the updated metadata. In case I or someone else need to do this in the future, it seems worth taking the time to document what I learned and what I did to "make things right".

The reason photo creation time matters to me is that my wife Karen and I are currently in the midst of creating a "best of" subset of photos taken on our recently concluded family expedition to Antarctica and Argentina. Karen loves taking (sometimes award-winning) nature photos, and during this trip she took thousands of photos using her relatively new Nikon COOLPIX P900 camera. At the same time, both of us and our kids also took many photos using the cameras built into our respective Android phones. To build our "best of" list, we wanted to be able to pick and choose from the complete set of photos taken, so I started by uploading all of them to the Piwigo instance I host on a virtual machine on behalf of the family, where we assigned a new tag for the subset and started to pick photos to include.

Unfortunately, to our dismay, we noted that all the photos taken on the P900 weren't aligning correctly in the time-line. This was completely unexpected, since one of the features of the P900 is that it includes a GPS chip and adds geo-tags to every photo taken, including a GPS time stamp.

Background

We've grown accustomed to the idea that our phones always know the correct time due to their behavior on the mobile networks around the world. And for most of us, the camera in our phone is probably the best camera we own. Naively, my wife and I assumed the GPS time stamps on the photos taken by the P900 would allow it to behave similarly and all our photos would just automatically align in time... but that's not how it worked out!

The GPS time stamp implemented by Nikon is included as an Exif extension separate from the "creation time", which is expressed in the local time known by the camera. While my tiny little mind revolts at this and thinks all digital photos should just have a GPS-derived UTC creation time whenever possible... after thinking about it for a while, I think I understand how we got here.

In the early days of Exif, most photos were taken using chemical processes and any associated metadata was created and added manually after the photo existed. That's probably why there are separate tags for creation time and digitization time, for example. As cameras went digital and got clocks, it became common to expect the photographer to set the date and time in their camera, and of course most people would choose the local time since that's what they knew.

With the advent of GPS chips in cameras, the hardware now has access to an outstanding source of "absolute time". But the Nikon guys aren't actually using that directly to set image creation time. Instead, they still assume the photographer is going to manually set the local time, but added a function buried in one of the setup menus to allow a one-time set of the camera's clock from GPS satellite data.

So, what my wife needs to do in the future is remember, at the start of any photo shooting period where time sync of her photos with those of others is important, to make sure her camera's time is correctly set, taking advantage of the function that allows her to set the local time from the GPS time. But of course, that only helps future photos...

How I fixed the problem

So the problem in front of me was several thousand images taken with the camera's clock "off" by 15 hours and 5 minutes. We figured that out by a combination of noting the amount the camera's clock skewed by when we used the GPS function to set the clock, then noticing that we still had to account for the time zone to make everything line up right. As far as I can tell, 12 hours of that was due to AM vs PM confusion when my wife originally set the time by hand, less 1 hour of daylight savings time not accounted for, plus 4 time zones from home to where the photos were taken (12 - 1 + 4 = 15). And the remaining 5 minutes probably amount to some combination of imprecision when the clock was originally set by hand, and drift of the camera's clock in the many months since then.

I thought briefly about hacking Piwigo to use the GPS time stamps, but quickly realized that wouldn't actually solve the problem, since they're in UTC and the pictures from our phone cameras were all using local time. There's probably a solution lurking there somewhere, but just fixing up the times in the photo files that were wrong seemed like an easier path forward.

A Google search or two later, and I found jhead, which fortunately was already packaged for Debian. It makes changing the Exif timestamps of an on-disk JPEG image file really easy. Highly recommended!

Compounding my problem was that my wife had already spent many hours tagging her photos in the Piwigo web GUI, so it really seemed necessary to fix the images "in place" on the Piwigo server. The first problem with that is that as you upload photos to the server, they are assigned unique filenames on disk based on the upload date and time plus a random hash, and the original filename becomes just an element of metadata in the Piwigo database. Piwigo scans the Exif data at image import time and stuffs the database with a number of useful values from there, including the image creation time that is fundamental to aligning images taken by different cameras on a timeline.

I could find no Piwigo interface to easily extract the on-disk filenames for a given set of photos, so I ended up playing with the underlying database directly. The Piwigo source tree contains a file piwigo_structure-mysql.sql used in the installation process to set up the database tables that served as a handy reference for figuring out the database schema. Looking at the piwigo_categories table, I learned that the "folder" I had uploaded all of the raw photos from my wife's camera to was category 109. After a couple hours of re-learning mysql/mariadb query semantics and just trying things against the database, this is the command that gave me the list of all the files I wanted:

select piwigo_images.path into outfile '/tmp/imagefiles'
  from piwigo_image_category, piwigo_images
  where piwigo_image_category.category_id=109
    and piwigo_images.date_creation >= '2019-12-14'
    and piwigo_image_category.image_id=piwigo_images.id;

That gave me a list of the on-disk file paths (relative to the Piwigo installation root) of images uploaded from my wife's camera since the start of this trip in a file. A trivial shell script loop using that list of paths quickly followed:

        cd /var/www/html/piwigo
        # shift each photo's Exif timestamps forward by 15 hours and 5 minutes,
        # running as www-data so file ownership stays correct for Piwigo
        for i in `cat /tmp/imagefiles`
        do
                echo $i
                sudo -u www-data jhead -ta+15:05 $i
        done

At this point, all the files on disk were updated, as a little quick checking with exif and exiv2 at the command line confirmed. But my second problem was figuring out how to get Piwigo to notice and incorporate the changes. That turned out to be easier than I thought! Using the admin interface to go into the photos batch manager, I was able to select all the photos in the folder we upload raw pictures from Karen's camera to that were taken in the relevant date range (which I expressed as taken:2019-12-14..2021), then selected all photos in the resulting set, and performed action "synchronize metadata". All the selected image files were rescanned, the database got updated...

Voila! Happy wife!

24 Jan 2020 9:09am GMT

23 Jan 2020

Planet Debian

Raphaël Hertzog: Freexian’s report about Debian Long Term Support, December 2019

A Debian LTS logo Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In December, 208.00 work hours have been dispatched among 14 paid contributors. Their reports are available:

Evolution of the situation

Though December was as quiet as was to be expected due to the holiday season, the usual amount of security updates was still released by our contributors.
We currently have 59 LTS sponsors, sponsoring 219h each month. Still, as always we are welcoming new LTS sponsors!

The security tracker currently lists 34 packages with a known CVE and the dla-needed.txt file has 33 packages needing an update.

Thanks to our sponsors

New sponsors are in bold.

23 Jan 2020 6:19pm GMT

21 Jan 2020

Planet Debian

Keith Packard: lca2020

Linux.conf.au 2020

I just got back from linux.conf.au 2020 on Saturday and am still adjusting to being home again. I had the opportunity to give three presentations during the conference and wanted to provide links to the slides and videos.

Picolibc

My first presentation was part of the Open ISA miniconf on Monday. I summarized the work I've been doing on a fork of Newlib called Picolibc which targets 32- and 64-bit embedded processors.

Snek

Wednesday morning, I presented on my snek language, which is a small Python designed for introducing programming in an embedded environment. I've been using this for the last year or more in a middle-school environment (grades 5-7) as a part of a LEGO robotics class.

X History and Politics

Bradley Kuhn has been encouraging me to talk about the early politics of X and how that has shaped my views on the benefits of copyleft licenses in building strong communities, especially in driving corporate cooperation and collaboration. I would have loved to also give this talk as a part of the Copyleft Conference being held in Brussels after FOSDEM, but I won't be at that event. This talk spans the early years of X, covering events up through 1992 or so.

21 Jan 2020 11:02pm GMT

Chris Lamb: Tour d'Orwell: Sutton Courtenay

(Previously in George Orwell-themed travel posts: Marrakesh, Hampstead, Paris, Southwold & Ipswich.)

George Orwell spent the last chapter of his life at the University College Hospital in London. Despite being gravely ill, arrangements were underway for him to travel to Switzerland for treatment. He had clearly not surrendered all hope as he had acquired his own fishing rod as well as some "proper" English tea to accompany him on the trip, although this is more likely to be a rare failure of facing unpleasant facts. In the end, he died in the early hours of 21st January 1950 from complications resulting from his chronic pneumonia.

He was buried in Sutton Courtenay, a small village approximately ten miles south of Oxford. Orwell had no personal connection to the village, but a lifelong love of the countryside must have encouraged a collaboration between David Astor (a longtime editor of The Observer newspaper) and Malcolm Muggeridge, better known today for introducing Mother Teresa to an international audience and some inexpensive commentary on Monty Python's The Life Of Brian.

I was expecting a few more fellow travellers to be there seventy years to the day of his death but in recompense I had a frosty yet beautifully quiet churchyard to myself. The surrounding Thames had swollen its banks onto the floodplain and the winter sunset a few hours later had a quality all its own.

21 Jan 2020 10:44pm GMT

Michael Stapelberg: distri: 20x faster initramfs (initrd) from scratch

In case you are not yet familiar with why an initramfs (or initrd, or initial ramdisk) is typically used when starting Linux, let me quote the wikipedia definition:

"[…] initrd is a scheme for loading a temporary root file system into memory, which may be used as part of the Linux startup process […] to make preparations before the real root file system can be mounted."

Many Linux distributions do not compile all file system drivers into the kernel, but instead load them on-demand from an initramfs, which saves memory.

Another common scenario, in which an initramfs is required, is full-disk encryption: the disk must be unlocked from userspace, but since userspace is encrypted, an initramfs is used.

Motivation

Thus far, building a distri disk image was quite slow:

This is on an AMD Ryzen 3900X 12-core processor (2019):

distri % time make cryptimage serial=1
80.29s user 13.56s system 186% cpu 50.419 total # 19s image, 31s initrd

Of these 50 seconds, dracut's initramfs generation accounts for 31 seconds (62%)!

Initramfs generation time drops to 8.7 seconds once dracut no longer needs to use the single-threaded gzip(1), but the multi-threaded replacement pigz(1):

This brings the total time to build a distri disk image down to:

distri % time make cryptimage serial=1
76.85s user 13.23s system 327% cpu 27.509 total # 19s image, 8.7s initrd

Clearly, when you use dracut on any modern computer, you should make pigz available. dracut should fail to compile unless one explicitly opts into the known-slower gzip. For more thoughts on optional dependencies, see "Optional dependencies don't work".

But why does it take 8.7 seconds still? Can we go faster?

The answer is Yes! I recently built a distri-specific initramfs I'm calling minitrd. I wrote both big parts from scratch:

  1. the initramfs generator program (distri initrd)
  2. a custom Go userland (cmd/minitrd), running as /init in the initramfs.

minitrd generates the initramfs image in ≈400ms, bringing the total time down to:

distri % time make cryptimage serial=1
50.09s user 8.80s system 314% cpu 18.739 total # 18s image, 400ms initrd

(The remaining time is spent in preparing the file system, then installing and configuring the distri system, i.e. preparing a disk image you can run on real hardware.)

How can minitrd be 20 times faster than dracut?

dracut is mainly written in shell, with a C helper program. It drives the generation process by spawning lots of external dependencies (e.g. ldd or the dracut-install helper program). I assume that the combination of using an interpreted language (shell) that spawns lots of processes and precludes a concurrent architecture is to blame for the poor performance.

minitrd is written in Go, with speed as a goal. It leverages concurrency and uses no external dependencies; everything happens within a single process (but with enough threads to saturate modern hardware).

Measuring early boot time using qemu, I measured the dracut-generated initramfs taking 588ms to display the full disk encryption passphrase prompt, whereas minitrd took only 195ms.

The rest of this article dives deeper into how minitrd works.

What does an initramfs do?

Ultimately, the job of an initramfs is to make the root file system available and continue booting the system from there. Depending on the system setup, this involves the following 5 steps:

1. Load kernel modules to access the block devices with the root file system

Depending on the system, the block devices with the root file system might already be present when the initramfs runs, or some kernel modules might need to be loaded first. On my Dell XPS 9360 laptop, the NVMe system disk is already present when the initramfs starts, whereas in qemu, we need to load the virtio_pci module, followed by the virtio_scsi module.

How will our userland program know which kernel modules to load? Linux kernel modules declare patterns for their supported hardware as an alias, e.g.:

initrd# grep virtio_pci lib/modules/5.4.6/modules.alias
alias pci:v00001AF4d*sv*sd*bc*sc*i* virtio_pci

Devices in sysfs have a modalias file whose content can be matched against these declarations to identify the module to load:

initrd# cat /sys/devices/pci0000:00/*/modalias
pci:v00001AF4d00001005sv00001AF4sd00000004bc00scFFi00
pci:v00001AF4d00001004sv00001AF4sd00000008bc01sc00i00
[…]

Hence, for the initial round of module loading, it is sufficient to locate all modalias files within sysfs and load the responsible modules.
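
As a rough illustration of that initial pass (not minitrd's actual code), the following Go sketch walks sysfs, reads every modalias file, and matches it against modules.alias. It assumes filepath.Match is a close-enough stand-in for the kernel's glob syntax, and it does a naive linear scan over all aliases.

    // Sketch of the initial modalias pass: read modules.alias, walk sysfs, and
    // report which module is responsible for each device. The linear scan over
    // all alias patterns is slow but keeps the example short.
    package main

    import (
        "bufio"
        "fmt"
        "os"
        "path/filepath"
        "strings"
    )

    func loadAliases(path string) (map[string]string, error) {
        f, err := os.Open(path)
        if err != nil {
            return nil, err
        }
        defer f.Close()
        aliases := make(map[string]string) // glob pattern -> module name
        sc := bufio.NewScanner(f)
        for sc.Scan() {
            // Lines look like: alias pci:v00001AF4d*sv*sd*bc*sc*i* virtio_pci
            fields := strings.Fields(sc.Text())
            if len(fields) == 3 && fields[0] == "alias" {
                aliases[fields[1]] = fields[2]
            }
        }
        return aliases, sc.Err()
    }

    func main() {
        aliases, err := loadAliases("/lib/modules/5.4.6/modules.alias")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        filepath.Walk("/sys/devices", func(p string, info os.FileInfo, err error) error {
            if err != nil || info.IsDir() || info.Name() != "modalias" {
                return nil
            }
            b, err := os.ReadFile(p)
            if err != nil {
                return nil
            }
            modalias := strings.TrimSpace(string(b))
            for pattern, module := range aliases {
                if ok, _ := filepath.Match(pattern, modalias); ok {
                    fmt.Printf("%s -> %s\n", modalias, module)
                }
            }
            return nil
        })
    }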

Loading a kernel module can result in new devices appearing. When that happens, the kernel sends a uevent, which the uevent consumer in userspace receives via a netlink socket. Typically, this consumer is udev(7), but in our case, it's minitrd.

For each uevent message that comes with a MODALIAS variable, minitrd will load the relevant kernel module(s).
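
A minimal sketch of such a uevent listener (again, not minitrd's actual code): bind a netlink socket to the kernel's uevent multicast group and scan each message for a MODALIAS= key. Buffer size and error handling are simplified.

    // Sketch of a uevent consumer: netlink group 1 is the kernel's uevent
    // multicast group; each message is a NUL-separated list of KEY=VALUE pairs.
    package main

    import (
        "fmt"
        "strings"

        "golang.org/x/sys/unix"
    )

    func main() {
        fd, err := unix.Socket(unix.AF_NETLINK, unix.SOCK_DGRAM, unix.NETLINK_KOBJECT_UEVENT)
        if err != nil {
            panic(err)
        }
        if err := unix.Bind(fd, &unix.SockaddrNetlink{Family: unix.AF_NETLINK, Groups: 1}); err != nil {
            panic(err)
        }
        buf := make([]byte, 16*1024)
        for {
            n, _, err := unix.Recvfrom(fd, buf, 0)
            if err != nil {
                panic(err)
            }
            for _, kv := range strings.Split(string(buf[:n]), "\x00") {
                if strings.HasPrefix(kv, "MODALIAS=") {
                    fmt.Println("would load module for", strings.TrimPrefix(kv, "MODALIAS="))
                }
            }
        }
    }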

When loading a kernel module, its dependencies need to be loaded first. Dependency information is stored in the modules.dep file in a Makefile-like syntax:

initrd# grep virtio_pci lib/modules/5.4.6/modules.dep
kernel/drivers/virtio/virtio_pci.ko: kernel/drivers/virtio/virtio_ring.ko kernel/drivers/virtio/virtio.ko

To load a module, we can open its file and then call the Linux-specific finit_module(2) system call. Some modules are expected to return an error code, e.g. ENODEV or ENOENT when some hardware device is not actually present.
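
Loading a single module this way takes only a few lines with the golang.org/x/sys/unix wrappers. This sketch assumes the caller has already resolved dependencies via modules.dep, and treats EEXIST as success:

    // Sketch of loading one module with finit_module(2); dependency ordering via
    // modules.dep is assumed to have been handled by the caller.
    package main

    import (
        "fmt"
        "os"

        "golang.org/x/sys/unix"
    )

    func loadModule(path string) error {
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()
        err = unix.FinitModule(int(f.Fd()), "", 0)
        if err == unix.EEXIST {
            // The module (or a built-in equivalent) is already loaded.
            return nil
        }
        return err
    }

    func main() {
        if err := loadModule("/lib/modules/5.4.6/kernel/drivers/virtio/virtio_pci.ko"); err != nil {
            fmt.Fprintln(os.Stderr, "module load failed:", err)
        }
    }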

Side note: next to the textual versions, there are also binary versions of the modules.alias and modules.dep files. Presumably, those can be queried more quickly, but for simplicity, I have not (yet?) implemented support in minitrd.

2. Console settings: font, keyboard layout

Setting a legible font is necessary for hi-dpi displays. On my Dell XPS 9360 (3200 x 1800 QHD+ display), the following works well:

initrd# setfont latarcyrheb-sun32

Setting the user's keyboard layout is necessary for entering the LUKS full-disk encryption passphrase in their preferred keyboard layout. I use the NEO layout:

initrd# loadkeys neo

3. Block device identification

In the Linux kernel, block device enumeration order is not necessarily the same on each boot. Even if it was deterministic, device order could still be changed when users modify their computer's device topology (e.g. connect a new disk to a formerly unused port).

Hence, it is good style to refer to disks and their partitions with stable identifiers. This also applies to boot loader configuration, and so most distributions will set a kernel parameter such as root=UUID=1fa04de7-30a9-4183-93e9-1b0061567121.

Identifying the block device or partition with the specified UUID is the initramfs's job.

Depending on what the device contains, the UUID comes from a different place. For example, ext4 file systems have a UUID field in their file system superblock, whereas LUKS volumes have a UUID in their LUKS header.

Canonically, probing a device to extract the UUID is done by libblkid from the util-linux package, but the logic can easily be re-implemented in other languages and changes rarely. minitrd comes with its own implementation to avoid cgo or running the blkid(8) program.
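
For illustration, probing an ext4 UUID by hand is not much code. This sketch assumes the standard ext2/3/4 superblock layout (superblock at byte 1024, magic 0xEF53 at offset 0x38, UUID at offset 0x68) and skips everything else a real blkid does:

    // Minimal ext4 UUID probe: read the superblock at byte 1024, check the
    // 0xEF53 magic, and format the 16-byte UUID found at offset 0x68.
    package main

    import (
        "encoding/binary"
        "fmt"
        "os"
    )

    func ext4UUID(device string) (string, error) {
        f, err := os.Open(device)
        if err != nil {
            return "", err
        }
        defer f.Close()
        sb := make([]byte, 1024)
        if _, err := f.ReadAt(sb, 1024); err != nil {
            return "", err
        }
        if binary.LittleEndian.Uint16(sb[0x38:0x3a]) != 0xEF53 {
            return "", fmt.Errorf("%s: not an ext2/3/4 file system", device)
        }
        u := sb[0x68 : 0x68+16]
        return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16]), nil
    }

    func main() {
        uuid, err := ext4UUID("/dev/sda2")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        fmt.Println(uuid)
    }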

4. LUKS full-disk encryption unlocking (only on encrypted systems)

Unlocking a LUKS-encrypted volume is done in userspace. The kernel handles the crypto, but reading the metadata, obtaining the passphrase (or e.g. key material from a file) and setting up the device mapper table entries are done in user space.

initrd# modprobe algif_skcipher
initrd# cryptsetup luksOpen /dev/sda4 cryptroot1

After the user entered their passphrase, the root file system can be mounted:

initrd# mount /dev/dm-0 /mnt

5. Continuing the boot process (switch_root)

Now that everything is set up, we need to pass execution to the init program on the root file system with a careful sequence of chdir(2), mount(2), chroot(2), chdir(2) and execve(2) system calls that is explained in this busybox switch_root comment.

initrd# mount -t devtmpfs dev /mnt/dev
initrd# exec switch_root -c /dev/console /mnt /init
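
The same dance, sketched with Go's unix wrappers instead of busybox; the paths are placeholders and error handling is trimmed:

    // The switch_root dance with Go's unix wrappers: move the mount over /,
    // chroot into it, and replace this process (PID 1) with the real init.
    package main

    import "golang.org/x/sys/unix"

    func switchRoot(newRoot, init string) error {
        if err := unix.Chdir(newRoot); err != nil {
            return err
        }
        if err := unix.Mount(".", "/", "", unix.MS_MOVE, ""); err != nil {
            return err
        }
        if err := unix.Chroot("."); err != nil {
            return err
        }
        if err := unix.Chdir("/"); err != nil {
            return err
        }
        return unix.Exec(init, []string{init}, []string{"PATH=/bin"})
    }

    func main() {
        if err := switchRoot("/mnt", "/init"); err != nil {
            panic(err)
        }
    }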

To conserve RAM, the files in the temporary file system to which the initramfs archive is extracted are typically deleted.

How is an initramfs generated?

An initramfs "image" (more accurately: archive) is a compressed cpio archive. Typically, gzip compression is used, but the kernel supports a bunch of different algorithms and distributions such as Ubuntu are switching to lz4.

Generators typically prepare a temporary directory and feed it to the cpio(1) program. In minitrd, we read the files into memory and generate the cpio archive using the go-cpio package. We use the pgzip package for parallel gzip compression.
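
A rough sketch of that generation step, assuming the go-cpio API mirrors archive/tar (header struct, WriteHeader, Write) as it documents itself, and with a placeholder file list:

    // Sketch: write a gzip-compressed newc cpio archive entirely in-process.
    // The file list is a placeholder; cpio modes include the file type bits,
    // so 0100755 is a regular file with rwxr-xr-x permissions.
    package main

    import (
        "log"
        "os"

        cpio "github.com/cavaliercoder/go-cpio"
        pgzip "github.com/klauspost/pgzip"
    )

    func main() {
        out, err := os.Create("/tmp/initrd")
        if err != nil {
            log.Fatal(err)
        }
        defer out.Close()

        gz := pgzip.NewWriter(out) // parallel gzip compression
        defer gz.Close()
        cw := cpio.NewWriter(gz) // newc-format cpio archive
        defer cw.Close()

        for _, name := range []string{"init", "lib/modules/5.4.6/modules.dep"} {
            body, err := os.ReadFile(name)
            if err != nil {
                log.Fatal(err)
            }
            hdr := &cpio.Header{Name: name, Mode: 0100755, Size: int64(len(body))}
            if err := cw.WriteHeader(hdr); err != nil {
                log.Fatal(err)
            }
            if _, err := cw.Write(body); err != nil {
                log.Fatal(err)
            }
        }
    }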

The following files need to go into the cpio archive:

minitrd Go userland

The minitrd binary is copied into the cpio archive as /init and will be run by the kernel after extracting the archive.

Like the rest of distri, minitrd is built statically without cgo, which means it can be copied as-is into the cpio archive.

Linux kernel modules

Aside from the modules.alias and modules.dep metadata files, the kernel modules themselves reside in e.g. /lib/modules/5.4.6/kernel and need to be copied into the cpio archive.

Copying all modules results in a ≈80 MiB archive, so it is common to only copy modules that are relevant to the initramfs's features. This reduces archive size to ≈24 MiB.

The filtering relies on hard-coded patterns and module names. For example, disk encryption related modules are all kernel modules underneath kernel/crypto, plus kernel/drivers/md/dm-crypt.ko.

When generating a host-only initramfs (works on precisely the computer that generated it), some initramfs generators look at the currently loaded modules and just copy those.

Console Fonts and Keymaps

The kbd package's setfont(8) and loadkeys(1) programs load console fonts and keymaps from /usr/share/consolefonts and /usr/share/keymaps, respectively.

Hence, these directories need to be copied into the cpio archive. Depending on whether the initramfs should be generic (work on many computers) or host-only (works on precisely the computer/settings that generated it), the entire directories are copied, or only the required font/keymap.

cryptsetup, setfont, loadkeys

These programs are (currently) required because minitrd does not implement their functionality.

As they are dynamically linked, not only the programs themselves need to be copied, but also the ELF dynamic linking loader (path stored in the .interp ELF section) and any ELF library dependencies.

For example, cryptsetup in distri declares the ELF interpreter /ro/glibc-amd64-2.27-3/out/lib/ld-linux-x86-64.so.2 and declares dependencies on shared libraries libcryptsetup.so.12, libblkid.so.1 and others. Luckily, in distri, packages contain a lib subdirectory containing symbolic links to the resolved shared library paths (hermetic packaging), so it is sufficient to mirror the lib directory into the cpio archive, recursing into shared library dependencies of shared libraries.
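
Discovering the interpreter and the DT_NEEDED entries is straightforward with the standard debug/elf package. This sketch prints them for one binary and leaves the distri-specific resolution of library names to on-disk paths out:

    // Print the ELF interpreter and DT_NEEDED entries of a dynamic executable.
    // Resolving those names to paths (distri's lib/ symlink farm) is left out.
    package main

    import (
        "bytes"
        "debug/elf"
        "fmt"
        "log"
        "os"
    )

    func main() {
        if len(os.Args) < 2 {
            log.Fatal("usage: elfdeps <binary>")
        }
        f, err := elf.Open(os.Args[1])
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        if s := f.Section(".interp"); s != nil {
            data, _ := s.Data()
            fmt.Printf("interpreter: %s\n", bytes.TrimRight(data, "\x00"))
        }
        libs, err := f.ImportedLibraries() // DT_NEEDED entries
        if err != nil {
            log.Fatal(err)
        }
        for _, l := range libs {
            fmt.Println("needs", l)
        }
    }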

cryptsetup also requires the GCC runtime library libgcc_s.so.1 to be present at runtime, and will abort with an error message about not being able to call pthread_cancel(3) if it is unavailable.

time zone data

To print log messages in the correct time zone, we copy /etc/localtime from the host into the cpio archive.

minitrd outside of distri?

I currently have no desire to make minitrd available outside of distri. While the technical challenges (such as extending the generator to not rely on distri's hermetic packages) are surmountable, I don't want to support people's initramfs remotely.

Also, I think that people's efforts should in general be spent on rallying behind dracut and making it work faster, thereby benefiting all Linux distributions that use dracut (increasingly more). With minitrd, I have demonstrated that significant speed-ups are achievable.

Conclusion

It was interesting to dive into how an initramfs really works. I had been working with the concept for many years, from small tasks such as "debug why the encrypted root file system is not unlocked" to more complicated tasks such as "set up a root file system on DRBD for a high-availability setup". But even with that sort of experience, I didn't know all the details, until I was forced to implement every little thing.

As I suspected going into this exercise, dracut is much slower than it needs to be. Re-implementing its generation stage in a modern language instead of shell helps a lot.

Of course, my minitrd does a bit less than dracut, but not drastically so. The overall architecture is the same.

I hope my effort helps with two things:

  1. As a teaching implementation: instead of wading through the various components that make up a modern initramfs (udev, systemd, various shell scripts, …), people can learn about how an initramfs works in a single place.

  2. I hope the significant time difference motivates people to improve dracut.

Appendix: qemu development environment

Before writing any Go code, I did some manual prototyping. Learning how other people prototype is often immensely useful to me, so I'm sharing my notes here.

First, I copied all kernel modules and a statically built busybox binary:

% mkdir -p lib/modules/5.4.6
% cp -Lr /ro/lib/modules/5.4.6/* lib/modules/5.4.6/
% cp ~/busybox-1.22.0-amd64/busybox sh

To generate an initramfs from the current directory, I used:

% find . | cpio -o -H newc | pigz > /tmp/initrd

In distri's Makefile, I append these flags to the QEMU invocation:

-kernel /tmp/kernel \
-initrd /tmp/initrd \
-append "root=/dev/mapper/cryptroot1 rdinit=/sh ro console=ttyS0,115200 rd.luks=1 rd.luks.uuid=63051f8a-54b9-4996-b94f-3cf105af2900 rd.luks.name=63051f8a-54b9-4996-b94f-3cf105af2900=cryptroot1 rd.vconsole.keymap=neo rd.vconsole.font=latarcyrheb-sun32 init=/init systemd.setenv=PATH=/bin rw vga=836"

The vga= mode parameter is required for loading font latarcyrheb-sun32.

Once in the busybox shell, I manually prepared the required mount points and kernel modules:

ln -s sh mount
ln -s sh lsmod
mkdir /proc /sys /run /mnt
mount -t proc proc /proc
mount -t sysfs sys /sys
mount -t devtmpfs dev /dev
modprobe virtio_pci
modprobe virtio_scsi

As a next step, I copied cryptsetup and dependencies into the initramfs directory:

% for f in /ro/cryptsetup-amd64-2.0.4-6/lib/*; do full=$(readlink -f $f); rel=$(echo $full | sed 's,^/,,g'); mkdir -p $(dirname $rel); install $full $rel; done
% ln -s ld-2.27.so ro/glibc-amd64-2.27-3/out/lib/ld-linux-x86-64.so.2
% cp /ro/glibc-amd64-2.27-3/out/lib/ld-2.27.so ro/glibc-amd64-2.27-3/out/lib/ld-2.27.so
% cp -r /ro/cryptsetup-amd64-2.0.4-6/lib ro/cryptsetup-amd64-2.0.4-6/
% mkdir -p ro/gcc-libs-amd64-8.2.0-3/out/lib64/
% cp /ro/gcc-libs-amd64-8.2.0-3/out/lib64/libgcc_s.so.1 ro/gcc-libs-amd64-8.2.0-3/out/lib64/libgcc_s.so.1
% ln -s /ro/gcc-libs-amd64-8.2.0-3/out/lib64/libgcc_s.so.1 ro/cryptsetup-amd64-2.0.4-6/lib
% cp -r /ro/lvm2-amd64-2.03.00-6/lib ro/lvm2-amd64-2.03.00-6/

In busybox, I used the following commands to unlock the root file system:

modprobe algif_skcipher
./cryptsetup luksOpen /dev/sda4 cryptroot1
mount /dev/dm-0 /mnt

21 Jan 2020 5:19pm GMT

20 Jan 2020

Planet Debian

Jonathan Dowland: Self-hosted web fonts

Today on Lobsters I found a link to Kev Quirk's blog post How to self-host your web fonts. For the last nine years I've been using Google's font-hosting service, which, whilst very convenient, carried some privacy concerns (which Joey Hess originally brought to my attention) and (it turns out) does not appear to have been faster, in network terms, than bundling what I was using locally. This is something I've been meaning to get around to doing for almost that long.

20 Jan 2020 3:08pm GMT

Dirk Eddelbuettel: anytime 0.3.7

A fresh minor release of the anytime package is arriving on CRAN right now. This is the eighteenth release, and it comes roughly five months after the previous one, showing the relative feature-stability we have now.

anytime is a very focused package aiming to do just one thing really well: to convert anything in integer, numeric, character, factor, ordered, … format to either POSIXct or Date objects - and to do so without requiring a format string. See the anytime page, or the GitHub README.md for a few examples.

This release brings a clever new option, thanks to Stephen Froehlich. If you know your input has lots of duplicates you can now say so, and anytime() (and the other entry points for times and dates, UTC or not) will only parse the unique entries, leading to potentially rather large speed gains (as in Stephen's case, where he often has more than 95% of the data as duplicates). We also tweaked the test setup some more, but as we are still unable to replicate what is happening with the Fedora test boxen at CRAN due to their non-reproducible setup, this remains a bit of guesswork. Lastly, I am making use of a new Rcpp #define to speed up compilation a little bit too.

The full list of changes follows.

Changes in anytime version 0.3.7 (2020-01-20)

  • Test and possibly condition away one more test file.

  • Small enhancement for compilation by setting no-rtti define via Rcpp.

  • New option calcUnique for speed-up by parsing only unique timestamps (Stephen Froehlich in #110 fixing #109).

Courtesy of CRANberries, there is a comparison to the previous release. More information is on the anytime page. The issue tracker off the GitHub repo can be used for questions and comments.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

20 Jan 2020 1:58pm GMT

Matthew Garrett: Verifying your system state in a secure and private way

Most modern PCs have a Trusted Platform Module (TPM) and firmware that, together, support something called Trusted Boot. In Trusted Boot, each component in the boot chain generates a series of measurements of the next component of the boot process and relevant configuration. These measurements are pushed to the TPM where they're combined with the existing values stored in a series of Platform Configuration Registers (PCRs) in such a way that the final PCR value depends on both the value and the order of the measurements it's given. If any measurements change, the final PCR value changes.

Windows takes advantage of this with its Bitlocker disk encryption technology. The disk encryption key is stored in the TPM along with a policy that tells it to release it only if a specific set of PCR values is correct. By default, the TPM will release the encryption key automatically if the PCR values match and the system will just transparently boot. If someone tampers with the boot process or configuration, the PCR values will no longer match and boot will halt to allow the user to provide the disk key in some other way.

Unfortunately the TPM keeps no record of how it got to a specific state. If the PCR values don't match, that's all we know - the TPM is unable to tell us what changed to result in this breakage. Fortunately, the system firmware maintains an event log as we go along. Each measurement that's pushed to the TPM is accompanied by a new entry in the event log, containing not only the hash that was pushed to the TPM but also metadata that tells us what was measured and why. Since the algorithm the TPM uses to calculate the hash values is known, we can replay the same values from the event log and verify that we end up with the same final value that's in the TPM. We can then examine the event log to see what changed.
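
The replay itself is just repeated hashing: each extend sets the new PCR value to the SHA-256 of the old value concatenated with the measurement digest. A minimal sketch with placeholder digests (a real tool parses them out of the TCG event log):

    // Replay SHA-256 PCR extends: new PCR = SHA256(old PCR || measurement digest).
    // The digests below are placeholders standing in for event log entries.
    package main

    import (
        "crypto/sha256"
        "fmt"
    )

    func extend(pcr, digest [32]byte) [32]byte {
        return sha256.Sum256(append(pcr[:], digest[:]...))
    }

    func main() {
        var pcr [32]byte // PCRs start out all-zero
        events := [][32]byte{
            sha256.Sum256([]byte("measurement of some boot component")),
            sha256.Sum256([]byte("measurement of some configuration")),
        }
        for _, e := range events {
            pcr = extend(pcr, e)
        }
        // Compare this against the TPM's signed quote for the same PCR.
        fmt.Printf("replayed PCR value: %x\n", pcr)
    }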

Unfortunately, the event log is stored in unprotected system RAM. In order to be able to trust it we need to compare the values in the event log (which can be tampered with) with the values in the TPM (which are much harder to tamper with). Unfortunately if someone has tampered with the event log then they could also have tampered with the bits of the OS that are doing that comparison. Put simply, if the machine is in a potentially untrustworthy state, we can't trust that machine to tell us anything about itself.

This is solved using a procedure called Remote Attestation. The TPM can be asked to provide a digital signature of the PCR values, and this can be passed to a remote system along with the event log. That remote system can then examine the event log, make sure it corresponds to the signed PCR values and make a security decision based on the contents of the event log rather than just on the final PCR values. This makes the system significantly more flexible and aids diagnostics. Unfortunately, it also means you need a remote server and an internet connection and then some way for that remote server to tell you whether it thinks your system is trustworthy and also you need some way to believe that the remote server is trustworthy and all of this is, well, not ideal if you're not an enterprise.

Last week I gave a talk at linux.conf.au on one way around this. Basically, remote attestation places no constraints on the network protocol in use - while the implementations that exist all do this over IP, there's no requirement for them to do so. So I wrote an implementation that runs over Bluetooth, in theory allowing you to use your phone to serve as the remote agent. If you trust your phone, you can use it as a tool for determining if you should trust your laptop.

I've pushed some code that demos this. The current implementation does nothing other than tell you whether UEFI Secure Boot was enabled or not, and it's also not currently running on a phone. The phone bit of this is pretty straightforward to fix, but the rest is somewhat harder.

The big issue we face is that we frequently don't know what event log values we should be seeing. The first few values are produced by the system firmware and there's no standardised way to publish the expected values. The Linux Vendor Firmware Service has support for publishing these values, so for some systems we can get hold of this. But then you get to measurements of your bootloader and kernel, and those change every time you do an update. Ideally we'd have tooling for Linux distributions to publish known good values for each package version and for that to be common across distributions. This would allow tools to download metadata and verify that measurements correspond to legitimate builds from the distribution in question.

This does still leave the problem of the initramfs. Since initramfs files are usually generated locally, and depend on the locally installed versions of tools at the point they're built, we end up with no good way to precalculate those values. I proposed a possible solution to this a while back, but have done absolutely nothing to help make that happen. I suck. The right way to do this may actually just be to turn initramfs images into pre-built artifacts and figure out the config at runtime (dracut actually supports a bunch of this already), so I'm going to spend a while playing with that.

If we can pull these pieces together then we can get to a place where you can boot your laptop and then, before typing any authentication details, have your phone compare each component in the boot process to expected values. Assistance in all of this extremely gratefully received.

20 Jan 2020 12:53pm GMT

Russ Allbery: DocKnot 3.03

DocKnot is the software that I use to generate package documentation and web pages, and increasingly to generate release tarballs.

The main change in this release is to use IO::Uncompress::Gunzip and IO::Compress::Xz to generate a missing xz tarball when needed, instead of forking external programs (which causes all sorts of portability issues). Thanks to Slaven Rezić for the testing and report.

This release adds two new badges to README.md files: a version badge for CPAN packages pushed to GitHub, and a Debian version badge for packages with a corresponding Debian package.

This release also makes the tarball checking done as part of the release process (to ensure all files are properly included in the release) a bit more flexible by adding a distribution/ignore metadata setting containing a list of regular expressions matching files to ignore for checking purposes.

Finally, this release fixes a bug that leaked $@ modifications to the caller of App::DocKnot::Config.

You can get the latest version from the DocKnot distribution page.

20 Jan 2020 3:45am GMT

19 Jan 2020

Planet Debian

Enrico Zini: Food links

Periodic graphics: The chemistry of frozen desserts

food chart archive.org
2020-01-20

Chemical educator and Compound Interest blogger Andy Brunning samples the science behind ice cream and other icy treats

Play with your food: How to Make Sconic Sections | Evil Mad Scientist Laboratories
food archive.org
2020-01-20

The conic sections are the four classic geometric curves that can occur at the intersection between a cone and a plane: the circle, ellipse, parabola, and hyperbola.

Scott And Scurvy
health history archive.org
2020-01-20

«Writing about the first winter the men spent on the ice, Cherry-Garrard casually mentions an astonishing lecture on scurvy by one of the expedition's doctors…»

Mineral waters à la carte
food chart archive.org
2020-01-20

Cloning popular brands of mineral water is now simpler than ever before with the updated version of the mineral water calculator! When I blogged about DIY mineral water last year it was mainly a th…

Perfect egg yolks
food chart archive.org
2020-01-20

Maybe I have a hangup on soft boiled eggs, but I'm deeply fascinated by how something simple as an egg can be transformed into such a wide range of textures. I'm talking about pure eggs…

Perfect egg yolks (part 2)
food chart archive.org
2020-01-20

Egg cooked for 40 min at 63.0 °C. The pictures were taken within 6 seconds and are shown in the order they were taken. My immersion circulator is working again! And the first thing I decided to do …

Recreational kitchen mathematics: Cookie tessellations
archive.org
2020-01-20

Is there a way to avoid all that extra dough in between the cookies? (Photo: Christmas Tree Cookie Cutter from Bigstock) It should come as no surprise that food, chemistry and mathematics meet in b…

Il coriandolo che divide - Scienza in cucina - Blog - Le Scienze
food archive.org
2020-01-20

Coriander (Coriandrum sativum), also known as cilantro, its Spanish name, is one of the oldest aromatic herbs known. Mentioned in the Bible, its seeds have been found in Egyptian tombs, and in Roman times it was used both as a medicinal herb and as a seasoning. The culinary use of the fresh leaves and dried seeds later spread throughout the world, and it is now widely used in the cuisines of Mexico and Latin America to prepare the traditional salsa that accompanies tortillas, and in the Middle East and some Asian countries such as Thailand and India to flavour many recipes.

19 Jan 2020 11:00pm GMT

Dirk Eddelbuettel: RPushbullet 0.3.3

RPushbullet demo

Release 0.3.3 of the RPushbullet package just got to CRAN. RPushbullet offers an interface to the neat Pushbullet service for inter-device messaging, communication, and more. It lets you easily send (programmatic) alerts like the one to the left to your browser, phone, tablet, … - or all at once.

This release further robustifies operations via two contributed PRs. The first by Chan-Yub ensures we set UTF-8 encoding on pushes. The second by Alexandre permits downgrading from HTTP/2 to HTTP/1.1, which he needed for some operations with a particular backend. I made that PR a bit more general by turning the downgrade into one driven by a new options() toggle. Special thanks also to Jeroen for help debugging this issue. See below for more details.

Changes in version 0.3.3 (2020-01-18)

  • UTF-8 encoding is now used (Chan-Yub Park in #55).

  • Curl can use HTTP/1.1 (Alexandre Shannon in #59 fixing #57, plus Dirk in #60 making it optional).

Courtesy of CRANberries, there is also a diffstat report for this release. More details about the package are at the RPushbullet webpage and the RPushbullet GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

19 Jan 2020 6:14pm GMT

18 Jan 2020

Planet Debian

Ingo Juergensmann: XMPP - Fun with Clients

As I already wrote in my last blog post, there's much development in XMPP, not only on the server side but also on the client side. It's surely no exaggeration to say that Conversations on Android is the de-facto standard client. So, if you have an Android phone, that's the client you want to try and use. As I don't have Android, I can't comment on it. The situation on Linux is good as well: there are clients such as Gajim, which is an old player in the "market" and is available on other platforms too, and with Dino there is a newer, modern client as well that you may want to try out.

The situation for macOS and iOS users is not as good as for Windows, Linux or Android users. But in the end all clients have their pros and cons... I'll try to summarize a few clients on Linux, macOS and iOS...

Linux

Gajim

Fully featured multiprotocol client with lots of available plugins. If you want to use OMEMO with Gajim you need to enable it in your plugin settings. There is even a plugin for letting the keyboard LED blink when there are new/unread messages. I found that a little bit annoying, so I disabled that. Gajim is an old-style client with the well-known layout of a contact list window and one for the actual chats. Gajim has some nice features like service discovery on your own or remote servers.

Dino

Dino is available in Debian as dino-im and is a quite new client, which you will find out at first start: it's a single window app where the focus is on the chats. There is no contact list at first glance where you can see whether or not your contacts are online. You find your contacts when you want to start a conversation with them. I don't find this bad or good. It's just different and puts the chat into focus, as said, maybe similar to WhatsApp, Signal or other messengers nowadays where you just send messages back and forth and don't care if the contact is online. The contact will receive and read your message when the contact is online and will eventually answer your message.

macOS

Monal

Monal is an actively developed and maintained client for macOS and iOS. If you want to try out Monal, don't waste your time with the older client, but focus on Monal Catalyst (direct download link). Catalyst shares the same code as the iOS version of Monal and will become the default Monal download within the next few weeks. It's far easier for the developers to focus on one codebase than on two different ones. Monal has great potential, but also has some issues. For some reason it seems that some messages from time to time will be sent multiple times to a contact or a MUC. The developers are very helpful and supportive. So when you find a bug or issue, please report back.

BeagleIM

BeagleIM is a free XMPP client by Tigase, whose business is to sell their XMPP Communication suite and professional support for it. They provide clients for Android, macOS and iOS. Maybe for that reason their clients seem to be very mature, but of course they will work best with their own server software. That doesn't mean that the clients won't work well with other 3rd-party XMPP servers, just that their main focus will be their own server software. However, BeagleIM seems to work well with Prosody and ejabberd, and when you have issues you can also reach out to Tigase on Mastodon, which I find a very big plus! They are really helpful there as well. The BeagleIM client is currently my main client on macOS and it works quite well. As you can see it's more or less chat-focused as well by default, but you can open a contact list window if you want to see your available contacts or all of them. The only issue I personally have at the moment is that it seems to have problems with ejabberd: in the contact list and account preferences I see the accounts/contacts with ejabberd going offline/online every few minutes. There are some log entries in ejabberd that seem to be timeout-related. I'm not sure whether this is an issue with ejabberd or with BeagleIM - or a rare combination of both.

iOS

ChatSecure

ChatSecure is one of the first XMPP clients I installed on iOS and I used it for a long time. It works mostly very well, supports OMEMO (like all the other clients I mention here) and it seems to work well with bookmarks (i.e. use a list of MUCs to join). The only issues I have with ChatSecure currently are: 1) when ChatSecure comes to the front after a deep sleep state on the iPhone, it presents an empty screen and no accounts in settings. You need to quit ChatSecure and restart it to have it working again. Quite annoying. 2) when restarted, it polls all messages again from MAM (message archives) over and over again. A small annoyance where you have to decide whether it's better to get duplicated messages or to maybe miss a message.

Monal

What was valid for Monal Catalyst on macOS is also (more or less) true for Monal on iOS. As said: they share the same code base. It is usable, but sometimes has issues joining MUCs: it appears as if you can't join a MUC, but suddenly you receive a message in the MUC and then it is listed under Chats and you can use it.

SiskinIM

Siskin is the iOS client from Tigase. It also seems to be very mature, but has some special caveats: in the settings you can configure Push Notifications and HTTP Uploads for each and every client. Other clients do this automatically, and I leave it to you to decide whether it is a nice feature that you can configure it or if it is a little bit annoying, because when you don't know that, you will be wondering why you can't upload files/photos in a chat. Maybe uploading files will work with the Tigase XMPP server, but it doesn't seem to work on my servers.

So, in the end, there are good/promising clients on iOS and macOS, but every client seem to have its own pitfalls. On iOS all three clients do support and use Apple Push Notifications, but you should choose carefully for which one you want to enable it. I can tell you it's a little bit annoying to test three clients and have Push Notifications turned on for all of them and have joined in several MUCs and get all notifications three times for every message... ;-)

MUCs are a special topic I need to investigate a little more in the future: when your server supports Bookmarks for MUCs I would assume that it should be working for all supporting clients and you only need to join a MUC on one client and have that MUC at least in your list of bookmarks. Whether you want to join that MUC on every client might be another story. But I don't know if this is the intended behaviour of the XEP in question or if my assumption of how it should work is just wrong.

In the end the situation for XMPP clients on macOS and iOS is much better than it was 1-2 years ago. Though it is not as good as on Android, you can help improve the situation by testing the available clients and giving feedback to the developers by either joining the appropriate MUCs or - even better - filing issues on their GitHub pages!

18 Jan 2020 5:58pm GMT

Russ Allbery: Term::ANSIColor 5.01

This is the module included in Perl core that provides support for ANSI color escape sequences.

This release adds support for the NO_COLOR environment variable (thanks, Andrea Telatin) and fixes an error in the example of uncolor() in the documentation (thanks, Joe Smith). It also documents that color aliases are expanded during alias definition, so while you can define an alias in terms of another alias, they don't remain linked during future changes.

You can get the latest release from CPAN or from the Term::ANSIColor distribution page.

18 Jan 2020 3:20am GMT

Mike Hommey: Announcing git-cinnabar 0.5.3

Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull and push from/to mercurial remote repositories, using git.

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What's new since 0.5.2?

18 Jan 2020 2:49am GMT

17 Jan 2020

Planet Debian

Jonathan McDowell: A beginner tries PCB assembly

I wrote last year about my experience with making my first PCB using JLCPCB. I've now got 5 of the boards in production around my house, and another couple assembled on my desk for testing. I also did a much simpler board to mount a GPS module on my MapleBoard - basically just with a suitable DIP connector and mount point for the GPS module. At that point I ended up having to pay for shipping; not being in a hurry I went for the cheapest option which mean the total process took 2 weeks from order until it arrived. Still not bad for under $8!

Just before Christmas I discovered that JLCPCB had expanded their SMT assembly option beyond the Chinese market, and were offering coupons off (but even without that had much, much lower assembly/setup fees than anywhere else I'd seen). Despite being part of LCSC, the parts library can be a bit limited (partly, it seems, because there's nothing complex to assemble such as connectors), with a set of "basic" components without a setup fee and then "extended" options which have a $3 setup fee (because they're not permanently loaded, AIUI).

To test out the service I decided to revise my IoT board. First, I've used a few for 12V LED strip control which has meant the 3.3V LDO is working harder than ideal, so I wanted to switch (ha ha) to a buck converter. I worked back from the JLCPCB basic parts list and chose an MP2451, which had a handy data sheet with an example implementation. I also upgraded the ESP module to an ESP32-WROOM - I've had some issues with non-flickery PWM on the ESP8266 and the ESP32 has hardware PWM. I also have some applications the Bluetooth would be useful for. Once again I turned to KiCad to draw the schematic and lay out the board. I kept the same form factor for ease, as I knew I could get a case for it. The more complex circuitry was a bit harder to lay out in the same space, and the assembly has a limitation of being single sided which complicates things further, but the fact it was being done for me meant I could drop to 0603 parts.

All-in-all I ended up with 17 parts for the board assembly, with the ESP32 module and power connectors being my responsibility (JLCPCB only have the basic ESP32 chip and I did not feel like trying to design a PCB antenna). I managed to get everything except the inductor from the basic library, which kept costs down. Total cost for 10 boards, parts, assembly, shipping + customs fees was just under $29 which is amazing value to me. What's even better is that the DFM (design for manufacturing) checks they did realised I'd placed the MP2451 the wrong way round and they automatically rotated it 180° for me. Phew!

The order was placed in the middle of December and arrived just before New Year - again, about 2 weeks total time end to end. Very impressive. Soldering the ESP32 module on was more fiddly than the ESP-07, but it all worked first time with both 5V + 12V power supplies, so I'm very pleased with the results.

ESP32 IoT PCB

Being able to do cheap PCB assembly is a game changer for me. There are various things I feel confident enough to design for my own use that I'd never be able to solder up myself; and for these prices it's well worth a try. I find myself currently looking at some of the basic STM32 offerings (many of them in JLCPCB's basic component range) and pondering building a slightly more advanced dev board around one. I'm sure my PCB design will cause those I know in the industry to shudder, but don't worry, I've no plans to do this other than for my own amusement!

17 Jan 2020 7:34pm GMT