03 Jun 2020

Planet Debian

Keith Packard: picolibc-ryu

Float/String Conversion in Picolibc: Enter "Ryū"

I recently wrote about this topic having concluded that the best route for now was to use the malloc-free, but imprecise, conversion routines in the tinystdio alternative.

A few days later, Sreepathi Pai pointed me at some very recent work in this area:

This is amazing! Thirty years after the papers referenced in the previous post, Ulf Adams came up with some really cool ideas and managed to reduce the math required for 64-bit conversion to 128-bit integer arithmetic. This is a huge leap forward; we were doing long multi-precision computations before, and now it's all short enough to fit in registers (ok, a lot of registers, but still).

Getting the Ryū Code

The code is available on github: https://github.com/ulfjack/ryu. Reading through it, it's very clear that the author focuses on performance with lots of tuning for common cases. Still, it's quite readable, especially compared with the newlib multi-precision based code.

Picolibc String/Float conversion interface

Picolibc has some pretty basic needs for the float/string conversion code; it wants four functions:

  1. __dtoa_engine

    __dtoa_engine(double x, struct dtoa *dtoa, uint8_t max_digits, uint8_t max_decimals);

    This converts the double x to a string of decimal digits and a decimal exponent stored inside the 'dtoa' struct. It limits the total number of digits to max_digits and, optionally (when max_decimals is non-zero), limits the number of fractional digits to max_decimals - 1. The latter supports 'f' formats. It returns the number of digits stored, which is <= max_digits, and is smaller when the number can be accurately represented in fewer digits.

  2. __ftoa_engine

    __ftoa_engine(float x, struct ftoa *ftoa, uint8_t max_digits, uint8_t max_decimals);

    The same as __dtoa_engine, except for floats.

  3. __atod_engine

    __atod_engine(uint64_t m10, int e10);

    To avoid needing to handle stdio inside the conversion function, __atod_engine receives fully parsed values, the base-10 significand (m10) and exponent (e10). The value to convert is m10 * pow(10, e10).

  4. __atof_engine

    __atof_engine(uint32_t m10, int e10);

    The same as __atod_engine, except for floats.

With these, it can do printf, scanf, ecvt, fcvt, gcvt, strtod, strtof and atof.

Porting Ryū to Picolibc

The existing Ryū float-to-string code always generates the number of digits necessary for accurate output. I had to hack it up to generate correctly rounded shorter output when max_digits or max_decimals were smaller. I'm not sure I managed to do that correctly, but at least it appears to be passing all of the test cases I have. In normal operation, Ryū iteratively removes digits from the answer that aren't necessary to disambiguate with neighboring values.

What I changed was to keep removing digits using that method until the answer had few enough digits to fit in the desired length. There's some tricky rounding code that adjusts the final result and I had to bypass that if I'd removed extra digits.
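The general idea of dropping digits until the result fits can be sketched on a plain decimal significand. This is not the modified Ryū code: the disambiguation against neighboring values is omitted entirely, the function names are hypothetical, and round-half-to-even is assumed as the rounding rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count the decimal digits of v (zero counts as one digit). */
int decimal_length(uint64_t v)
{
    int n = 1;

    while (v >= 10) {
        v /= 10;
        n++;
    }
    return n;
}

/*
 * Hypothetical sketch: shorten *m10 to at most max_digits digits,
 * bumping *e10 for each digit removed so the value stays close to
 * m10 * 10^e10. Rounds half to even based on the removed digits.
 * Returns the number of digits left.
 */
int shorten_decimal(uint64_t *m10, int *e10, int max_digits)
{
    uint64_t m = *m10;
    int e = *e10;
    int len = decimal_length(m);
    int last_removed = 0;   /* most recently removed digit */
    bool sticky = false;    /* any earlier removed digit was non-zero */

    assert(max_digits > 0);

    while (len > max_digits) {
        sticky = sticky || (last_removed != 0);
        last_removed = (int)(m % 10);
        m /= 10;
        e++;
        len--;
    }

    /* Round up when above the halfway point, or exactly halfway and odd. */
    if (last_removed > 5 ||
        (last_removed == 5 && (sticky || (m & 1) != 0))) {
        m++;
        /* A carry such as 999 -> 1000 adds a digit; drop the new zero. */
        if (decimal_length(m) > max_digits) {
            m /= 10;
            e++;
        }
    }

    *m10 = m;
    *e10 = e;
    return decimal_length(m);
}
```

For example, shortening 9996 (with exponent 0) to three digits rounds up, carries, and ends as 100 with exponent 2.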

That was about the only change necessary to the core algorithm. I also trimmed the code to only include the general case and not the performance improvements, then wrapped it with code to provide the _engine interface.

On the string-to-float side, most of what I needed to do was remove the string parsing bits at the start of the function and switch from performance-optimized to space-optimized versions of a couple of internal routines.

Correctness Results

Because these new functions are now 'exact', I was able to adjust the picolibc tests to compare all of the bits for string/float conversion instead of having to permit a bit of slop in the answers. With those changes, the picolibc test suite passes, which offers some assurance that things aren't completely broken.

Size Results

Snek uses the 32-bit float versions of the conversion routines, and for that, the size difference is:

   text    data     bss     dec     hex filename
  59068      44   37968   97080   17b38 snek-qemu-riscv-orig.elf
  59430      44   37968   97442   17ca2 snek-qemu-riscv-ryu.elf

362 bytes added to gain accurate printf/strtof results seems like a good trade-off in this case.


I haven't measured performance at all, but I suspect that it won't be nearly as problematic on most platforms as the source code makes it appear. And that's because Ryū is entirely integer arithmetic with no floating point at all. This avoids using the soft fp code for platforms without hardware float support.

Pointers to the Code

I haven't merged this into picolibc master yet; it's on the ryu branch.

Review, especially of the hack above to return short results, would be greatly appreciated!

Thanks again to Ulf Adams for creating this code and to Sreepathi Pai for sending me a note about it!

03 Jun 2020 1:33am GMT

Dima Kogan: vnlog now functional on *BSD and OSX

So somebody finally bugged me about supporting vnlog tools on OSX. I was pretty sure that between all the redirection, process communication, and file descriptor management something was Linux-specific, but apparently not: everything just works. There were a few uninteresting issues with tool paths, core tool and linker flags and so on, but it was all really minor. I have a report that the test suite passes on OSX, and I verified it on FreeBSD.

I made a new 1.28 release tag, but it exists mostly for the benefit of any OSX or *BSD people who'd want to make a package for their system. Progress!

03 Jun 2020 1:20am GMT

02 Jun 2020

Olivier Berger: Mixing NRELab’s Antidote and Eclipse Che on the same k8s cluster

You may have heard of my search for Cloud solutions to run labs in an academic context, with a focus on free and open source solutions. You may read previous installments of this blog or, for a shorter version, check the presentation I recorded last week.

Over the last month, I've become quite interested in two projects: NRELab's Antidote and Eclipse Che.

Antidote is the software that powers NRELabs, a labs platform for learning network automation, which runs on top of Kubernetes (k8s). The interesting thing is that for each learner, there can be a dedicated k8s namespace with multiple virtual nodes running on a separate network. This can be used in the context of virtual classes/labs where our students will perform network labs in parallel on the same cluster.

Eclipse Che powers Eclipse "on the Cloud", making available software development environments, for developers, on a Kubernetes Cloud. Developers typically work from a Web page instead of installing local development tools.

Both projects seem quite complementary. For one, we teach both networking and software development, so the combination would naturally appeal to many professors.

Furthermore, Eclipse Che provides a few features that Antidote is lacking: authenticating users (with keycloak), and persisting their work in workspaces between work sessions. That is typically what we need in our academic context, where students work on the same labs week after week, during scheduled classes or off-hours.

Thus it would be great to have more integration between the two environments.

I intend to work on that front, but that takes time, as running stuff on Kubernetes isn't exactly trivial, at least when you're like me and want to use a "vanilla" Kubernetes.

I've mainly relied on running k8s inside VMs using Vagrant and/or minikube so far.

A first milestone I've achieved is making sure that Antidote and Eclipse Che aren't incompatible. Antidote's "selfmedicate" script was actually running inside a Vagrant VM, where I had difficulties installing Eclipse Che (probably because of old software, or particular networking setup details). I've overcome this hurdle, as I'm now able to install both environments on a single Kubernetes VM (using my own Vagrant setup).

Running Eclipse Che (alongside Antidote) on a k8s Vagrant VM.

This proves only that there's no show stopper there, but a lot of work remains.

Stay tuned.

02 Jun 2020 10:06pm GMT