25 Sep 2017

feedPlanet Openmoko

Holger "zecke" Freyther: Brain dump what fascinates me

A small brain dump of topics that currently fascinate me. These are mostly pointers and maybe it is interesting to follow it.

Books/Reading:

My kobo ebook reader has the Site Reliability Engineering book and I am now mostly done. It is kind of a revelation and explains my interest to write code but also to operate infrastructure (like struggling with ruby, rmagick, nginx…). I am interested in backends since… well ever. The first time I noticed it when we talked about Kolab at LinuxTag and I was more interested in the backend than the KDE client. At sysmocom we built an IoT product and the backend was quite some fun, especially the scale of one instance and many devices/users, capacity planning and disk commissioning, lossless upgrades.

It can be seen in my non FOSS SS7 map work on traffic masquerading and transparent rewriting. It is also clear to see which part of engineering is needed for scale (instead of just installing and restarting servers).

Lang VM design

One technology that made Java fast (Hotspot) and has seen its way into JavaScript is dynamic optimization. Most Just in Time Compilers start with generating native code per method, either directly or after the first couple of calls when the methods size is significant enough. The VM records which call paths are hot, which types are used and then can generate optimized code (e.g. specialized for integers, remove type checks). A technique pioneered at Sun for the "Self" language (and then implemented for Strongtalk and then brought to Java) was "adaptive optimization and deoptimization" and was the Phd topic of Urs Hoelzle (Google's VP of Engineering). One of the key aspects is inlining across method boundaries as this removes method look-up, call stack handling and opens the way for code optimization across method boundaries (at the cost of RAM usage).

In OpenJDK, V8 and JavaScriptCore this adaptive optimization is typically implemented in C++ and requires quite some code. The code is complicated as it needs to optimize but also need to return to a basic function (deoptimize, e.g. if a method changed or the types passed don't match anymore), e.g. in the middle of a for loop with tons of inlined code (think of Array.map being inlined but then need to be de-inlined). A nice and long blog post of JSC can be found here describing the On Stack Replacement (OSR).

Long introduction and now to the new thing. In the OpensmalltalkVM a new approach called Sista has been picked and I find it is genius. Like with many problems the point of view and approach really matters. Instead of writing a lot of code in the VM the optimizer runs next to the application code. The key parts seem to be:

The revelation is the last part. By just optimizing from bytecode to bytecode the VM remains in charge of creating and managing machine code. The next part is that tooling in the higher language is better or at least the roundtrip is more quick (edit code and just compile the new method instead of running make, c++, ld). The productivity thanks to the abstraction and tooling is likely higher.

As last part the OSR is easier as well. In Smalltalk thisContext (the current stack frame, activation record) is an object as well. At the right point (when the JIT has either written back variables from register to the stack or at least knows where the value is) one can just manipulate thisContext, create and link news ones and then resume execution without all the magic in other VMs.

Go, Go and escape analysis

Ken Thompson and Robert Pike are well known persons and their Go programming language is a very interesting system programming language. Like with all new languages I try to get real world experience with the language, the tooling and which kind of problems can be solved with it. I have debugged and patched some bigger projects and written two small applications with it.

There is plenty I like. The escape analysis of the compiler is fun (especially now that I know it was translated from the Plan9 C compiler from C to Go), the concurrency model is good (though allowing shared state), the module system makes sense (but makes forking harder than necessary), being able to cross compile to any target from any system.

Knowing a bit of Erlang (and continuing to read the Phd Thesis of Joe Armstrong) and being a heavy Smalltalk user there are plenty of things missing. It starts with vague runtime error messages (e.g. panicslice not having parameters) and goes to runtime and post-runtime inspection. In Smalltalk thanks to the abstraction a lot of hard things are easy and I would have wished for some of them to be in Go. Serialize all unrecovered panics? Debugging someone else's code seems like pre 1980…

So for many developers Go is a big improvement but for some people with a wider view it might look like a lost opportunity. But that can only be felt by developers that have experienced higher abstraction and productivity.

Unsupervised machine learning

but that is for another dump…

25 Sep 2017 10:11am GMT

02 Sep 2017

feedPlanet Openmoko

Harald "LaF0rge" Welte: Purism Librem 5 campaign

There's a new project currently undergoing crowd funding that might be of interest to the former Openmoko community: The Purism Librem 5 campaign.

Similar to Openmoko a decade ago, they are aiming to build a FOSS based smartphone built on GNU/Linux without any proprietary drivers/blobs on the application processor, from bootloader to userspace.

Furthermore (just like Openmoko) the baseband processor is fully isolated, with no shared memory and with the Linux-running application processor being in full control.

They go beyond what we wanted to do at Openmoko in offering hardware kill switches for camera/phone/baseband/bluetooth. During Openmoko days we assumed it is sufficient to simply control all those bits from the trusted Linux domain, but of course once that might be compromised, a physical kill switch provides a completely different level of security.

I wish them all the best, and hope they can leave a better track record than Openmoko. Sure, we sold some thousands of phones, but the company quickly died, and the state of software was far from end-user-ready. I think the primary obstacles/complexities are verification of the hardware design as well as the software stack all the way up to the UI.

The budget of ~ 1.5 million seems extremely tight from my point of view, but then I have no information about how much Puri.sm is able to invest from other sources outside of the campaign.

If you're a FOSS developer with a strong interest in a Free/Open privacy-first smartphone, please note that they have several job openings, from Kernel Developer to OS Developer to UI Developer. I'd love to see some talents at work in that area.

It's a bit of a pity that almost all of the actual technical details are unspecified at this point (except RAM/flash/main-cpu). No details on the cellular modem/chipset used, no details on the camera, neither on the bluetooth chipset, wifi chipset, etc. This might be an indication of the early stage of their plannings. I would have expected that one has ironed out those questions before looking for funding - but then, it's their campaign and they can run it as they see it fit!

I for my part have just put in a pledge for one phone. Let's see what will come of it. In case you feel motivated by this post to join in: Please keep in mind that any crowdfunding campaign bears significant financial risks. So please make sure you made up your mind and don't blame my blog post for luring you into spending money :)

02 Sep 2017 10:00pm GMT

01 Sep 2017

feedPlanet Openmoko

Harald "LaF0rge" Welte: First actual XMOS / XCORE project

For many years I've been fascinated by the XMOS XCore architecture. It offers a surprisingly refreshing alternative virtually any other classic microcontroller architectures out there. However, despite reading a lot about it years ago, being fascinated by it, and even giving a short informal presentation about it once, I've so far never used it. Too much "real" work imposes a high barrier to spending time learning about new architectures, languages, toolchains and the like.

Introduction into XCore

Rather than having lots of fixed-purpose built-in "hard core" peripherals for interfaces such as SPI, I2C, I2S, etc. the XCore controllers have a combination of

  • I/O ports for 1/4/8/16/32 bit wide signals, with SERDES, FIFO, hardware strobe generation, etc
  • Clock blocks for using/dividing internal or external clocks
  • hardware multi-threading that presents 8 logical threads on each core
  • xCONNECT links that can be used to connect multiple processors over 2 or 5 wires per direction
  • channels as a means of communication (similar to sockets) between threads, whether on the same xCORE or a remote core via xCONNECT
  • an extended C (xC) programming language to make use of parallelism, channels and the I/O ports

In spirit, it is like a 21st century implementation of some of the concepts established first with Transputers.

My main interest in xMOS has been the flexibility that you get in implementing not-so-standard electronics interfaces. For regular I2C, UART, SPI, etc. there is of course no such need. But every so often one encounters some interface that's very rately found (like the output of an E1/T1 Line Interface Unit).

Also, quite often I run into use cases where it's simply impossible to find a microcontroller with a sufficient number of the related peripherals built-in. Try finding a microcontroller with 8 UARTs, for example. Or one with four different PCM/I2S interfaces, which all can run in different clock domains.

The existing options of solving such problems basically boil down to either implementing it in hard-wired logic (unrealistic, complex, expensive) or going to programmable logic with CPLD or FPGAs. While the latter is certainly also quite interesting, the learning curve is steep, the tools anything but easy to use and the synthesising time (and thus development cycles) long. Furthermore, your board design will be more complex as you have that FPGA/CPLD and a microcontroller, need to interface the two, etc (yes, in high-end use cases there's the Zynq, but I'm thinking of several orders of magnitude less complex designs).

Of course one can also take a "pure software" approach and go for high-speed bit-banging. There are some ARM SoCs that can toggle their pins. People have reported rates like 14 MHz being possible on a Raspberry Pi. However, when running a general-purpose OS in parallel, this kind of speed is hard to do reliably over long term, and the related software implementations are going to be anything but nice to write.

So the XCore is looking like a nice alternative for a lot of those use cases. Where you want a microcontroller with more programmability in terms of its I/O capabilities, but not go as far as to go full-on with FPGA/CPLD development in Verilog or VHDL.

My current use case

My current use case is to implement a board that can accept four independent PCM inputs (all in slave mode, i.e. clock provided by external master) and present them via USB to a host PC. The final goal is to have a board that can be combined with the sysmoQMOD and which can interface the PCM audio of four cellular modems concurrently.

While XMOS is quite strong in the Audio field and you can find existing examples and app notes for I2S and S/PDIF, I couldn't find any existing code for a PCM slave of the given requirements (short frame sync, 8kHz sample rate, 16bit samples, 2.048 MHz bit clock, MSB first).

I wanted to get a feeling how well one can implement the related PCM slave. In order to test the slave, I decided to develop the matching PCM master and run the two against each other. Despite having never written any code for XMOS before, nor having used any of the toolchain, I was able to implement the PCM master and PCM slave within something like ~6 hours, including simulation and verification. Sure, one can certainly do that in much less time, but only once you're familiar with the tools, programming environment, language, etc. I think it's not bad.

The biggest problem was that the clock phase for a clocked output port cannot be configured, i.e. the XCore insists on always clocking out a new bit at the falling edge, while my use case of course required the opposite: Clocking oout new signals at the rising edge. I had to use a second clock block to generate the inverted clock in order to achieve that goal.

Beyond that 4xPCM use case, I also have other ideas like finally putting the osmo-e1-xcvr to use by combining it with an XMOS device to build a portable E1-to-USB adapter. I have no clue if and when I'll find time for that, but if somebody wants to join in: Let me know!

The good parts

Documentation excellent

I found the various pieces of documentation extremely useful and very well written.

Fast progress

I was able to make fast progress in solving the first task using the XMOS / Xcore approach.

Soft Cores developed in public, with commit log

You can find plenty of soft cores that XMOS has been developing on github at https://github.com/xcore, including the full commit history.

This type of development is a big improvement over what most vendors of smaller microcontrollers like Atmel are doing (infrequent tar-ball code-drops without commit history). And in the case of the classic uC vendors, we're talking about drivers only. In the XMOS case it's about the entire logic of the peripheral!

You can for example see that for their I2C core, the very active commit history goes back to January 2011.

xSIM simulation extremely helpful

The xTIMEcomposer IDE (based on Eclipse) contains extensive tracing support and an extensible near cycle accurate simulator (xSIM). I've implemented a PCM mater and PCM slave in xC and was able to simulate the program while looking at the waveforms of the logic signals between those two.

The bad parts

Unfortunately, my extremely enthusiastic reception of XMOS has suffered quite a bit over time. Let me explain why:

Hard to get XCore chips

While the product portfolio on on the xMOS website looks extremely comprehensive, the vast majority of the parts is not available from stock at distributors. You won't even get samples, and lead times are 12 weeks (!). If you check at digikey, they have listed a total of 302 different XMOS controllers, but only 35 of them are in stock. USB capable are 15. With other distributors like Farnell it's even worse.

I've seen this with other semiconductor vendors before, but never to such a large extent. Sure, some packages/configurations are not standard products, but having only 11% of the portfolio actually available is pretty bad.

In such situations, where it's difficult to convince distributors to stock parts, it would be a good idea for XMOS to stock parts themselves and provide samples / low quantities directly. Not everyone is able to order large trays and/or capable to wait 12 weeks, especially during the R&D phase of a board.

Extremely limited number of single-bit ports

In the smaller / lower pin-count parts, like the XU[F]-208 series in QFN/LQFP-64, the number of usable, exposed single-bit ports is ridiculously low. Out of the total 33 I/O lines available, only 7 can be used as single-bit I/O ports. All other lines can only be used for 4-, 8-, or 16-bit ports. If you're dealing primarily with serial interfaces like I2C, SPI, I2S, UART/USART and the like, those parallel ports are of no use, and you have to go for a mechanically much larger part (like XU[F]-216 in TQFP-128) in order to have a decent number of single-bit ports exposed. Those parts also come with twice the number of cores, memory, etc- which you don't need for slow-speed serial interfaces...

Insufficient number of exposed xLINKs

The smaller parts like XU[F]-208 only have one xLINK exposed. Of what use is that? If you don't have at least two links available, you cannot daisy-chain them to each other, let alone build more complex structures like cubes (at least 3 xLINKs).

So once again you have to go to much larger packages, where you will not use 80% of the pins or resources, just to get the required number of xLINKs for interconnection.

Change to a non-FOSS License

XMOS deserved a lot of praise for releasing all their soft IP cores as Free / Open Source Software on github at https://github.com/xcore. The License has basically been a 3-clause BSD license. This was a good move, as it meant that anyone could create derivative versions, whether proprietary or FOSS, and there would be virtually no license incompatibilities with whatever code people wanted to write.

However, to my very big disappointment, more recently XMOS seems to have changed their policy on this. New soft cores (released at https://github.com/xmos as opposed to the old https://github.com/xcore) are made available under a non-free license. This license is nothing like BSD 3-clause license or any other Free Software or Open Source license. It restricts the license to use the code together with an XMOS product, requires the user to contribute fixes back to XMOS and contains references to importand export control. This license is incopatible with probably any FOSS license in existance, making it impossible to write FOSS code on XMOS while using any of the new soft cores released by XMOS.

But even beyond that license change, not even all code is provided in source code format anymore. The new USB library (lib_usb) is provided as binary-only library, for example.

If you know anyone at XMOS management or XMOS legal with whom I could raise this topic of license change when transitioning from older sc_* software to later lib_* code, I would appreciate this a lot.

Proprietary Compiler

While a lot of the toolchain and IDE is based on open source (Eclipse, LLVM, ...), the actual xC compiler is proprietary.

Further Reading

01 Sep 2017 10:00pm GMT