13 Sep 2024
Planet Lisp
Patrick Stein: I Broke It
Unfortunately, I managed to delete my WordPress database at a time when the most recent backup I had was from 11 years ago.
So… I will hopefully get some newer information uploaded again sometime.
But, most of my content is gone.
13 Sep 2024 3:46pm GMT
Yukari Hafner: Porting SBCL to the Nintendo Switch
For the past two years Charles Zhang and I have been working on getting my game engine, Trial, running on the Nintendo Switch. The primary challenge in doing this is porting the underlying Common Lisp runtime to work on this platform. We knew going into this that it was going to be hard, but it has proven to be quite a bit more tricky than expected. I'd like to outline some of the challenges of the platform here for posterity, though please also understand that due to Nintendo's NDA I can't go into too much detail.
Current Status
I want to start off with where we are at, at the time of writing this article. We managed to port the runtime and compiler to the point where we can compile and execute arbitrary lisp code directly on the Switch. We can also interface with shared libraries, and I've ported a variety of operating system portability libraries that Trial needs to work on the Switch as well.
The above photo shows Trial's REPL example running on the Switch devkit. Trial is setting up the OpenGL context, managing input, allocating shaders, all that good stuff, to get the text shown on screen; the Switch does not offer a terminal of its own.
Unfortunately it also crashes shortly after as SBCL is trying to engage its garbage collector. The Switch has some unique constraints in that regard that we haven't managed to work around quite yet. We also can't output any audio yet, since the C callback mechanism is also broken. And of course, there's potentially a lot of other issues yet to rear their head, especially with regards to performance.
Whatever the case, we've gotten pretty far! This work hasn't been free, however. While I'm fine not paying myself a fair salary, I can't in good conscience have Charles invest so much of his valuable time into this for nothing. So I've been paying him on a monthly basis for all the work he's been doing on this port. Up until now that has cost me ~17'000 USD. As you may or may not know, I'm self-employed. All of my income stems from sales of Kandria and donations from generous supporters on Patreon, GitHub, and Ko-Fi. On a good month this totals about 1'200 USD. On a bad month this totals to about 600 USD. That would be hard to get by in a cheap country, and it's practically impossible in Zürich, Switzerland.
I manage to get by by living with my parents and being relatively frugal with my own personal expenses. Everything I actually earn and more goes back into hiring people like Charles to do cool stuff. Now, I'm ostensibly a game developer by trade, and I am working on a currently unannounced project. Games are very expensive to produce, and I do not have enough reserves to bankroll it anymore. As such, it has become very difficult to decide what to spend my limited resources on, and especially a project like this is much more likely to be axed given that I doubt Kandria sales on the Switch would even recoup the porting costs.
To get to the point: if you think this is a cool project and you would like to help us make the last few hurdles for it to be completed, please consider supporting me on Patreon, GitHub, or Ko-Fi. On Patreon you get news for every new library I release (usually at least one a month) and an exclusive monthly roundup of the current development progress of the unannounced game. Thanks!
An Overview
First, here's what's publicly known about the Switch's environment: user code runs on an ARM64 Cortex-A57 chip with four cores and 4 GB RAM, and on top of a proprietary microkernel operating system that was initially developed for the Nintendo 3Ds.
SBCL already has an ARM64 Linux port, so the code generation side is already solved. Kandria also easily fits into 4GB RAM, so there's no issues there either. The difficulties in the port reside entirely in interfacing with the surrounding proprietary operating system of the switch. The system has some constraints that usual PC operating systems do not have, which are especially problematic for something like Lisp as you'll see in the next section.
Fortunately for us, and this is the reason I even considered a port in the first place, the Switch is also the only console to support the OpenGL graphics library for rendering, which Trial is based upon. Porting Trial itself to another graphics library would be a gigantic effort that I don't intend on undertaking any time soon. The Xbox only supports DirectX, though supposedly there's an OpenGL -> DirectX layer that Microsoft developed, so that might be possible. The Playstation on the other hand apparently still sports a completely proprietary graphics API, so I don't even want to think about porting to that platform.
Anyway, in order to get started developing I had to first get access. I was lucky enough that Nintendo of Europe is fairly accommodating to indies and did grant my request. I then had to buy a devkit, which costs somewhere around 400 USD. The devkit and its SDK only run on Windows, which isn't surprising, but will also be a relevant headache later.
Before we can get on to the difficulties in building SBCL for the Switch, let's first take a look at how SBCL is normally built on a PC.
Building SBCL
SBCL is primarily written in Lisp itself. There is a small C runtime as well, which you use a usual C compiler to compile, but before it can do that, there's some things it needs to know about the operating system environment it compiles for. The runtime also doesn't have a compiler of its own, so it can't compile any Lisp code. In order to get the whole process kicked off, SBCL requires another Lisp implementation to bootstrap with, ideally another version of itself.
The build then proceeds in roughly five phases:
-
build-config
This step just gathers whatever build configuration options you want for your target and spits them out into a readable format for the rest of the build process. -
make-host-1
Now we build the cross-compiler with the host Lisp compiler, and at the same time emit C header files describing Lisp object layouts in memory as C structs for the next step.
-
make-target-1
Next we run the target C compiler to create the C runtime. As mentioned, this uses a standard C compiler, which can itself be a cross-compiler. The C runtime includes the garbage collector and other glue to the operating system environment. This step also produces some constants the target Lisp compiler and runtime needs to know about by using the C compiler to read out relevant operating system headers.
-
make-host-2
With the target runtime built, we build the target Lisp system (compiler and the standard library) using the Lisp cross-compiler built by the Lisp host compiler in
make-host-1
. This step produces a "cold core" that the runtime can jump into, and can be done purely on the host machine. This cold core is not complete, and needs to be executed on the target machine with the target runtime to finish bootstrapping, notably to initialize the object system, which requires runtime compilation. This is done in -
make-target-2
The cold core produced in the last step is loaded into the target runtime, and finishes the bootstrapping procedure to compile and load the rest of the Lisp system. After the Lisp system is loaded into memory, the memory is dumped out into a "warm core", which can be loaded back into memory in a new process with the target runtime. From this point on, you can load new code and dump new images at will.
Notable here is the need to run Lisp code on the target machine itself. We can't cross-compile "purely" on the host, not in the least because user Lisp code cannot be compiled without also being run like batch-compiled C code can, and when it is run it assumes that it is in the target environment. So we really don't have much of a choice in the matter.
In order to deploy an application, we proceed similar to make-target-2
: We compile in Lisp code incrementally and then when we have everything we need we dump out a core with the runtime attached to it. This results in a single binary with a data blob attached.
When the SBCL runtime starts up it looks for a core blob, maps it into memory, marks pages with code in them as executable, and then jumps to the entry function the user designated. This all is a problem for the Switch.
Building for the Switch
The Switch is not a PC environment. It doesn't have a shell, command line, or compiler suite on it to run the build as we usually do. Worse still, its operating system does not allow you to create executable pages, so even if we could run the compilation steps on there we couldn't incrementally compile anything on it like we usually do for Lisp code.
But all is not lost. Most of the code is not platform dependent and can simply be compiled for ARM64 as usual. All we need to do is make sure that anything that touches the surrounding environment in some way knows that we're actually trying to compile for the Switch, then we can use another ARM64 environment like Linux to create our implementation.
With that in mind, here's what our steps look like:
-
build-config
We run this on some host system, using a special flag to indicate that we're building for the Switch. We also enable thefasteval
contrib. We needfasteval
to step in for any place where we would usually invoke the compiler at runtime, since we absolutely cannot do that on the Switch. -
make-host-1
This step doesn't change. We just get different headers that prep for the Switch platform.
-
make-target-1
Now we use the C compiler the Nintendo SDK provides for us, which can cross-compile for the Switch. Unfortunately the OS is not POSIX compliant, so we had to create a custom runtime target in SBCL that stubs out and papers over the operating system environment differences that we care about, like dynamic linking, mapping pages, and so on.
Here is where things get a bit weird. We are now moving on to compiling Lisp code, and we want to do so on a Linux host system. So we have to... -
build-config
(2)We now create a normal ARM64 Linux system with the same feature set as for the Switch. This involves the usual steps as before, though with a special flag to inform some parts of the Lisp process that we're going to ultimately target the Switch.
-
make-host-1
(2) -
make-target-1
(2) -
make-host-2
-
make-target-2
With all of this done we now have a slightly special SBCL build for Linux ARM64. We can now move on to compiling user code.
-
For user code we now perform some tricks to make it think it's running on the Switch, rather than on Linux. In particular we modify
*features*
to include:nx
(the Switch code name) and not:linux
,:unix
, or:posix
. Once that is set up and ASDF has been neutered, we can compile our program (like Trial) "as usual" and at the end dump out a new core.
We've solved the problem of actually compiling the code, but we still need to figure out how to get the code started on the Switch, since it does not allow us to do the usual core-mapping strategy. As such, attaching the new core to the runtime we made for the Switch won't work.
To make this work, we make use of two relatively unknown features of SBCL: immobile-code, and elfination. Usually when SBCL compiles code at runtime, it sticks it into a page somewhere, and marks that page executable. The code itself however could become unneeded at some point, at which point we'd like to garbage collect it. We can then reclaim the space it took up, and to do so compact the rest of the code around it. The immobile-code feature allows SBCL to take up a different strategy, where code is put into special reserved code pages and remains there. This means it can't be garbage collected, but it instead can take advantage of more traditional operating system support. Typically executables have pre-marked sections that the operating system knows to contain code, so it can take care of the mapping when the program is started, rather than the program doing it on its own like SBCL usually does.
OK, so we can generate code and prevent it from being moved. But we still have a core at the end of our build that we now need to transform into the separate code and data sections needed for a typical executable. This is done with the elfination step.
The elfinator looks at a core and performs assembly rewriting to make the code position-independent (a requirement for Address Space Layout Randomisation), and then tears it out into two separate files, a pure code assembly file, and a pure data payload file.
We can now take those two files and link them together with the runtime that the C compiler produced and get a completed SBCL that runs like any other executable would. So here's the last steps of the build process:
-
Run the elfinator to generate the assembly files
-
Link the final binary
-
Run the Nintendo SDK's authoring tools to bundle metadata, shared libraries, assets, and the application binary into one final package
That's quite an involved build setup. Not to mention that we need at least an ARM64 Linux machine to run most of the build on, as well as either an AMD64 Windows machine (or an AMD64 Linux machine with Wine) to run the Nintendo SDK compiler and authoring tools.
I usually use an AMD64 Linux machine, so there's a total of three machines involved: The AMD64 "driver," the ARM64 build host, and a Windows VM to talk to the devkit with.
I wrote a special build system with all sorts of messed up caching and cross-machine synchronisation logic to automate all of this, which was quite a bit of work to get going, especially since the build should also be drivable from an MSYS2/Windows setup. Lots of fun with path mangling!
So now we have a full Lisp system, including user code, compiling for and being able to run on the Switch. Wow! I've skipped over a lot of the nitty-gritty dealing with getting the build properly aware of which target it's building for, making the elfinator and immobile-code working on ARM64, and porting all of the support libraries like pathname-utils, libmixed, cl-gamepad, etc. Again, most of the details we can't openly talk about due to the NDA. However, we have upstreamed what work we could, and all of the Lisp libraries don't have a private fork.
It's worth noting though that elfination wasn't initially designed to produce position independent executable Lisp code, which is usually full of absolute pointers. So we needed to do a lot of work in the SBCL compiler and runtime to support load time relocation of absolute pointers and make sure code objects (which usually contain code constants) no longer have absolute pointers, as the GC can't modify executable sections. Not even the OS loader is allowed to modify executable sections to relocate absolute pointer. We did this by relocating absolute pointers like code constants outside of the text space into a read-writable space close enough to rewrite constant references in code to load from this r/w space instead, which the loader and the moving GC can fixup pointers at.
Instead of interfacing directly with the Nintendo SDK, I've opted to create my own C libraries that have a custom interface the Lisp libraries interface with in order to access the operating system functionality it needs. That way I can at least publish the Lisp bits openly, and only keep the small C library private. Anyway, now that we can run stuff we're not done yet. Our system actually needs to keep running, too, and that brings us to
The Garbage Collector
Garbage collection is a huge topic in itself and there's a ton of different techniques to make it work efficiently. The standard GC for SBCL is called "gencgc", a Generational Garbage Collector. Generational meaning it keeps separate "generations" of objects and scans the generations in different frequencies, copying them over to another generation's location to compact the space. None of this is inherently an issue for the Switch, if it weren't for multithreading.
When multiple threads are involved, we can't just move objects around, as another thread could be accessing it at any time. The easiest way to resolve this conflict is to park all threads before engaging garbage collection. So the question becomes: when a thread wants to start garbage collection, how does it get the other threads to park?
On Unix systems a pretty handy trick is used: we can use the signalling mechanism to send a signal to the other threads, which then take that hint to park.
On the Switch we don't have any signal mechanism. In fact, we can't interrupt threads at all. So we instead need to somehow get each thread to figure out that it should park on its own. The typical strategy for this is called "safepoints".
Essentially we modify the compiler a little bit to inject some extra code that checks whether the thread should park or not. This strategy has some issues, namely:
-
Adding a check isn't free. So we want to check as little as possible
-
If we don't check frequently enough, we are going to stall all the other threads because GC can't begin until they're all parked
-
If we have to inject a lot of instructions for a check, it is going to disrupt CPU cache lines and pipelining optimisations
The current safepoint system in SBCL was written for Windows, which similarly does not have inter-process signal handlers. However, unlike the Switch, it does still have signal handling for the current thread. So the current safepoint implementation was written with this strategy:
Each thread keeps a page around that a safepoint just writes a word to. When GC is engaged, those pages are marked as read-only, so that when the safepoint is hit and the other thread tries to write to the page, a segmentation fault is triggered and the thread can park. This is efficient, since we only need a single instruction to write into the page.
On the Switch we can't use this trick either, so we have to actually insert a more complex check, which can be tricky to get working as intended, as all parallel algorithms tend to be.
Since safepoints aren't necessary on any other platform than Windows, it also hasn't been tested anywhere else, so aside from modifying it for this new platform it's also just unstable. It is apparently quite a big mess in the code base and would ideally be redone from scratch, but hopefully we don't have to go quite that far.
I'd also like to give special mention to the issue that CLOS presents. Usually SBCL defers compilation of the "discriminating function" that is needed to dispatch to methods to the first call of the generic function. This is done because CLOS is highly dynamic and allows adding and removing methods pretty much at any time, and there's usually no good point in time that the system knows it is complete. Of course, on the Switch we can't invoke the compiler, so we can't really do this. For now our strategy has been to instead rely on the fast evaluator. We stub out the compile
function to create a lambda that executes the code via the evaluator instead. This has the advantage of working with any user code that relies on compile
as well, though it is obviously much slower for execution than it would be if we could actually compile.
This neatly brings us to
Future Work
The fasteval trick is mostly a fallback. Ideally I'd like to explore options to freeze as much of CLOS in place as possible right before the final image is dumped and compile as much as possible ahead of time. I'd also like to investigate the block compilation mode that Charles restored some years back more closely.
It's very possible that the Switch's underpowered processor will also force us to implement further optimisations, especially on the side of my engine and the code in Kandria itself. Up until now I've been able to get away with comparatively little optimisation, since even computers of ten years ago are more than fast enough to run what I need for the game. However, I'm not so sure that the Switch could match up to that even if it didn't also introduce additional constraints on performance with its lack of operating system support.
First, though, we need to get the garbage collector running fully. It runs enough to boot up and get into Trial's main loop, but as soon as it hits multi-generation compaction, it falls flat on its face.
Next we need to get callbacks from C working again. Apparently this is a part of the SBCL codebase that can only be described as "a mess," involving lots of hand-rolled assembly routines, which probably need some adjustments to work correctly with immobile-code and elfination. Callbacks fortunately are relatively rare, Trial only needs them for sound playback via libmixed.
There's also been some other issues that we've kept in the back of our heads but don't require our immediate attention, as well as some extra portability features I know I'll have to work on in Trial before its selftest suite fully passes on the Switch.
Conclusion
I'll be sure to add an addendum here should the state of the port significantly change in the future. Some people have also asked me if the work could be made public, or if I'd be willing to share it.
The answer to that is that while I would desperately like to share it all publicly, the NDA prevents us from doing so. We still upstream and publicise whatever we can, but some bits that tie directly into the Nintendo SDK cannot be shared with anyone that hasn't also signed the NDA. So, in the very remote possibility that someone other than me is crazy enough to want to publish a Common Lisp game on the Nintendo Switch, they can reach out to me and I'll happily give them access to our porting work once the NDA has been signed.
Naturally, I'll also keep people updated more closely on what's going on in the monthly updates for Patrons. With that all said, I once again plead with you to consider supporting me on Patreon, GitHub, or Ko-Fi. All the income from these will, for the foreseeable future, be going towards funding the SBCL port to the Switch as well as the current game project.
Thank you as always for reading, and I hope to share more exciting news with you soon!
13 Sep 2024 9:03am GMT
05 Sep 2024
Planet Lisp
Scott L. Burson: Equality and Comparison in FSet
This post is somewhat prompted by a recent blog post by vindarel, about Common Lisp's various built-in equality predicates. It is aleo related to Marco Antoniotti's CDR 8, Generic Equality and Comparison for Common Lisp, implemented by Charles Zhang; Alex Gutev's GENERIC-CL; and Henry Baker's well-known 1992 paper on equality.
Let me start by summarizing those designs. CDR 8 proposes a generic equaity function equals, and a comparison function compare. These are both CLOS generic functions intended to be user-extended, though they also have some predefined methods. equals has several keyword parameters controlling its exact behavior. One of these is case-sensitive, which controls string comparison. Another is recursive, which controls its behavior on conses; if recursive is false (the default), conses are compared by eq, but if it's true, a tree comparison is done. compare is specified to return one of the symbols <, >, =, or /= to indicate the relative order of its arguments; it also has keyword parameters such as case-sensitive and recursive.
GENERIC-CL replaces many CL operations with CLOS generic functions, and also adds new ones. It touches many parts of the language other than equality and comparison, but I'll leave those aside for now. It has two generic equality functions: equalp, which, notwithstanding the name, is case-sensitive for characters and strings, and likep, which is case-insensitive. It also has comparison predicates lessp etc., along with a compare function (implemented using lessp) that can return :less, :equal, or :greater.
Henry's paper makes some interesting arguments about how a Common Lisp equality predicate should behave; he makes these concrete by defining a novel predicate egal. His most salient point, for my purposes, is that mutable objects, including vectors and conses, should always be compared with eq. I will argue below that FSet adheres to the spirit of this desideratum even though not to its letter.
FSet advertises itself as a "set-theoretic" collections library, and as such, requires a well-defined notion of equality. Also, since it is implemented using balanced binary trees, it requires an ordering function. FSet defines a generic function compare with these properties:
- It returns one of the symbols :less, :equal, :greater, or :unequal (:unequal is used in certain rare cases of values which are not equal but cannot be consistently ordered)
- It implements a strict weak ordering, with an additional constraint: along with incomparability (indicated by either :equal or :unequal) being transitive, equality is also transitive by itself
- It can compare any two Lisp objects; this is an element of FSet's design philosophy
- Being a generic function, it is of course user-extensible
FSet's equality predicate is equal?, which simply calls compare and checks that the result is :equal. Thus, the only step required to add a user-defined type to the FSet universe is to define a compare method for it. FSet provides a few utilities to help with this, which I'll go into below.
The cases in which compare returns :unequal to indicate unequal-but-incomparable arguments include:
- Numbers of equal value but different types; that is, = would return true on them, but eql would return false. Example: the integer 1 and the float 1.0.
- Distinct uninterned symbols (symbols whose symbol-package is nil) whose symbol-names are equal (by string=).
- Objects of a type for which no specific compare method has been defined, and which are distinct according to eql.
- If you create a package, rename it, then create a new package reusing the original name of the first package, the two packages compare :unequal. (FSet holds on to the original name, to protect itself from the effects of rename-package, which could otherwise be distastrous.) Also, two symbols with the same name, one from the old package and one from the new, also compare :unequal.
- Aggregates which are being compared component-wise, in the case where none of the component-wise comparisons returns :less or :greater, and at least one of them returns :unequal.
If compare's default method is called with objects of different classes, it returns a result based solely on the classes; the contents of the objects are not examined. Again, it is part of FSet's design philosophy to give you as much freedom as reasonably possible; this includes allowing you to have sets containing more than one kind of object.
(In general, FSet's built-in ordering has been chosen for performance, not for its likely usefulness to clients. For example, compare on two strings of different lengths orders the shorter one first, ignoring the contents, because this requires only an O(1) operation.)
Comparison with equal
FSet's equal? on built-in CL types behaves almost identically to CL's equal, with the one difference that on vectors (other than bit-vectors), equal just calls eq, but equal? compares the contents. (I just noticed that this is not true for multidimensional arrays, and have filed an FSet bug.) (On bit-vectors, they both compare the contents.)
Comparison with CDR 8
There are noticeable similarities between FSet and the CDR 8 proposal; the latter not only includes a comparison function, but even provides for it to return /=, corresponding to FSet's :unequal, to indicate unequal but incomparable arguments. But the idea that the behavior of equality and comparison could be modified via keyword parameters does not seem appropriate for FSet. I think it would make FSet quite a bit harder to use, for little gain. For example, FSet comparison on lists walks the lists, but CDR 8, by default, just calls eq on their heads; users would have to remember to pass :recursive t to get the behavior they probably expect. FSet collections would have to remember which options they were created with, and if you tried, say, to take the union of two sets which used different options, you'd get an error.
Years of programming experience - not only with FSet but also with Refine, the little-known proprietary language that inspired FSet - have left me with the clear impression that having a single global equality predicate is a great simplification and very rarely limiting, provided it was defined properly to begin with.
I also note that FSet has more predefined methods for its comparison function (and therefore for its equality predicate) than are proposed in CDR 8. In particular, CDR 8's default compare methods return /= in more cases (e.g. distinct symbols), which is not terribly useful, in my view; FSet tries to minimize its use of :unequal because its data structure code, in that case, has to fall back to using alists, which have much poorer time complexity than binary trees. (OTOH, Marco seems to have overlooked the other cases listed above that arguably should be treated as unequal but incomparable.)
Comparison with GENERIC-CL
Again, there are noticeable similarities between FSet's and GENERIC-CL's equality predicates and comparison functions. GENERIC-CL does have two different equality predicates, equalp and likep, but these have no parameters other than the objects to be compared; it does not follow the CDR 8 suggestion of specifying keyword parameters that alter their behavior. Its equalp is very similar to FSet's equal?, but not quite identical; one difference is that it returns true when called on the integer 1 and the float 1.0, where both fset:equal? and cl:equal return false.
That normally-minor discrepancy is related to a larger deficiency: GENERIC-CL's comparison operator has no defined return value corresponding to :unequal, to indicate unequal-but-incomparable arguments. That is, FSet and CDR 8 both recognize that comparison can't implement a total ordering over all possible pairs of objects, but GENERIC-CL overlooks this point.
There are other overlaps between FSet and GENERIC-CL, but I'll save an analysis of those for another time.
Comparison with EGAL
Henry is proposing an extension to Common Lisp, not an operator that can be written in portable CL. This shows up in two ways: first, some of his sample code implementing egal requires access to implementation internals; second, he proposes a distinction between mutable and immutable vectors and strings that does not exist in CL. The text also suggests adding an immutable cons type to CL, though the sample code doesn't mention this case.
I agree with Henry in principle: a mutable cons (or string, or vector) is a very different beast from an immutable one; as he puts it, "eq is correct for mutable cons cells and equal is correct for immutable cons cells". CL would have been a better language, in principle, had conses been immutable, and immutable strings and vectors been available (if perhaps not the default). But here I must invoke one of my favorite quips: "The difference between theory and practice is never great in theory, but in practice it can be very great indeed." The key design goal of CL, to unify the Lisp community by providing a language into which existing programs in various Lisp dialects could easily be ported, demanded that conses remain mutable by default. Adding immutable versions of these types was not, to my knowledge, a priority.
And as Henry himself points out, in the overwhelmingly most common usage pattern for these types, they are treated as immutable once fully constructed. For example, a common idiom is for a function to build a list in reverse order, then pass it through nreverse before returning it; at that point, it is fully constructed, and it won't be modified thereafter. Obviously, this is a generalization over real-world Lisp programs and won't always be true, but since Lisp encourages sharing of structure, I think Lisp programmers learn pretty early that they have to be very careful when mutating a list or string or vector that they can't easily prove they're holding the only pointer to (normally by virtue of having just created it). Given that this is pretty close to being true in practice, and that comparing these aggregates by their contents is usually what people want to do when they use them as members of collections, it would seem odd for FSet to distinguish them by identity.
Also, there's the simple fact that for these built-in types, CL provides no portable way to order or hash them by identity. Such functionality must exist internally for use by eq and eql hash tables, but the language does not expose any portable interface to it.
So in this case, both programming convenience and the hard constraints of implementability force a choice that is contrary to theoretical purity: FSet must compare these types by their contents. The catch, of course, is that one must be careful, once having used a list or string or vector as an element of an FSet collection, never to modify it, lest one break the collection's ordering invariant. But in practice, this rule doesn't seem at all onerous: if you found the object in the heap somewhere - as opposed to having just created it- don't mutate it.
When it comes to user-defined types, however, the situation is quite different. It is easy for the programmer, defining a class intended for mutation, to arrange for FSet to distinguish objects of the class by their identity rather than their contents. The recommended way to do this is to include a serial-number slot that is initialized, at object-creation time, to the next value from an integer sequence; then write a compare method that uses this slot. (I'll show some examples shortly.)
So if the design of your program involves some pieces of mutable state that are placed in collections, my strong recommendation is that such state should never be implemented as a bare list or string or vector, but should always be wrapped in an instance of a user-defined class. I believe this to be a good design principle in general, even when FSet is not involved, but it becomes imperative for programs using FSet.
Adding Support for User-Defined Classes
When adding FSet support for a user-defined class, the first question is whether instances of the class represent mutable objects or mathematical values. If it's a mathematical value, it should be treated as immutable once constructed. (Alas, CL provides no way to enforce immutability.) In that case, it should be compared by content. FSet provides a convenient macro compare-slots for this purpose. Here's an example:
This specifies that frobs shall be ordered first by position, then by color. compare-slots handles the details for you, including the complications that arise if one of the slot value comparisons returns :unequal.
For standard classes, best performance is obtained by supplying slot names as quoted symbols rather than function-quoted accessor names:
I am not sure whether to recommend the use of slot names for structure classes; the answer may depend on the implementation. At least on SBCL, you're probably better off using accessor functions for structs.
(Actually, the functions supplied don't have to be accessors; you could compare by some computed value instead, if you wanted. I haven't seen a use for this possibility in practice, though.)
Structure classes implementing mutable objects should do something like this:
For standard classes implementing mutable objects, FSet provides an especially convenient solution: just include identity-ordering-mixin as a superclass:
That's it!
More on FSet's Single Global Ordering
I sometimes get pushback, albeit mostly from people who haven't actually used FSet, about my design decision to have a single global ordering implemented by compare, rather than allowing collections to accept an ordering function when they are created. Let me defend this decision a little bit.
Because the ordering is extensible by defining new methods on compare, a programmer can always force a non-default ordering by defining a wrapper type. For example, if you have a map whose keys are strings and which you want to be maintained in lexicographic order, you can easily write a structure class to wrap the strings, and give that class a compare method that forces the strings to be compared lexicographically. (FSet even helps you out by providing a generic function compare-lexicographically, which you can just call.)
That said, I believe the need to write wrapper classes arises very rarely. It's needed only when there is a reason that a set or map needs to be continually maintained in the non-default order. If the non-default ordering is needed only occasionally - say, when the collection is being printed - it's usually easier to convert it to a list at that point (see FSet's generic function convert, about which I should write another blog post) and then just call sort or stable-sort on it.
And there is a wonderful simplicity to having the ordering be global. Ease of use is a very important design goal for FSet; collection-specific orderings would give the user another wrinkle to think about. I just don't see that the benefits, which seem to me very small, would outweigh the cost in cognitive load.
Perhaps the best way to put it is that FSet is primarily intended for application programming, not systems programming. The distinction is fuzzy, but broadly, if programmer productivity is more important to you than squeezing out the last few percent of performance, you're doing application programming, not systems programming. This is not necessarily a distinction about the kind of program being written - there certainly are applications that have performance-sensitive parts - but rather, about the amount of knowledge, experience, and mental effort required to write it. FSet is designed for general productivity, not necessarily for someone who needs maximal control to achieve maximal performance.
05 Sep 2024 6:58am GMT