13 Apr 2026

feedPlanet Lisp

Scott L. Burson: FSet v2.4.2: CHAMP Bags, and v1.0 of my FSet book!

A couple of weeks ago I released FSet 2.4.0, which brought a CHAMP implementation of bags, filling out the suite of CHAMP types. 🚀 FSet users should have a look at the release page, as it also contained a number of bug fixes and minor changes.

I've since released v2.4.1 and v2.4.2, with some more bug fixes.

But the big news is the book! It brings together all the introductory material I have written, plus a lot more, along with a complete API Reference chapter.

FSet is now in the state I decided last summer I wanted to get it into: faster, better tested and debugged, more feature-complete, and much better documented than it has ever been in its nearly two decades of existence. I am, of course, very much hoping that these months of work have made the library more interesting and accessible to CL programmers who haven't tried it yet. I am even hoping that its existence helps attract newcomers to the CL community. Time will tell!

13 Apr 2026 6:21am GMT

12 Apr 2026

feedPlanet Lisp

Wimpie Nortje: Dependency hell revisited, updating my Qlot workflow.

I wrote on this topic before but the landscape has changed a lot since then.

Skip to the new Qlot workflow.

When you work on projects that become even slightly complex it is a matter of time until you run into problems where the specific version of a particular library becomes important. This happens in most, if not all, programming languages.

In the Common Lisp environment Quicklisp has become the de facto standard for loading libraries, including fetching and loading their dependencies. Quicklisp distributes libraries in "distributions" which are point-in-time snapshots of all the known and working libraries at the time of distribution creation.

An advantage of this approach is that you are guaranteed that all libraries available in the distribution can be loaded with any of the others. Some disadvantages are that 1) if a library was included in an older distribution but no longer loads cleanly, it gets removed from the distribution, and 2) libraries are only added or updated when a new distribution is cut.

Though libraries can be put in ~/quicklisp/local-projects/ in order to supplement or override those in the distribution, Quicklisp does not provide any mechanism for pinning the state of ~/quicklisp/local-projects/.

Some Quicklisp attributes:

  1. All libraries in a distribution can be loaded with any of the others. Those that no longer do are removed from the distribution.
  2. Libraries are only added or updated at distribution cutting time.
  3. You specify a single distribution version, not individual library versions. The distribution is used globally, across all projects and across all lisp implementations.
  4. Libraries can be loaded from ~/quicklisp/local-projects/. Anything there will override the version in the distribution.
  5. Nothing about the ~/quicklisp/local-projects/ content can be specified using Quicklisp. It needs to be managed outside of Quicklisp.
  6. Libraries are loaded from the current Quicklisp distribution. There is no way to specify which distribution version a particular project must use.
  7. Quicklisp can create a bundle of all the loaded systems that can be committed with the project code and used from there rather than the distribution, i.e. vendoring.
  8. The source code for libraries included in a distribution are stored in a dedicated location for Quicklisp, they are not read from the original source repo.

Depending on your situation these attributes may be positive, negative or irrelevant.

There are projects like Ultralisp that have different philosophies regarding the distribution content but they still depend on Quicklisp for all other aspects. Thus they share most of the above attributes.

Since my previous post on this topic much has changed in the library version arena. There are now many projects that address different aspects of the above list; the topic of vendoring has gained momentum; and Qlot has changed a lot, to the extent that some code samples in older posts no longer work.

Vendoring is the idea that all the libraries your project depend on are actually part of your project and as such should be committed as part of the project code. Both Quicklisp and Qlot support this with QL:bundle and QLOT:bundle, and the Vend tool is entirely focused on vendoring.

The significant changes in Qlot broke my development workflow. Since I now had to spend time fixing this it was a good opportunity to evaluate some of the other library versioning tools. The issues that made me hesitant to adapt to the new Qlot without considering other options are:

I evaluated many of the other version management tools and came to the conclusion that Qlot is the closest to what I wanted and I set off trying to find a workflow that will adhere to my requirements listed here:

After some fiddling with Qlot I learned that:

  1. Qlot installs a complete and independent Quicklisp inside your project directory. This Quicklisp has no dependence, relation or knowledge about the global Quicklisp.
  2. $ qlot exec ccl mostly arranges things such that Quicklisp is loaded from the project local installation. If you can arrange for that to happen without using Qlot then you can use start your lisp normally.
  3. When the project local Quicklisp is loaded it doesn't know about the global or any other project local Quicklisps. This gives the 100% certainty of where libraries were loaded from.
  4. When the project local Quicklisp is loaded you use it exactly like you would the global one. That is, QL:quickload and not QLOT:quickload, nor do you use any other Qlot wrappers.
  5. Qlot does not need to be present in your lisp image at all.
  6. Loading Qlot inside you current REPL doesn't work well because:

    1. The Qlot REPL API is still in development.
    2. Doing this loads Qlot from the global Quicklisp instead of the project local one. Distribution version mismatches between the two Quicklisps could trigger issues with Qlot itself. It also voids the certainty of library location.
  7. Having Qlot as a standalone executable outside of your lisp image puts it on the same plane as other tools used for development such as make or git. It then doesn't matter which implementation it prefers because it doesn't affect your choices.
  8. Qlot has gained the ability to bundle libraries in case you want to go the vendoring route.
  9. Roswell is not needed for any of the above.

New workflow

Combining my requirements and my new understanding of Qlot, I modified my workflow for pinning library versions to be:

  1. Qlot executable must be available in your path.
  2. Specify the distribution and library versions in qlfile.
  3. qlot install at the CLI.
  4. Load lisp without executing init scripts, e.g. ccl -n.
  5. Load Quicklisp from the project local installation.

    (load "PROJECT-PATH/.qlot/setup.lisp")

  6. Use Quicklisp as before.

    (ql:quickload :alexandria)

If you would like to vendor your libraries then:

  1. Specify the distribution and library versions in qlfile.
  2. qlot install
  3. qlot bundle
  4. git commit
  5. ccl
  6. (load PROJECT-PATH/.bundle-libs/setup.lisp)

Qlot tasks

All Qlot related tasks such as initialising a project, installing libraries, upgrading libraries, etc must be performed in the CLI using the Qlot executable. These happen relatively infrequently and inside the REPL Qlot does not feature.

12 Apr 2026 12:00am GMT

08 Apr 2026

feedPlanet Lisp

Tim Bradshaw: Rules for Lisp programs

Some very serious rules. Very serious.

The essential rule. If you are not building languages in Lisp why are you even here?

The lesser rules.

  1. If you write a program which uses defclass you are probably making a mistake.
  2. If you write a program which uses the CLOS MOP you are making a mistake.
  3. If you write a program which uses LOOP for any purpose other than creating a better iteration construct you are making a mistake.
  4. If you write a program which uses LOOP only to create a better iteration construct you are probably making a mistake.
  5. If you write a program which uses explicit package-qualified names more than very infrequently you will be cast into the outer darkness along with your program.

I will not be taking questions.

08 Apr 2026 10:48am GMT

06 Apr 2026

feedPlanet Lisp

Patrick Stein: Nomic Coding Game

About 30 years ago, I had an idea for a coding game inspired by Nomic. It occurred to me last month that all of the tools I need are readily available now.

Pen-and-paper Nomic

The pen-and-paper game of Nomic (by Peter Suber) has an initial ruleset which describes how one proposes changes to the rules, how one gets those changes ratified, a way to award points when someone's rule change is ratified, and a rule declaring that the winner is the first player to amass 100 points. Some of the rules are mutable and some are immutable and there are rules about turning mutable rules into immutable ones and vice-versa.

The game was meant to show some of the paradoxes of self-amendment. It was meant to lead people into situations where it was clear that certain actions were both legal (or even mandatory) and illegal.

A drastically simplified starting set of rules might look like:

Nomic in Code

So, 30 years ago, I had the idea that it would be fabulous to write some code to referee a Nomic game. However, because interpretation of the rules is so horrendously human, it felt impossible. Today, in 2026, it seems one could maybe get Claude, Gemini, or some other LLM to referee. But, this doesn't much interest me, either, really. I cannot get any of them to keep track of something that I made them write down. I cannot imagine that I would be happy with their interpretation of whether my move is legal given the current state of the rules nor to amend the rules appropriately if my move is legal.

What felt slightly more attainable 30 years ago would be to make it a battle in code:

This was nice and all, but it was also too static. The rules about who can vote and how votes are tallied and such wouldn't be subject to change.

Nomic in Code in 2026

Fast-forward to last month. Last month, I realized that with the GitHub API interface, I could implement a very Nomic-ish pull request battle game. I can:

To be truly in Nomic's full spirit, it would be nice to allow the code in the repository to interact with the GitHub API on its own. Alas, that would immediately let the players vote in changes that expose my GitHub tokens, so it would be a gaping security hole-not only because it would let users impersonate me but because it would let them end-around the actual code in the repository to make changes to the main branch in the repository.

So, as it is, I have a supervisor written in Common Lisp which handles all of the interaction with GitHub and various game repositories (one to play in Common Lisp, one to play in JavaScript, and one to play in Python). The supervisor:

The game code, given a list of open pull requests can reply with one of the following messages:

{
"decision": "winner",
"name": name-of-winner,
"message": optional-reason-for-decision
}
{
"decision": "accept",
"id": id-number-of-pull-request-to-accept,
"message": optional-reason-for-decision
}
{
"decision": "reject",
"id": id-number-of-pull-request-to-reject,
"message": optional-reason-for-decision
}
{
"decision": "defer"
}

The "defer" decision means that there is not enough information at the moment. Maybe, in the future, with other pull requests or other comments or reviews we will be able to make some move.

If the game code replies with anything that isn't one of the four types of replies shown above, the supervisor assumes the latest merge broke the code and reverts the change.

The Ask

I haven't been able to drum up enough players for a game in any of my regular haunts. So, I am looking for tolerant players who will help me give it a test run or two to work out the kinks in the supervisor. Some areas where I forsee potential issues:

So, if you're tolerant of some bumps in the process, have a GitHub account (or will make one), and are interested in a Common Lisp battle of pull requests, let me know so we can get a game going.

The post Nomic Coding Game first appeared on nklein software.

06 Apr 2026 4:09am GMT

03 Apr 2026

feedPlanet Lisp

Marco Antoniotti: An Update on MK-DEFSYSTEM

There are still a few of us (at least two) who are using MK:DEFSYSTEM. The venerable system construction tool has accumulated a lot of ancient cruft, some of which quite convoluted.

Recently I went back to MK:DEFSYSTEM and "cleaned up" some of the code, especially regarding the pathname construction for each component. I also used some simpler hierarchical tricks using defstruct only.

The result should be more solid and clearer in the steps that comprise some "macro tasks". Of course, a rewrite using CLOS would change the coding style, but the choice has been made to keep the MK:DEFSYSTEM code base quite... retro (and somewhat simple).

Why did I went back to MK:DEFSYSTEM? As usual, it is because of a rabbit-hole I fell into: I will blog about it later on (hint: HEΛP).

MK-DEFSYSTEM quick history as of March 2026

MK-DEFSYSTEM (or MK:DEFSYSTEM, or MAKE:DEFSYSTEM) was originally written by Mark Kantrowitz as part of the original "CMU Lisp Utilities" collection; an early "public" set of Common Lisp code and utilities that, in the writer's opinion form one of the basis of most Common Lisp writing to date.

As stated (by M. Kantrowitz himself) in this file header, the original version of MK-DEFSYSTEM was inspired by the Symbolics DEFSYSTEM (or DEFSYS) tool. Yet, MK-DEFSYSTEM differs significantly from it.

In its original form, MK-DEFSYSTEM was built in the CLtL1 era, accommodated a lot of variance among filesystems and CL implementations and it still bears those idiosycrasies. CLtL2 (1992) first and ANSI (1994) next, started reshaping the code base then.

MK-DEFSYSTEM was originally distributed under a license agreement that made redistribution tricky. In 1999, the writer - that'd be me, Marco Antoniotti - contacted Mark Kantrowitz offering to become a maintainer while reworking the distribution license to hammer some FOSS into it. Mark Kantrowitz graciously agreed and, after that, the writer got literally and physically hugged by a few Common Lisp developers because they could use MK-DEFSYSTEM more freely.

Of course, ASDF came along and it solved the same problems that Symbolics (and Kent Pitman's) DEFSYS and MK-DEFSYSTEM solve, plus much more.

Yet, MK-DEFSYSTEM has some nice features (in the eye of the beholder).

MK-DEFSYSTEM still ships in one file - defsystem.lisp - that you can LOAD in your Common Lisp init file. Of course, a big chunk of its current code base is "backward compatibility" and new ok-we-miss-UIOP-and-or-at-least-CL-FAD functionality, plus an ever growing ongoing commentary like this one.

Given this background, the writer has been maintaining MK-DEFSYSTEM for a long time, and more recently, Madhu has made significant changes (and maintains himself a fork with some bells and whistles of his own) since 2008.

Of course, many other contributors helped over the years, and are acknowledged in the early Change Log and in comments in the code.

In early 2026, the writer cleaned up the code and reworked some of the logic, by factoring out some code from main functions. In particular, the CREATE-COMPONENT-PATHNAMES, GENERATE-COMPONENT-PATHNAMES, COMPONENT-FULL-PATHNAME, COMPONENT-FULL-NAMESTRING interplay is better organized; plus new structures, leveraging DEFSTRUCT :INCLUDE feature have been introduced, rendering the code TYPECASE-able.

MK-DEFSYSTEM is old, but it works. It is quirky but it works (at least for the two or three known users - which, in 2026, is already a big chunk of the Common Lisp users' community). Moreover, it does have, at least in the eye of the beholder, some more user friendly user API, for most use case, especially for plain Common Lisp code.

The current MK-DEFSYSTEM repository is at https://gitlab.common-lisp.net/mantoniotti/mk-defsystem

(*) It is assumed that the reader knows about all the acronyms, tools and systems referred to in the text.


'(cheers)

03 Apr 2026 1:04am GMT

02 Apr 2026

feedPlanet Lisp

Robert Smith: Idiomatic Lisp and the nbody benchmark

When talking to Lisp programmers, you often hear something like, "adapt Lisp to your problem, not your problem to Lisp." The basic idea is this: if Lisp doesn't let you easily write a solution to your problem because it lacks some fundamental constructs that make expressing solutions easy, then add them to Lisp first, then write your solution.

That sounds all good and well in the abstract, and maybe we could even come up with some toy examples-say, defining HTTP request routing logic in a nice DSL. But where's a real example of this that's not artificial or overengineered?

Recently, on Twitter, I butted into the middle of an exchange between @Ngnghm (a famous Lisp programmer) and @korulang (an account dedicated to a new language called Koru) about Lisp. I'm oversimplifying, but it went something like this:

Now, there's plenty of evidence online that Common Lisp has reasonably good compilers that produce reasonably good machine code, and so the question became more nuanced: Can Lisp be realistically competitive with C without ending up being a mess of unidiomatic code?

Our interlocutor @korulang proposed a benchmark, the "nbody" benchmark from the Computer Language Benchmarks Game. This was of particular interest to them, because they used it as an object of study for their Koru language. To quote their blog post:

We wanted Koru kernels to land in the same ballpark as idiomatic C, Rust, and Zig.

The result was stronger than that.

Our fused n-body kernel, written in straightforward Koru kernel style, came in faster than the plain reference implementations. Every implementation here is "naive" - the obvious, idiomatic version a competent programmer would write in each language. No tricks, no hand-tuning, no -ffast-math: […]

and they proceeded to show Koru being 14% faster than C and 106% faster than Lisp.

Now, putting aside that some of the code and blog post were written with LLMs, there are many questions that are left unanswered here, since computer architecture and operating system matter a lot (where did these benchmarks run?). Moreover, the author buries the lede a little bit and proceeds to show how we might write "unidiomatic" C to match the performance of Koru.

I'm not concerned about nitpicking their approach or rigorously evaluating their claims, but I would like to dwell on this common refrain: "idiomatic". What is that supposed to mean?

"Idiomatic code" in the context of programming means something like "representative of a fluent computer programmer" and "aligned with the peculiar characteristics of the language". In some sense, idiomatic code in a particular language shouldn't stand out amongst other code in that language, and idiomatic code should, in some sense, portray the identity of the language itself.

Idiomatic C is the C that uses terse names, simple loops, and unsafe arithmetic.

Idiomatic Haskell is the Haskell that uses short functions, higher-order abstractions, immutable data structures, and safe constructs.

What about idiomatic Lisp? Well, here's the rub. A fluent programmer at Lisp doesn't reach for one paradigmatic toolbox; they weave in and out of imperative, functional, object-oriented, etc. styles without much of a second thought. There's a sort of "meta" characteristic to Lisp programming: you're programming the language almost as much as you're programming the program.

Yes, Lisp has loops, but "loopy code" isn't intrinsically "Lispy code". Yes, Lisp has objects, but "OOPy code" isn't intrinsically "Lispy code". In my opinion, what makes code "Lispy" is whether or not the programmer used Lisp's metaprogramming and/or built-in multi-paradigm facilities to a reasonable degree to make the solution to their problem efficient and easy to understand in some global sense. For some problems, that may be "loopy" or "OOPy" or something else. It's finding a Pareto-efficient syntactic and semantic combination offered by the language, or perhaps one of the programmer's own creation.

So we get back to the @korulang benchmark challenge. Looking at their repository:

I hesitate to say nbody.lisp is idiomatic. It's reasonable, it's straightforward to any imperative-minded programmer, but it's not Lispy. That doesn't make it good or bad, but it does lead to the grand question:

Can we use Common Lisp to express a solution to the nbody benchmark in a way that reads more naturally than a direct-from-C port?

I would say that, at face value, Koru's solution is along the lines of what is more natural relative to the problem itself. Here are the essential bits.

~std.kernel:shape(Body) {
x: f64, y: f64, z: f64,
vx: f64, vy: f64, vz: f64,
mass: f64,
}
~std.kernel:init(Body) {
{ x: 0, y: 0, z: 0, vx: 0, vy: 0, vz: 0, mass: SOLAR_MASS },
{ x: 4.84143144246472090e+00, y: -1.16032004402742839e+00, z: -1.03622044471123109e-01, vx: 1.66007664274403694e-03 * DAYS_PER_YEAR, vy: 7.69901118419740425e-03 * DAYS_PER_YEAR, vz: -6.90460016972063023e-05 * DAYS_PER_YEAR, mass: 9.54791938424326609e-04 * SOLAR_MASS },
{ x: 8.34336671824457987e+00, y: 4.12479856412430479e+00, z: -4.03523417114321381e-01, vx: -2.76742510726862411e-03 * DAYS_PER_YEAR, vy: 4.99852801234917238e-03 * DAYS_PER_YEAR, vz: 2.30417297573763929e-05 * DAYS_PER_YEAR, mass: 2.85885980666130812e-04 * SOLAR_MASS },
{ x: 1.28943695621391310e+01, y: -1.51111514016986312e+01, z: -2.23307578892655734e-01, vx: 2.96460137564761618e-03 * DAYS_PER_YEAR, vy: 2.37847173959480950e-03 * DAYS_PER_YEAR, vz: -2.96589568540237556e-05 * DAYS_PER_YEAR, mass: 4.36624404335156298e-05 * SOLAR_MASS },
{ x: 1.53796971148509165e+01, y: -2.59193146099879641e+01, z: 1.79258772950371181e-01, vx: 2.68067772490389322e-03 * DAYS_PER_YEAR, vy: 1.62824170038242295e-03 * DAYS_PER_YEAR, vz: -9.51592254519715870e-05 * DAYS_PER_YEAR, mass: 5.15138902046611451e-05 * SOLAR_MASS },
}
| kernel k |>
std.kernel:step(0..iterations)
|> std.kernel:pairwise {
const dx = k.x - k.other.x;
const dy = k.y - k.other.y;
const dz = k.z - k.other.z;
const dsq = dx*dx + dy*dy + dz*dz;
const mag = DT / (dsq * @sqrt(dsq));
k.vx -= dx * k.other.mass * mag;
k.vy -= dy * k.other.mass * mag;
k.vz -= dz * k.other.mass * mag;
k.other.vx += dx * k.mass * mag;
k.other.vy += dy * k.mass * mag;
k.other.vz += dz * k.mass * mag;
}
|> std.kernel:self {
k.x += DT * k.vx;
k.y += DT * k.vy;
k.z += DT * k.vz;
}
| computed c |>
capture({ energy: @as(f64, 0) })
| as acc |>
for(0..5)
| each i |>
captured { energy: acc.energy + 0.5*c[i].mass*(c[i].vx*c[i].vx+c[i].vy*c[i].vy+c[i].vz*c[i].vz) }
|> for(i+1..5)
| each j |>
captured { energy: acc.energy - c[i].mass*c[j].mass / @sqrt((c[i].x-c[j].x)*(c[i].x-c[j].x)+(c[i].y-c[j].y)*(c[i].y-c[j].y)+(c[i].z-c[j].z)*(c[i].z-c[j].z)) }
| captured final |>
std.io:print.blk {
{{ final.energy:d:.9 }}
}

Can we achieve something similar in Lisp?

First, let's make a baseline. I'm running Ubuntu Noble with a "AMD RYZEN AI MAX+ PRO 395" with a clock speed that varies between 0.6-5 GHz. I am also using SBCL 2.6.3 and gcc 13.3. Using nbody.lisp as a starting point, I modified it for a few easy wins. I'll call this version nbody-lisp-conventional. A quick benchmark reveals that the loopy Lisp code is only about 20% slower than the C code compiled with gcc -O3 -ffast-math -march=native.

$ ./nbody-lisp-conventional 50000000
-0.169286396
timing: 2000 ms
$ ./nbody-c 50000000
-0.169286396
timing: 1662 ms

As a Lisp programmer, it's not surprising that it's a little slower. The number of person-years that have gone into C compilers to optimize idiomatic C code makes the development effort behind SBCL, the most popular open-source Lisp compiler, look like a rounding error.

Now that we have a baseline, our goal is to come up with a nicer Lisp program that also improves the timing.

Our approach will be simple. We will create a library.lisp that contains new language constructs of a similar ilk to Koru, and we will use them to implement the nbody benchmark in impl.lisp. Some rules:

The third rule is more rigorous than it looks. It means we can't just have a solve-nbody problem which dispatches to assembly.

To accomplish the above, we define a kernel DSL. The DSL allows us to express how elements of a composite transform, maintaining just enough invariants to allow them to be handled efficiently. These kernels are then compiled into efficient code, more efficient than ordinary loopy Lisp allows for.

Our attention will be focused on a proof-of-concept library of functionality for writing particle simulators. The operators we define are:

This collection of five operators forms a miniature, re-usable language. These broadly recapitulate those of Koru, and allow us to write something that looks like this:

(defconstant +solar-mass+ (* 4d0 pi pi))
(defconstant +days-per-year+ 365.24d0)
(defconstant +dt+ 0.01d0)
(define-kernel-shape body 5
x y z vx vy vz mass)
(defparameter *system*
(make-body-system
(list :x 0d0 :y 0d0 :z 0d0
:vx 0d0 :vy 0d0 :vz 0d0
:mass +solar-mass+)
...))
(define-pairwise-kernel advance-forces (s body dt)
(let* ((dx (- i.x j.x))
(dy (- i.y j.y))
(dz (- i.z j.z))
(dsq (+ (+ (* dx dx) (* dy dy)) (* dz dz)))
(mag (/ dt (* dsq (sqrt dsq)))))
(let ((dm-j (* mag j.mass))
(dm-i (* mag i.mass)))
(decf i.vx (* dx dm-j))
(decf i.vy (* dy dm-j))
(decf i.vz (* dz dm-j))
(incf j.vx (* dx dm-i))
(incf j.vy (* dy dm-i))
(incf j.vz (* dz dm-i)))))
(define-self-kernel advance-positions (s body dt)
(incf self.x (* dt self.vx))
(incf self.y (* dt self.vy))
(incf self.z (* dt self.vz)))
(define-reduction-kernel (energy e 0d0) (s body)
(:self
(+ e (* (* 0.5d0 self.mass)
(+ (+ (* self.vx self.vx) (* self.vy self.vy))
(* self.vz self.vz)))))
(:pair
(let* ((dx (- i.x j.x))
(dy (- i.y j.y))
(dz (- i.z j.z)))
(- e (/ (* i.mass j.mass)
(sqrt (+ (+ (* dx dx) (* dy dy))
(* dz dz))))))))
(define-kernel-step run-simulation (system body n :params ((dt double-float)))
(advance-forces dt)
(advance-positions dt))

Well, in fact, this isn't an ideal approximation, it's almost exactly how it turned out. Given this is a proof of concept, we sometimes have to write some Lisp things a little funny. For example, you'll notice we write:

(+ (+ (* dx dx) (* dy dy)) (* dz dz))

instead of the far more readable

(+ (* dx dx) (* dy dy) (* dz dz))

Both are completely valid and both can be used. So why the former? It is a result of a limitation of a little feature I built in: auto-vectorization. The vectorizer walks the mathematical expressions and replaces them with fast SIMD variants instead. Here's a little fragment showing this rewrite rule:

...
(case (car expr)
;; (+ a (* b c)) -> fmadd(a,b,c)
((+)
(let ((args (cdr expr)))
(cond
((and (= (length args) 2) (mul-p (second args)))
`(%%fmadd-pd ,(xf (first args))
,(xf (second (second args)))
,(xf (third (second args)))))
...

The implementation of these kernel macros in library.lisp weighs in at just under 700 lines, and includes optional x64 SIMD auto-vectorization.

Well, for the nail biting moment, how does it compare? I made a Makefile that compares the idiomatic C against the loopy Lisp against our kernel DSL Lisp. It does a median-of-3. Running this on my computer gives:

$ make bench
=== C (gcc -O3 -ffast-math) ===
-0.169286396
runs: 1657 1664 1653 ms
median: 1657 ms
=== Lisp (SBCL, conventional loops) ===
-0.169286396
runs: 1991 2009 2005 ms
median: 2005 ms
=== Lisp (SBCL, kernel syntax) ===
-0.169286396
runs: 1651 1651 1652 ms
median: 1651 ms

So, in fact, we have matched the performance of C almost exactly. Furthermore, the generated code is still not as lean as it could be. Not to put too fine a point on it, but, <100 lines of Lisp, supported by

has performance parity and greater readability/reusability than <100 lines of C, supported by

None of this is to make an argument that Lisp is "better", or that there isn't merit to avoiding custom DSLs in certain circumstances, or that the world doesn't have room for more custom home-grown compilers and parsers, but I think this is the clearest possible, quasi-realistic demonstration that idiomatic Lisp can be as fast as idiomatic C without tremendous work, whilst netting additional benefits unique to Lisp.

All code is available here.

02 Apr 2026 12:00am GMT

27 Mar 2026

feedPlanet Lisp

ECL News: ECL 26.3.27 release

We are announcing a new stable ECL release. This release highlights:

The release also incorporates many other bug fixes and performance improvements as well as an updated manual. We'd like to thank all people who contributed to ECL with code, testing, issue reports and otherwise.

People listed here contributed code in this iteration: Daniel Kochmański, Marius Gerbershagen, Tarn W. Burton, Kirill A. Korinsky, Dmitry Solomennikov, Kevin Zheng, Mark Shroyer and Sebastien Marie.

People listed here did extensive release candidate testing on various platforms: Marius Gerbershagen, Daniel Kochmański, Dima Pasechnik, Matthias Köppe, Jeremy List, Mark Damon Hughes and Paul Ruetz.

This release is available for download in a form of a source code archive (we do not ship prebuilt binaries):

Finally, a note on the release schedule: ECL releases often take some time to come out, partially because we do extensive testing against supported platforms and existing libraries to find regressions. In the meantime all improvements are incrementally incorporated in the branch develop. It is considered stable and it is tested and reviewed with necessary dilligence. If release cycle is too slow for your needs, then we suggest following the branch develop for the most recent changes.

Happy Hacking,
The ECL Developers

27 Mar 2026 10:00am GMT

12 Mar 2026

feedPlanet Lisp

Christoph Breitkopf: Functional Valhalla?

Pointer-rich data layouts lead to suboptimal performance on modern hardware. For an excellent introduction to this, see the article The Road to Valhalla. While it is specifically about Java, many parts of the article also apply to other languages. To summarize some of the key points of the article:

Consider a vector of records (or tuples, structures, product types - I'll stay with "record" in this article). A pointer-rich layout has each record allocated separately in the heap, with a vector containing pointers to the records. For example, given a "Point" record of two numbers:

pikchr diagram

The flat and dense layout has the records directly in the array:

pikchr diagram

(Note that there is another flat layout, namely, using one vector per field of the record. This is better suited to instruction-level parallelism or specialized hardware (e.g., GPUs), especially when the record fields have different sizes. But it is less suited for general-purpose computing, as reading a single vector element requires one memory access per field, whereas the "vector of records" layout above requires only one access per record. Such a layout can be easily implemented in any language that has arrays of native types, whether in the language itself or in a library (e.g., OCaml's Owl library). Thus, in this article, I will only consider the "array of records" layout above.)

Functional language considerations

Things should be much easier in functional languages than in Java: we have purity, referential transparency, and everything is a value. So it should be simple enough to store these values in memory in their native representation. But there are reasons that that is often not the case in practice:

Many implementations can not even lay out native types flat in records, so a Point record of IEEE 754 double-precision numbers may actually look like this in memory:

pikchr diagram

The (very short) List

So, given a record type, which functional languages allow a collection of values of that type to have a flat, linear memory layout? The number of programming languages that claim to be "functional" is huge, so the ones listed here are just a selection based on my preferences - mainly languages that allow that layout, and some I have some experience with and can speculate on how easy or hard it would be to add that as a library or extension.

Since the Point record can be misleading in its simplicity when it comes to the question of whether the functionality could be implemented as a library, I'll point out that there are records where the layout is a bit more interesting:

Pure languages:

Clean

Yes: Clean has unboxed arrays of records in the base language.

Caveat: it does not have integer types of specific sizes and only one floating-point type, making it harder to reduce memory usage by using the smallest type just large enough to support the required value range. It seems possible to implement such types in a library (the mTask system does that).

Futhark

No. Futhark does not intend to be a general-purpose language, so this is not surprising.

I mention it here because it does have arrays of records, but, since it targets GPUs and related hardware, it uses the "record of arrays" layout mentioned above.

Haskell

Yes. Not in the base language, but there is library support via Data.Vector.Unboxed. Types that implement the Unbox type class can be used in these vectors. Many basic types and tuples have an Unbox instance. However, when you care about efficiency, you probably do not want to use tuples but rather a data type with strict fields, i.e., not:

type Point = (Double, Double)

but:

data Point = Point !Double !Double

Writing an Unbox instance for such a type is not trivial. The vector-th-unbox library makes it easier, but requires Template Haskell. Unboxed vectors are implemented by marshalling the values to byte arrays, so records with pointer fields are not supported.

Impure Languages

F#

Yes, even records with pointer fields. Records have structural equality, and you can use structs or the [<Struct>] attribute to get a flat layout.


And that's all I could find. Unless I follow Wikipedia's list of functional programming languages, which contains languages such as C++, C#, Rust, or Swift, that allow the flat layout, but don't really fit my idea of a functional language. But SML, OCaml, Erlang (Elixir, Gleam), Scala? Not that I could see (but please correct me if I'm wrong).

Rolling your own

Since there is a library implementation for Haskell, maybe that's a possibility for other languages?

You should be able to implement flat layouts in any language that supports byte vectors. More interesting is how well such a library fits into the language, and whether a user of the library has to write code or annotations for user-defined record types, or whether the library can handle part or all of that automagically.

I'll only mention my beloved Lisp/Scheme here. Lisp's uniform syntax and macro system are a bonus here, but the lack of static typing makes things harder.

In Scheme, R6RS (and R7RS with the help of some SRFIs) has byte-vectors and marshalling to/from them in the standard library. But Scheme does not have type annotations, so you either need to offer a macro to define records with typed fields or to define how to marshal the fields of a regular (sealed) record. Since you can shadow standard procedures in a library, you can write code that looks like regular Scheme code, but, perhaps surprisingly, loses identity when storing/retrieving values from records:

(let ((vec (make-typed-vector 'point 1000))
      (pt (make-point x y)))
  (vector-set! vec 0 pt)
  (eq? (vector-ref vec 0) pt))
 ⇒ #f

(But then, you probably shouldn't be using eq? when doing functional programming in Scheme).

The same approach is possible in Common Lisp. In contrast to Scheme, it does have optional type annotations, and, together with a helper library for accessing the innards of floats and either the meta-object protocol to get type information or (probably better) a macro to define typed records, an implementation should be reasonably straightforward. Making it play nice with inheritance and the dynamic nature of Common Lisp (e.g., adding slots to classes or even changing an object's class at runtime) would be a much harder undertaking.

Conclusion

Of the functional languages I looked at, only F# fully supports flat and dense memory layouts. Among the pure languages, Haskell and Clean come close.

The question is how important this really is. There's a good argument to be made for turning to more specialized languages like Futhark if you mainly care about performance. On the other hand, having a uniform codebase in one language also has advantages.

Then, the performance story has changed, too. While the points Project Valhalla raises remain true in principle, processor designers are aware of this as well. They are doing their best to hide memory latency with techniques such as out-of-order execution or humongous caches. Thus, on a modern CPU, the effects of a pointer-rich layout are often only observable with large working set sizes.

Still, given the plethora of imperative language that can get you to Valhalla, support for this in the functional landscape seems lacking. In the future, I hope to see more languages or libraries that will make this possible.

12 Mar 2026 11:17am GMT

07 Mar 2026

feedPlanet Lisp

Scott L. Burson: FSet v2.3.0: Transients!

FSet v2.3.0 added transients! These make it faster to populate new collections with data, especially as the collections get large. I shamelessly stole the idea from Clojure.

They are currently implemented only for the CHAMP types ch-set, ch-map, ch-2-relation, ch-replay-set, and ch-replay-map.

The term "transient" contrasts with "persistent". I'm using the term "persistent" in its functional-data-structure sense, as Clojure does: a data structure is persistent if multiple states of it can coexist in memory efficiently. (The probably more familiar use of the term is in the database sense, where it refers to nonvolatile storage of data.) FSet collections have, up to now, all been persistent in this sense; a point modification to one, such as by with or less, takes only O(log n) space and time to return a new state of the collection, without disturbing the previous state.

A transient encapsulates the internal tree of a collection so as to guarantee that it holds the only pointer to the tree; this allows modifications to tree nodes to be made in-place, so long as the node has sufficient allocated space. Once the collection is built, the tree is in the same format that existing FSet code expects, and can be accessed and functionally updated as usual.

Some quick micro-benchmarking suggests that speedups, for constructing a set from scratch, range from 1.6x at size 64 to as much as 2.4x at size 4096.

You don't necessarily even have to use transients explicitly in order to benefit from them. Some FSet builtins such as filter and image use them now. The GMap result types ch-set etc. also use them.

For details, see the GitLab MR.


07 Mar 2026 8:04am GMT

04 Mar 2026

feedPlanet Lisp

Robert Smith: Beating Bellard's formula

By Robert Smith

Fabrice Bellard came up with a computationally efficient formula for calculating the nth hexadecimal digit of $\pi$ without calculating any of the previous n−1. It's called Bellard's formula. It wasn't the first of its kind, but in terms of computational efficiency, it was a substantial improvement over the original, elegant Bailey-Borwein-Plouffe formula. Due to the trio's discovery, these formulas are often called BBP-type formulas.

Over the years, numerous BBP-type formulas have been discovered. In fact, Bailey gives us a recipe to search for them using integer-relation algorithms. In simple terms, we can just guess formulas, and run a computation to see if it likely equals $\pi$ with high confidence. If we do find one, then we can use it as a conjecture to prove formally.

Like Bellard and many others, I ran a variant of Bailey's recipe, effectively doing a brute-force search, highly optimized and in parallel. The search yielded another formula that is computationally more efficient than Bellard's formula. The identity is as follows:

$$ \pi = \sum_{k=0}^{\infty} \frac{1}{4096^k} \left( \frac{1}{6k+1} - \frac{2^{-5}}{6k+3} + \frac{2^{-8}}{6k+5} + \frac{2}{8k+1} - \frac{2^{-5}}{8k+5} + \frac{2^{-1}}{12k+3} - \frac{2^{-4}}{12k+7} - \frac{2^{-8}}{12k+11} \right). $$

It converges at a rate of 12 bits per term. We will prove convergence, and then prove the identity itself (with a little computer assistance). As it turns out, an equivalent form of this formula was already discovered, which we will discuss as well. Finally, we'll show a very simple implementation in Common Lisp.

Proof of convergence

Write the series as $S := \sum_{k=0}^{\infty} 4096^{-k}R(k)$. Since $R(k)\in O(1/k)$, convergence is dominated by the geometric term $4096^{-k}$:

$$ \lim_{k \to \infty} \left\vert \frac{R(k+1)}{4096^{k+1}} \middle/ \frac{R(k)}{4096^{k}} \right\vert = \frac{1}{4096}. $$

By the ratio test, the series converges absolutely. Since $4096 = 2^{12}$, each additional term contributes exactly 12 bits of precision.

Bellard's formula converges at 10 bits per term and requires the evaluation of 7 fractions. The above converges at 12 bits per term, and requires the evaluation of 8 fractions. So while we require 20% fewer terms, each term requires about 14% more arithmetic. So, net-net, this formula is approximately 5-6% more efficient.

Proof of identity via a definite integral

Consider $1/(nk+j) = \int_{0}^{1} x^{nk+j-1} dx$. For positive integers $n$ and $b$, we get

$$ \sum_{k=0}^{\infty} \frac{1}{b^k}\cdot\frac{1}{nk+j} = \sum_{k=0}^{\infty} \int_{0}^{1} \left(\frac{x^n}{b}\right)^k x^{j-1} dx. $$

We can swap the sum and integral via the Lebesgue dominated convergence theorem, since the power series $\sum (x^n/b)^k$ converges uniformly for $x \in [0, 1]$ and $b > 1$. Using this and summing the geometric series gives:

$$ \int_{0}^{1} x^{j-1} \sum_{k=0}^{\infty} \left(\frac{x^n}{b}\right)^k dx = \int_{0}^{1} \frac{x^{j-1}}{1 - x^n/b} dx. $$

We now apply this to $S$ termwise with $b=4096=2^{12}$:

$$ S = \int_0^1 \left( \frac{x^{0}}{1 - \frac{x^6}{2^{12}}} - 2^{-5} \frac{x^{2}}{1 - \frac{x^6}{2^{12}}} + 2^{-8} \frac{x^{4}}{1 - \frac{x^6}{2^{12}}} + 2 \frac{x^{0}}{1 - \frac{x^8}{2^{12}}} - 2^{-5} \frac{x^{4}}{1 - \frac{x^8}{2^{12}}} + 2^{-1} \frac{x^{2}}{1 - \frac{x^{12}}{2^{12}}} - 2^{-4} \frac{x^{6}}{1 - \frac{x^{12}}{2^{12}}} - 2^{-8} \frac{x^{10}}{1 - \frac{x^{12}}{2^{12}}} \right) dx. $$

At this point, you could try to algebra your way through, expanding, using the substitution $x=2u$, etc. ultimately yielding a nice denominator $(u^2\pm 2u+2)(u^6-64)(u^{12}-1)$. Maybe compute some residues. Or, just CAS your way through.

% fricas
FriCAS Computer Algebra System
Version: FriCAS 2025.12.23git built with sbcl 2.5.2.1852-1f3beec71
Timestamp: Wed Mar 4 12:41:38 EST 2026
-----------------------------------------------------------------------------
Issue )copyright to view copyright notices.
Issue )summary for a summary of useful system commands.
Issue )quit to leave FriCAS and return to shell.
-----------------------------------------------------------------------------
(1) -> f := (1/(1 - x^6/4096))
- (1/32)*x^2/(1 - x^6/4096)
+ (1/256)*x^4/(1 - x^6/4096)
+ 2*1/(1 - x^8/4096)
- (1/32)*x^4/(1 - x^8/4096)
+ (1/2)*x^2/(1 - x^12/4096)
- (1/16)*x^6/(1 - x^12/4096)
- (1/256)*x^10/(1 - x^12/4096);
Type: Fraction(Polynomial(Fraction(Integer)))
(2) -> normalize(integrate(f, x = 0..1))
3 1 11 19 1
(2) 2 atan(-) - 2 atan(-) + 2 atan(--) + 2 atan(--) + 2 atan(-)
2 2 24 48 4
Type: Expression(Fraction(Integer))

So now we just need to show the arctans all collapse to $\pi$. Recall the identity

$$ \tan^{-1} a \pm \tan^{-1} b = \tan^{-1}\left(\frac{a\pm b}{1\mp ab}\right). $$

The sum of the first four terms can be calculated easily in Common Lisp:

% sbcl --no-inform
* (defun combine (a b) (/ (+ a b) (- 1 (* a b))))
COMBINE
* (reduce #'combine '(3/2 -1/2 11/24 19/48))
4

So we have $2\big(\tan^{-1}4 + \tan^{-1}(1/4)\big)$, and with our final elementary trig identity $\tan^{-1} (a/b) = \pi/2 - \tan^{-1} (b/a)$, we find $S = \pi$.

A new discovery?

Of course, I was excited to find this formula, but after some internet spelunking, it turns out it had already been discovered by Géry Huvent and Boris Gourévitch, perhaps independently. Gourévitch doesn't credit Huvent as he does with other formulas, but he does say "[…] furthermore, we can obtain BBP formula […] by using what Gery Huvent calls the denomination tables […]." Daisuke Takahashi cites Huvent's website in this 2019 paper published in The Ramanujan Journal. In all cases, they write the formula in the following way:

$$ \frac{1}{128} \sum _{k=0}^{\infty} \frac{1}{2^{12k}}\left( \frac{768}{24 k+3}+\frac{512}{24k+4}+\frac{128}{24 k+6}-\frac{16}{24 k+12}-\frac{16}{24 k+14}-\frac{12}{24 k+15}+\frac{2}{24 k+20}-\frac{1}{24 k+22}\right), $$

which is structurally equivalent to $S$.

Despite having been known already, this formula doesn't appear to be well known. As such, I hope this blog post brings more attention to it.

Simple implementation

Here is a simple implementation of digit extraction using BBP-type formulas in Common Lisp:

(defun %pow2-mod (exponent modulus)
(cond
((= modulus 1) 0)
((zerop exponent) 1)
(t
(let ((result 1)
(base (mod 2 modulus))
(e exponent))
(loop :while (plusp e) :do
(when (oddp e)
(setf result (mod (* result base) modulus)))
(setf base (mod (* base base) modulus)
e (ash e -1)))
result))))
(defun %scaled-frac-of-power-two (exponent denom)
(cond
((>= exponent 0)
(let ((residue (%pow2-mod exponent denom)))
(floor (ash residue *precision-bits*) denom)))
(t
(let ((effective-bits (+ *precision-bits* exponent)))
(if (minusp effective-bits)
0
(floor (ash 1 effective-bits) denom))))))
(defun %series-scaled-frac (bit-index bbp-series k-step global-shift alternating-p)
;; A series is a list of series terms. A series term is a quadruple
;; (SIGN SHIFT DENOM-MULTIPLIER DENOM-OFFSET) representing the summand
;; SIGN * 2^SHIFT / (DENOM_MULTIPLIER * k + DENOM_OFFSET).
(let* ((modulus (ash 1 *precision-bits*))
(max-shift (loop :for term :in bbp-series :maximize (second term)))
(k-max (max 0 (ceiling (+ bit-index ; conservative bound
global-shift
max-shift
*precision-bits*
*guard-bits*)
k-step))))
(loop :with acc := 0
:for k :from 0 :to k-max :do
(let ((k-sign (if (and alternating-p (oddp k)) -1 1))
(k-factor (* k-step k)))
(dolist (term bbp-series)
(destructuring-bind (term-sign shift den-mul den-add) term
(let* ((denom (+ den-add (* den-mul k)))
(exponent (+ bit-index global-shift shift (- k-factor)))
(piece (%scaled-frac-of-power-two exponent denom))
(signed (* k-sign term-sign)))
(when (plusp piece)
(setf acc (mod (+ acc (* signed piece)) modulus)))))))
:finally (return acc))))
(defun %nth-hex-from-series (n terms k-step global-shift alternating-p)
(let* ((bit-index (* 4 n)))
(ldb (byte 4 (- *precision-bits* 4))
(%series-scaled-frac bit-index
terms
k-step
global-shift
alternating-p))))

This implementation uses Lisp's arbitrary precision integer arithmetic. A "real" implementation would use more efficient arithmetic, but this will suffice for some basic testing. Now we can write functions to use the Bellard formula and the new formula:

(defparameter +bellard-terms+
'((-1 5 4 1)
(-1 0 4 3)
(+1 8 10 1)
(-1 6 10 3)
(-1 2 10 5)
(-1 2 10 7)
(+1 0 10 9)))
(defun bellard-nth-hex (n)
(%nth-hex-from-series (* 4 n) +bellard-terms+ 10 -6 t))
(defparameter +new-terms+
'((+1 0 6 1)
(-1 -5 6 3)
(+1 -8 6 5)
(+1 1 8 1)
(-1 -5 8 5)
(+1 -1 12 3)
(-1 -4 12 7)
(-1 -8 12 11)))
(defun new-nth-hex (n)
(%nth-hex-from-series (* 4 n) +new-terms+ 12 0 nil))

Let's make sure they agree for the first 1000 hex digits:

CL-USER> (loop :for i :below 1000
:always (= (bellard-nth-hex i) (new-nth-hex i)))
T

And now let's look at timing comparisons. Here's a little driver:

(defun compare-timings (n)
(flet ((time-it (f n)
(sb-ext:gc :full t)
(let ((start (get-internal-real-time)))
(funcall f n)
(- (get-internal-real-time) start))))
(loop :repeat n
:for index := 1 :then (* 10 index)
:for bellard := (time-it #'bellard-nth-hex index)
:for new := (time-it #'new-nth-hex index)
:do (format t "~v,' D: new is ~A% faster than bellard~%" n index
(round (* 100 (- bellard new)) bellard)))))

And the results if the timing up to the one millionth hexadecimal digit:

CL-USER> (compare-timings 7)
1 : new is 81% faster than bellard
10 : new is 7% faster than bellard
100 : new is 6% faster than bellard
1000 : new is 5% faster than bellard
10000 : new is 4% faster than bellard
100000 : new is 3% faster than bellard
1000000: new is 4% faster than bellard

As predicted, though imperfect a test, it's consistently faster across a few orders of magnitude.

04 Mar 2026 12:00am GMT