10 Feb 2016

Planet Lisp

Quicklisp news: February 2016 Quicklisp dist update now available

New projects:

Updated projects: antik, architecture.service-provider, asteroids, backports, binfix, buffalo, chirp, cl+ssl, cl-amqp, cl-ana, cl-autowrap, cl-cffi-gtk, cl-cut, cl-gdata, cl-gists, cl-grace, cl-itertools, cl-jpeg, cl-lexer, cl-liballegro, cl-libusb, cl-libuv, cl-marklogic, cl-mime, cl-mpi, cl-opengl, cl-permutation, cl-quickcheck, cl-scripting, cl-sdl2, clack, clim-widgets, clml, clobber, closer-mop, clsql, clsql-orm, clx, colleen, com.informatimago, contextl, crane, croatoan, data-table, datafly, dexador, djula, docparser, eazy-gnuplot, esrap, fast-io, fn, gendl, glkit, gsll, helambdap, hu.dwim.presentation, hu.dwim.stefil, hu.dwim.util, incf-cl, inferior-shell, inquisitor, intel-hex, iolib, ip-interfaces, jonathan, jsown, lack, lake, linewise-template, lisp-unit, lisp-zmq, local-time, lparallel, lquery, lucerne, mathkit, mcclim, mpc, osicat, plokami, postmodern, pounds, pp-toml, proc-parse, projectured, qlot, qmynd, qt-libs, qtools, qtools-ui, quickutil, quri, racer, restas, rutils, scalpl, sdl2kit, secure-random, serapeum, simple-inferiors, simple-tasks, snooze, spinneret, stp-query, stringprep, stumpwm, sxql, temporal-functions, trivia, ubiquitous, universal-config, utilities.print-items, varjo, vas-string-metrics, websocket-driver, with-c-syntax, wu-sugar, xhtmlambda, xml.location, yason.

Removed projects: cl-llvm, cl-popen, cl-v4l2, cl-zmq, hdf5-cffi, lambda-gtk, magicffi, perfpiece, sqnc.

These projects were removed because I now use Debian 8 for testing, and I have been completely unable to get them to build. If you know quick and easy ways to get any of them to build on Debian 8, let me know, and I'll get them back into Quicklisp.

Enjoy!

10 Feb 2016 12:09am GMT

08 Feb 2016


Patrick Stein: The Less You Know (An Unspoken Tenet of Clean Architecture)

In response to my previous article, Holger Schauer asked why I implemented the use-cases with classes rather than having the book repository passed into the browse-books method or putting it into the abstract base-class browse-books-use-case rather than in the implementation.

(defstruct browse-books-request
  book-repository)

(defgeneric browse-books (request response))

;; -- or --

(defgeneric browse-books (book-repository request response))

;; -- or --

(defclass browse-books-use-case ()
  ((book-repository :initarg :book-repository :reader book-repository)))

I started to reply, but I found that I was having trouble keeping my response to a reasonable length. So, here, you are subjected to my unreasonably long response.

For reference, here is the repository from the previous post.

Java's Influence

I chose to model the use-cases as classes to be faithful to all of the implementations that I've seen of Clean Architecture. Uncle Bob, in particular, does most of his videos and talks using Java or Ruby. When he uses Ruby, he still tends to use the same OO organization that he would use if he were writing Java.

One aspect of Java is that everything has to be a method on some class. If you want to declare a method somewhere but have its implementation somewhere else, you can do this with an abstract base class or an interface. You cannot, for example, declare an abstract static method as you can in CLOS (or even C).
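For contrast, here is a minimal CLOS illustration of that separation (area and circle are made-up names, not from the project): the declaration and the implementation are independent top-level forms and can live in different files.

```lisp
;; Declaration only: no class, no implementation required yet.
(defgeneric area (shape))

;; Elsewhere, possibly in another file, a class and an implementation.
(defclass circle ()
  ((radius :initarg :radius :reader radius)))

(defmethod area ((shape circle))
  (* pi (radius shape) (radius shape)))
```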

Java's single inheritance makes abstract base classes awkward to use in general. If the book-repository slot were on browse-books-use-case rather than on the implementation, a Java version would have to use an abstract base class and direct inheritance. If the book-repository slot is on the implementation rather than on browse-books-use-case, then someone writing this in Java could use an interface and an implementation of that interface.

I wrote this blog post using Lisp/CLOS. However, for the actual project that I am working on, I have opted to use PHP. PHP shares Java's inheritance restrictions. A class can only have one superclass but it can implement any number of interfaces.

The Stated Advantages of Clean Architecture

When Uncle Bob talks about why to use Clean Architecture, he tends to focus on these things:

The adapters depend on interfaces defined in the application. The application does not depend on anything defined by the frontend or the backend. As such, deciding which frameworks to use on the frontend or which database to use on the backend can be deferred for a long time during the development process. Uncle Bob often describes how on FitNesse they developed using the filesystem as the database backend fully intending to plug in a relational database later in the project. In the end, they decided that the filesystem backend they had created served their purposes just fine. When they came upon a client who had requirements that all of the dynamic content be stored in their corporate database, they rewrote an adapter for that customer to use the database rather than the filesystem.

Clean Architecture applications are easy to test because the application code isn't interwoven with the presentation code or the database code. One can create mocks (even test-specific mocks) for the repositories and front-ends that do whatever the test needs to have done. They can do it quickly, without temporary databases or network overhead.

In a Clean Architecture deployment of my application in Java, I might have one JAR file for the book repository implementation, one JAR file for the Console frontend, one JAR file for the application core logic, and one JAR file that wires all of those other pieces together on startup. If I need to incorporate a different database, then I only need to update the book repository implementation JAR and (maybe) the JAR file that wires the pieces together. I don't have to redeploy my entire application every time the database changes. I don't have to retest every bit of application logic. Everything is nicely contained.

An Unspoken Benefit of Clean Architecture

One of the ways that Clean Architecture keeps its dependencies in check is by making all of the interfaces going into or out of your application use primitive types or dumb structures for parameter passing. The unspoken benefit of this is that it encourages you to keep the interfaces as spartan as possible.

If the controller (the console in my sample application) that is invoking the use case doesn't absolutely need to know there is a book repository involved, then it shouldn't know about it. The book repository cannot be passed from the controller to the use case using primitive types and dumb structures.

The controller's job is to take input from the user, turn it into a request, invoke the use case with that request, and present the response back to the user. There is no input that the controller can get from the user which will allow it to instantiate a book repository. It could, of course, pull it from global state as it pulled the use case. However, the less it needs to know, the better.

This separation has other benefits, as well. In Lisp, especially, one could imagine interacting with the application through the REPL. So, rather than implementing the simple REPL that I did with my console, I could have made a browse function that invokes the use-case and returns the results.

(defun browse ()
  (let ((request (make-browse-books-request))
        (response (make-browse-books-response)))
    (browse-books *browse-books-use-case* request response)
    (mapcar #'book-summary-to-plist
            (browse-books-response-list-of-book-summaries response))))

If everything is going to happen within my one Lisp image, then it would be fine to keep a *book-repository* global to pass into the use case. However, if I want to share the book repository between multiple users, each at their own REPL, then it no longer makes sense that each REPL would need a handle to the book repository.

Remote Proxy class-diagram

My browse function doesn't need to know whether the use case it is interacting with is the actual implementation or just a proxy for the implementation. In the client-server case, it makes little sense for the client to have any idea that there is a book repository instance.
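A sketch of such a proxy (everything here is hypothetical: browse-books-proxy and send-browse-books-request are stand-in names for whatever transport is actually used) implements the same generic function as the real use case:

```lisp
;; Hypothetical remote proxy: it satisfies the same use-case interface,
;; but forwards the request to a server instead of touching a book
;; repository. SEND-BROWSE-BOOKS-REQUEST is a stand-in for the RPC or
;; socket layer.
(defclass browse-books-proxy (browse-books-use-case)
  ((server-address :initarg :server-address :reader server-address)))

(defmethod browse-books ((use-case browse-books-proxy)
                         (request browse-books-request)
                         (response browse-books-response))
  ;; The client side never sees a book-repository at all.
  (setf (browse-books-response-list-of-book-summaries response)
        (send-browse-books-request (server-address use-case) request))
  response)
```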

I suppose there are some classes of application where one might want to work on a remote repository but keep the application logic local. If that were the case, then one would want the book repository to be the interface which has the proxy and server. Nothing in the architecture mentioned in the previous post precludes that use. But if we had put the book repository into the interface for the use case rather than in its implementation, then we would have to at least pretend there is a book repository on the local client even when everything is going to happen on the server.

Conclusion

An accidental benefit of Clean Architecture's insistence on primitive or dumb-struct parameters to use-cases is that the parameters end up only reflecting the minimal amount of information that is specific to the current request. Any state that is common across requests is a part of the internal state of the use-case.

Because of this, code that interacts with the use case only needs to know the little bit that makes its request different from everyone else's. This results in very simple interfaces to the use case and a great deal of flexibility in implementation.

08 Feb 2016 2:16am GMT

04 Feb 2016


Patrick Stein: Exploring Clean Architecture

This is the first in what will likely be a series of blog posts about Clean Architecture. Uncle Bob Martin has written numerous blog posts and given lots of talks about it.

The goal of Clean Architecture is to have the directory structure of your application shout out what your application does rather than what framework was used to present your application or what database is nestled in the depths of your application. Your program is divided into Entities, Use Cases, and Interface Adapters.

Entities encapsulate "Enterprise-wide business rules." Use Cases encapsulate "Application-specific business rules." Interfaces and Adapters represent how your Use Cases want to interact with the outside world (e.g. databases, users, printers, etc.).

In Clean Architecture, the Entities cannot know that the Use Cases exist and the Use Cases cannot know anything about the Adapters except for the Interface to them which is defined by the Use Case rather than by the Adapter. The Use Case does not know whether the application is being used from the command-line or from the web or from a remote service calling into it. The Use Case does not know whether the data is being stored in the file system or in a relational database (or conjured from the ether as needed). Nothing in the Adapters can know anything about the Entities.

Simple Example

I have a project that I am just starting. I thought I would use this new project to see how Clean Architecture works for me.

There are a large number of talks and videos about Clean Architecture. However, there are not many examples of it, despite several years of Stack Overflow questions and blog posts asking for them.

There are a few simple examples around the web. The most notable is Mark Paluch's Clean Architecture Example. It is just big enough to get a sense of how things hang together. If you're willing to put up with Java's insane directory hierarchies, you can get a pretty good idea of what the application does just by poking around the Use Cases directory.

My First Use Case

My first Use Case is to let the User browse a list of Book Summaries. The User should be able to sort by Title, Author, Publication Date, or Date the Book was acquired. The User should be able to filter the list based upon Genre or Keyword. The Use Case should allow the caller to implement pagination, so the Use Case needs to support returning up to a given number of Book Summaries starting with a specific number.

Some might argue that that is multiple Use Cases glommed together. If that were the case, then I would need some way to pipeline together Use Cases if I'm going to make any sort of reasonably navigable app atop my Use Cases.

But, let's start with baby steps.

The Simplified Version of my First Use Case

Let's just say the User wishes to see a list of all of the Book Summaries. The User is fine with seeing all of them at once in whatever order my app sees fit.

This simple version of the Use Case is implemented in an accompanying repository under the tag the-dream.

Class Diagram (explained below)

The architecture consists of some simple structures with no "business logic" in them at all: book-summary and book.

(defstruct book-summary
  isbn
  title
  author
  cover-page-thumbnail)

(defstruct book
  isbn
  title
  author
  cover-page-thumbnail
  cover-page-fullsize
  list-of-thumbnails
  table-of-contents
  synopsis
  publication-date
  list-of-genres
  list-of-keywords)

There is one use case, browse-books, which defines the use-case interface browse-books-use-case along with its input structure browse-books-request and its output structure browse-books-response. The use case defines the method browse-books, which must be called with a browse-books-use-case instance, a browse-books-request instance, and a browse-books-response instance.

(defstruct browse-books-request)

(defstruct browse-books-response
  list-of-book-summaries)

(defclass browse-books-use-case ()
  ())

(defgeneric browse-books (use-case request response))

In my implementation, the browse-books-response is a simple data structure. One could easily imagine that the browse-books method would return one rather than filling one in that was passed to it. In some variants of Clean Architecture (like the Paluch example cited above), the response model is a class instance upon which a method is called to complete the interaction. But, it would have to be clear from the outset that anyone using this Use Case cannot depend on it being synchronous or asynchronous.
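A sketch of that callback-style variant (the presenter names here are mine, not from the accompanying repository):

```lisp
;; In the callback style, the response model is an object with a
;; protocol, and the use case completes the interaction by calling a
;; method on it rather than by mutating a struct.
(defgeneric present-book-summaries (response-model list-of-book-summaries))

;; The use case would end with something like
;;   (present-book-summaries response-model summaries)
;; and the caller cannot tell whether that call happens synchronously
;; or from another thread later.
```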

The use case also defines the book-repository interface that it needs.

(defclass book-repository ()
  ())

(defgeneric find-book-by-isbn (book-repository isbn))
(defgeneric all-books (book-repository))

In the Paluch example, all of the use cases share the same repository interfaces (though Paluch and others have separate repository interfaces for Users and Invoices and Items). In several of Uncle Bob's videos, he makes the claim (or claims equivalent to the claim) that each use case should define an interface for just the methods it needs to use on a Book repository. In this use case, it would need only the ability to retrieve the list of Books and so I should not have defined find-book-by-isbn here at all, and I should have named this interface browse-books-book-repository.
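What the narrower, use-case-specific interface would look like, following the convention just described (this class is a sketch and does not appear in the accompanying repository):

```lisp
;; A repository interface defined by, and scoped to, this one use case.
;; It contains only the single operation browse-books actually needs.
(defclass browse-books-book-repository ()
  ())

(defgeneric all-books (repository))
```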

I wrote the browse-books-impl class, which extends browse-books-use-case. It takes an instance of book-repository on construction.

(defclass browse-books-impl (browse-books-use-case)
  ((book-repository :initarg :book-repository :reader book-repository)))

(defun make-browse-books-use-case (book-repository)
  (check-type book-repository book-repository)
  (make-instance 'browse-books-impl :book-repository book-repository))

It uses that to retrieve the list of Books. Then, it creates a book-summary from each book instance retrieved from the book-repository.

(defun summarize-book (book)
  (check-type book book)
  (make-book-summary :isbn (book-isbn book)
                     :title (book-title book)
                     :author (book-author book)
                     :cover-page-thumbnail (book-cover-page-thumbnail book)))

(defmethod browse-books ((use-case browse-books-impl)
                         (request browse-books-request)
                         (response browse-books-response))
  (let* ((books (all-books (book-repository use-case)))
         (summaries (mapcar #'summarize-book books)))
    (setf (browse-books-response-list-of-book-summaries response) summaries))
  response)

To test the design so far, I wrote in-memory-book-repository backend which implements the book-repository interface that was defined in the Use Case.

(defclass in-memory-book-repository (book-repository)
  ((books :initarg :books :reader books)))

(defun make-in-memory-book-repository (books)
  (check-type books list)
  (assert (every #'book-p books))
  (make-instance 'in-memory-book-repository :books books))

(defmethod all-books ((book-repository in-memory-book-repository))
  (mapcar #'copy-book (books book-repository)))

I also wrote a console frontend which invokes the browse-books-use-case.

(defun console-browse-books ()
  (let ((request (make-browse-books-request))
        (response (make-browse-books-response)))
    (browse-books *browse-books-use-case* request response)
    (mapcar #'console-print-summary
            (browse-books-response-list-of-book-summaries response))
    (values)))

...

(defun console-main-loop ()
  (catch 'console-quit
    (with-standard-io-syntax
      (loop
        :do (mapc #'console-print
                  (multiple-value-list
                   (console-eval (console-read))))))))

To tie it all together, I wrote an app-context which holds the current browse-books-use-case instance.

(defvar *browse-books-use-case*)

And, I wrote the app which creates an instance of the in-memory-book-repository and creates the browse-books-impl for the app-context. Then, it runs the main loop of the console frontend.

(defun run-console-app-with-memory-db (&optional (books *book-list*))
  (let* ((book-repo (make-in-memory-book-repository books))
         (*browse-books-use-case* (make-browse-books-use-case book-repo)))
    (console-main-loop)))

Trouble In Paradise

Already, in this simple interface, I am torn. For this Use Case, I do not need the repository to return me the list of Books. I could, instead, ask the repository to return me the list of Book Summaries. If I do that, my application is just a fig-leaf over the repository.

(defgeneric all-book-summaries (book-repository))

Well, the argument against asking the repository for Book Summaries is that it should not be up to the database to decide how I would like to have my Books summarized. That certainly seems like it should be "business logic" and probably "application specific" business logic at that.

So, fine. I will have the repository return Books and the Use Case will summarize them.

Now, let me extend the Use Case the next little bit forward. What if I want to support pagination? My choices are to push the pagination down to the repository so that I can ask it to give me up to 20 Books starting with the 40th Book. Or, I can let the repository give me all of the books and do the pagination in the Use Case.

(defstruct browse-books-request
  start
  max-results)

Here, I can find no guidance in any of the Clean Architecture videos that I have watched nor in the examples that I have found online. Everyone seems happy with the repositories being able to return one item given that item's unique identifier or return all of the items.

If the repository is going to return all of the Books, then why wouldn't my Use Case just return them all and leave the caller to do any pagination that is needed?
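If the caller (or the use case) did the pagination itself, it would only be a few lines over the full list. A sketch, where paginate is my own helper name, not something from the accompanying repository:

```lisp
(defun paginate (items start max-results)
  "Return up to MAX-RESULTS elements of ITEMS, beginning at index START.
Out-of-range values are clamped rather than signalling an error."
  (let ((len (length items)))
    (subseq items
            (min start len)
            (min (+ start max-results) len))))
```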

This works fine when there are a few dozen books and they are small. It does not scale, and I don't know how it is supposed to scale without pushing most of the responsibility onto the repository.

(defgeneric all-books-in-range (book-repository start max-results))

Sure, I can push the responsibility onto the repository. But, one of the reasons that Clean Architecture is so structured is to allow easy testing of all of the application logic. The more that I push into the repository, the less that I actually exercise when I run my unit tests with my mock repository (and the more complex my mock repository should probably be).

One possible approach would be to have the all-books method instead be all-isbns. Then, I can retrieve all of the ISBNs and use find-book-by-isbn to get all of the books.
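A sketch of that alternative interface (all-isbns does not exist in the accompanying repository):

```lisp
;; Fetch only the keys; load books individually as needed.
(defgeneric all-isbns (book-repository))

;; The use case would then do something like:
;;   (mapcar (lambda (isbn) (find-book-by-isbn repo isbn))
;;           (all-isbns repo))
;; at the cost of one repository call per book.
```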

Now, if I want to sort by Author then by Title, I need to:

Or, I have to write an SQL query that can do all of that in one database call instead of 2N + R + 1 calls (where N is the number of books in the database and R is the number of books in my range), making my Use Case a fig-leaf again.

04 Feb 2016 9:50pm GMT

Zach Beane: Lisp and WebAssembly

Douglas Crosher, who did lots of work on CMUCL and created the commercial Scieneer Common Lisp, has been trying to make WebAssembly friendlier to Lisp. It hasn't been easy. Even minor changes are getting pushed back. See his email to SBCL-devel for some of the details, and how to help.

04 Feb 2016 2:12pm GMT

03 Feb 2016


Luís Oliveira: slime-macrostep

SLIME's just got a new contrib called slime-macrostep, courtesy of Jon Oddie who also wrote the underlying macrostep package.

And what is slime-macrostep? It's an interactive inline macro-expander. Have a look:


In this quick demo, using a CFFI example, I start by expanding the top-level WITH-FOREIGN-OBJECT form, then I expand the WITH-ALIEN form, but regret it and collapse it back. Then I proceed to expand everything else, including compiler macros!

A nice feature of slime-macrostep is that the expansions are annotated to show which forms are further expandable, and macrostep-expand will jump automatically to the next expandable form. That's what makes it a stepper: pressing e repeatedly will step through the macroexpansion. Plus, it expands macrolets!

If you'd like to try it out, please grab SLIME's bleeding edge (via MELPA or Git). It's enabled by default if you use the slime-fancy meta contrib. Feedback is most welcome.

03 Feb 2016 12:51am GMT

31 Jan 2016


Joe Marshall: Race results are in

Some people wanted to compare machines, so here is the exact code I used and some sample values I got from running it on my laptop. I'm curious what values other people get.

;;; -*- Mode: scheme; coding:us-ascii -*-

;;; This is code for MIT/GNU Scheme.  We attempt to measure the speed of ASSQ.
;;; The code avoids obvious abstractions because we don't want to
;;; measure the speed of function calling.

;; Inline the standard scheme primitives.
(declare (usual-integrations))

;; We change the size and alignment of the heap by consing a small
;; data structure to this list.
(define *change-heap-size-and-alignment* '())

;;; Creates a test alist that looks like this:
;;; ((13 . "13")
;;;  (23 . "23")
;;;  (25 . "25")
;;;  (18 . "18")
;;;  (0 . "0")
;;;  (19 . "19")
;;;  (5 . "5")
;;;  (4 . "4")
;;;  (6 . "6")
;;;    ...
;;; 
(define (make-test-alist n-elements)
  (let ((alist 
         (fold-left (lambda (alist number)
                      (alist-cons number (number->string number)
                                  alist))
                    '()
                    ;; shuffle the numbers from 1 to n
                    (map car
                         (sort   
                          (map (lambda (n)
                                 (cons n (random 1.0)))
                               (iota n-elements))
                          (lambda (l r) (< (cdr l) (cdr r))))))))
    (gc-flip)
    (set! *change-heap-size-and-alignment* 
          (cons (vector #f) *change-heap-size-and-alignment*))
    alist))

;;; Creates an alist of <size> entries and then measures the time to
;;; perform n-lookups on it.  Specialized to fixnum-only arithmetic.

(define (time-alist size n-lookups)
  (let ((test-alist (make-test-alist size)))
    (show-time
     (lambda ()
       (do ((i   0 (fix:+ i 1))
            (idx 0 (if (fix:> idx size)
                       0
                       (fix:+ idx 1)))
            (answer '() (assq idx test-alist)))
           ((fix:>= i n-lookups) answer))))))

Here's a sample run on my laptop. We make an alist with 10 elements and call ASSQ 100,000,000 times on it, fetching each entry about the same number of times.

1 ]=> (do ((i 0 (+ i 1))) ((not (< i 10))) (time-alist 10 100000000))

;process time: 2260 (2260 RUN + 0 GC); real time: 2259
;process time: 2260 (2260 RUN + 0 GC); real time: 2265
;process time: 2290 (2290 RUN + 0 GC); real time: 2291
;process time: 2250 (2250 RUN + 0 GC); real time: 2247
;process time: 2260 (2260 RUN + 0 GC); real time: 2259
;process time: 2240 (2240 RUN + 0 GC); real time: 2240
;process time: 2240 (2240 RUN + 0 GC); real time: 2243
;process time: 2250 (2250 RUN + 0 GC); real time: 2258
;process time: 2240 (2240 RUN + 0 GC); real time: 2247
;process time: 2250 (2250 RUN + 0 GC); real time: 2250

Process time is reported in milliseconds, so it took about 2.26 seconds to do 100,000,000 lookups. This divides out to .0000000225 seconds = 22.5 nanoseconds per lookup.

It should be about linear with the size of the list, so we'd expect a 100 element list to take somewhere around 225 nanoseconds.

1 ]=> (do ((i 0 (+ i 1))) ((not (< i 10))) (time-alist 100 100000000))

;process time: 20720 (20720 RUN + 0 GC); real time: 20753
;process time: 20700 (20700 RUN + 0 GC); real time: 20733
;process time: 20640 (20640 RUN + 0 GC); real time: 20671
;process time: 20690 (20690 RUN + 0 GC); real time: 20695
;process time: 20670 (20670 RUN + 0 GC); real time: 20690
;process time: 21010 (21010 RUN + 0 GC); real time: 21026
;process time: 20800 (20800 RUN + 0 GC); real time: 20832
;process time: 20760 (20760 RUN + 0 GC); real time: 20747
;process time: 20710 (20710 RUN + 0 GC); real time: 20702
;process time: 20690 (20690 RUN + 0 GC); real time: 20700
;Value: #t

Testing a hash table:

(define (make-test-hash-table n-entries)
  (alist->hash-table (make-test-alist n-entries)))

;;; Creates a hash-table of <size> entries and then measures the time to
;;; perform n-lookups on it.  Specialized to fixnum-only arithmetic.

(define (time-hash-table size n-lookups)
  (let ((test-hash-table (make-test-hash-table size)))
    (show-time
     (lambda ()
       (do ((i   0 (fix:+ i 1))
            (idx 0 (if (fix:> idx size)
                       0
                       (fix:+ idx 1)))
            (answer '() (hash-table/get test-hash-table idx #f)))
           ((fix:>= i n-lookups) answer))))))

Whether you put 10 elements or a thousand in a hash table, it takes a constant amount of time to look things up:

1 ]=> (do ((i 0 (+ i 1))) ((not (< i 10))) (time-hash-table 10 100000000))

;process time: 8320 (8320 RUN + 0 GC); real time: 8321
;process time: 8300 (8300 RUN + 0 GC); real time: 8304
;process time: 8420 (8420 RUN + 0 GC); real time: 8419
;process time: 8280 (8280 RUN + 0 GC); real time: 8304
;process time: 8380 (8380 RUN + 0 GC); real time: 8387
;process time: 8280 (8280 RUN + 0 GC); real time: 8288
;process time: 8320 (8320 RUN + 0 GC); real time: 8311
;process time: 8330 (8330 RUN + 0 GC); real time: 8327
;process time: 8290 (8290 RUN + 0 GC); real time: 8290
;process time: 8310 (8310 RUN + 0 GC); real time: 8307
;Value: #t

1 ]=> (do ((i 0 (+ i 1))) ((not (< i 10))) (time-hash-table 1000 100000000))

;process time: 8400 (8400 RUN + 0 GC); real time: 8403
;process time: 8550 (8550 RUN + 0 GC); real time: 8553
;process time: 8620 (8620 RUN + 0 GC); real time: 8639
;process time: 8420 (8420 RUN + 0 GC); real time: 8435
;process time: 8400 (8400 RUN + 0 GC); real time: 8425
;process time: 8460 (8460 RUN + 0 GC); real time: 8455
;process time: 8460 (8460 RUN + 0 GC); real time: 8459
;process time: 8480 (8480 RUN + 0 GC); real time: 8486
;process time: 8500 (8500 RUN + 0 GC); real time: 8502
;process time: 8520 (8520 RUN + 0 GC); real time: 8518
;Value: #t

Testing an rb-tree:

(define (make-test-rb-tree n-entries)
  (alist->rb-tree (make-test-alist n-entries) fix:= fix:<))

;;; Creates a rb-tree of <size> entries and then measures the time to
;;; perform n-lookups on it.  Specialized to fixnum-only arithmetic.

(define (time-rb-tree size n-lookups)
  (let ((test-rb-tree (make-test-rb-tree size)))
    (show-time
     (lambda ()
       (do ((i   0 (fix:+ i 1))
            (idx 0 (if (fix:> idx size)
                       0
                       (fix:+ idx 1)))
            (answer '() (rb-tree/lookup test-rb-tree idx #f)))
           ((fix:>= i n-lookups) answer))))))

1 ]=> (do ((i 0 (+ i 1))) ((not (< i 10))) (time-rb-tree 10 100000000))

;process time: 3910 (3910 RUN + 0 GC); real time: 3908
;process time: 3810 (3810 RUN + 0 GC); real time: 3805
;process time: 4090 (4090 RUN + 0 GC); real time: 4090
;process time: 3970 (3970 RUN + 0 GC); real time: 3967
;process time: 4060 (4060 RUN + 0 GC); real time: 4051
;process time: 3980 (3980 RUN + 0 GC); real time: 3979
;process time: 4040 (4040 RUN + 0 GC); real time: 4040
;process time: 4090 (4090 RUN + 0 GC); real time: 4094
;process time: 3810 (3810 RUN + 0 GC); real time: 3810
;process time: 4090 (4090 RUN + 0 GC); real time: 4092
;Value: #t

1 ]=> (do ((i 0 (+ i 1))) ((not (< i 10))) (time-rb-tree 100 100000000))

;process time: 7700 (7700 RUN + 0 GC); real time: 7720
;process time: 7760 (7760 RUN + 0 GC); real time: 7767
;process time: 7700 (7700 RUN + 0 GC); real time: 7710
;process time: 7890 (7890 RUN + 0 GC); real time: 7893
;process time: 7920 (7920 RUN + 0 GC); real time: 7914
;process time: 7650 (7650 RUN + 0 GC); real time: 7646
;process time: 7740 (7740 RUN + 0 GC); real time: 7738
;process time: 7760 (7760 RUN + 0 GC); real time: 7761
;process time: 7670 (7670 RUN + 0 GC); real time: 7671
;process time: 8140 (8140 RUN + 0 GC); real time: 8136
;Value: #t

1 ]=> (do ((i 0 (+ i 1))) ((not (< i 10))) (time-rb-tree 1000 100000000))

;process time: 13130 (13130 RUN + 0 GC); real time: 13124
;process time: 13160 (13160 RUN + 0 GC); real time: 13153
;process time: 13150 (13150 RUN + 0 GC); real time: 13149
;process time: 13140 (13140 RUN + 0 GC); real time: 13140
;process time: 13310 (13310 RUN + 0 GC); real time: 13304
;process time: 13170 (13170 RUN + 0 GC); real time: 13172
;process time: 13140 (13140 RUN + 0 GC); real time: 13167
;process time: 13250 (13250 RUN + 0 GC); real time: 13238
;process time: 13300 (13300 RUN + 0 GC); real time: 13318
;process time: 13420 (13420 RUN + 0 GC); real time: 13416
;Value: #t

And wt-trees while we're at it:

(define (make-test-wt-tree n-elements)
  (alist->wt-tree number-wt-type (make-test-alist n-elements)))

(define (time-wt-tree size n-lookups)
  (let ((test-wt-tree (make-test-wt-tree size)))
    (show-time
     (lambda ()
       (do ((i   0 (fix:+ i 1))
            (idx 0 (if (fix:> idx size)
                       0
                       (fix:+ idx 1)))
            (answer '() (wt-tree/lookup test-wt-tree idx #f)))
           ((fix:>= i n-lookups) answer))))))

1 ]=> (do ((i 0 (+ i 1))) ((not (< i 10))) (time-wt-tree 100 100000000))

;process time: 6400 (6400 RUN + 0 GC); real time: 6397
;process time: 6740 (6740 RUN + 0 GC); real time: 6736
;process time: 6760 (6760 RUN + 0 GC); real time: 6763
;process time: 6070 (6070 RUN + 0 GC); real time: 6068
;process time: 6450 (6450 RUN + 0 GC); real time: 6461
;process time: 6800 (6800 RUN + 0 GC); real time: 6812
;process time: 6330 (6330 RUN + 0 GC); real time: 6346
;process time: 6060 (6060 RUN + 0 GC); real time: 6066
;process time: 6050 (6050 RUN + 0 GC); real time: 6039
;process time: 6300 (6300 RUN + 0 GC); real time: 6303
;Value: #t


31 Jan 2016 5:28am GMT

30 Jan 2016


Joe Marshall: Alist vs. hash-table

An alist is a simple data structure that holds key-value pairs in a linked list. When a key is looked up, the list is searched to find it. The time it takes is proportional to the length of the list, or the number of entries.

A hash-table is a more complex data structure that holds key-value pairs in a set of "hash buckets". When a key is looked up, it is first "hashed" to find the correct bucket, then that bucket is searched for the entry. The time it takes depends on a number of things, the hash algorithm, the number of buckets, the number of entries in the bucket, etc. A hash-table can be faster than an alist because the hashing step is quick and the subsequent search step will have very few entries to search.

In theory, an alist takes time proportional to the number of entries, but a hash-table takes constant time independent of the number of entries. Let's find out if this is true for MIT/GNU Scheme.

I wrote a little program that measures how long it takes to look things up in an alist vs. a hash table. Here's what I measured:

It does indeed seem that alists are linear and hash tables are constant in lookup time. But the hashing step of a hash table does take time, so short alists end up being faster than hash tables. The break-even point looks like a tad over 25 elements. So if you expect 25 or fewer entries, an alist will perform better than a hash table. (Of course, different implementations will have different break-even points.)
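
The shape of that experiment can be sketched in a few lines of Python (a stand-in for the Scheme program, with the alist modeled as a list of pairs and the hash table as a dict; the break-even point measured this way will of course differ from MIT/GNU Scheme's):

```python
import time

def alist_lookup(alist, key):
    # Linear search down a list of (key, value) pairs, like assoc.
    for k, v in alist:
        if k == key:
            return v
    return None

def time_lookups(lookup, table, size, n_lookups=100_000):
    # Cycle through every key so hits land all over the structure.
    start = time.perf_counter()
    for i in range(n_lookups):
        lookup(table, i % size)
    return time.perf_counter() - start

for size in (5, 25, 100):
    pairs = [(i, i) for i in range(size)]
    table = dict(pairs)
    t_alist = time_lookups(alist_lookup, pairs, size)
    t_hash = time_lookups(lambda d, k: d.get(k), table, size)
    print(f"{size:4d} entries: alist/hash time ratio = {t_alist / t_hash:.1f}")
```

The ratio grows with size because the alist's cost is linear while the dict's is roughly constant.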

A tree data structure is slightly more complex than an alist, but simpler than a hash table. Looking up an entry in a tree takes time proportional to the logarithm of the number of entries. The logarithm function grows quite slowly, so a tree performs pretty well over a very large range of entries. A tree is slower than an alist until you have about 15 entries. At this point, the linear search of an alist cannot compete with the logarithmic search of a tree. The time it takes to search a tree grows, but quite slowly. It takes more than 100 entries before a tree becomes as slow as a hash table.

With a big enough tree, the growth is so slow that you can pretend it is constant.
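
To see how gentle that growth is, compare probe counts directly. A binary search over a sorted vector (Python's bisect, standing in here for a balanced tree) needs only about log2(n) comparisons:

```python
import math
from bisect import bisect_left

def tree_like_lookup(keys, values, key):
    # Binary search over sorted keys: ~log2(n) probes, the same
    # growth as a lookup in a balanced tree.
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return None

keys = list(range(1_000_000))
values = [2 * k for k in keys]
print(tree_like_lookup(keys, values, 123_456))   # 246912
# Worst case for a million entries: only about 20 comparisons.
print(math.ceil(math.log2(len(keys) + 1)))       # 20
```

A linear alist search over the same million entries would average half a million comparisons, which is why the logarithm looks flat by comparison.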

30 Jan 2016 5:08am GMT

29 Jan 2016


drmeister: Improving the Clasp programming experience

I've been working quietly to improve the Clasp programming experience by implementing a C++ scraper for Clasp that extracts all information from the C++ code required to expose that code to Common Lisp.

This means functions, classes, methods, symbols, enums, and initialization functions are identified by the scraper, and code is automatically generated to expose these entities to the Common Lisp programming environment. This replaces thousands of hand-coded calls, made at startup, that exposed C++ functionality to Common Lisp. To expose a C++ function now, all you do is some variation on this:

namespace ext {
CL_LAMBDA(x y &optional (z 0));
CL_DOCSTRING(R"doc(* Arguments
- x :: A fixnum
- y :: A fixnum
- z :: A fixnum
* Description
Add up to three fixnums together as long as they fit into an int.
If they don't, an error will be signaled.)doc");
CL_DEFUN int add_numbers(int x, int y, int z) {
  return x + y + z;
}
}

The CL_LAMBDA(…), CL_DOCSTRING(…), and CL_DEFUN tags are read from the C++ source code by the scraper and code is generated to enable things like (ext:add-numbers 4 5) to be called from Common Lisp. Above is a complicated example using the optional CL_LAMBDA(…) and CL_DOCSTRING(…) macros - something as simple as below is all that is necessary to bind the C++ function to the ext:add-two-numbers symbol:

CL_DEFUN int ext__add_two_numbers(int x, int y) {return x+y;}

The C-preprocessor is run on all of the source code with macros defined for tags like CL_DEFUN, CL_DEFMETHOD, CL_DOCSTRING(…), LISP_CLASS(…) etc. prior to the scraper searching for the tags. The scraper then generates code that is run at startup to expose everything. The scraper is run every time Clasp is built but it's smart about only rebuilding generated code that needs to be rebuilt.
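
The general idea of such a scraper is easy to sketch. Here is a toy version in Python (the tag regex, the name-mangling convention, and the generated expose(...) calls are invented for illustration; this is not Clasp's actual scraper or output format):

```python
import re

# Hypothetical input: preprocessed C++ source carrying CL_DEFUN tags.
CPP_SOURCE = """
CL_DEFUN int ext__add_two_numbers(int x, int y) { return x + y; }
CL_DEFUN int core__negate(int x) { return -x; }
"""

# Match "CL_DEFUN <return-type> <pkg>__<name>(" and capture the two name parts.
CL_DEFUN_RE = re.compile(r"CL_DEFUN\s+\w+\s+(\w+)__(\w+)\s*\(")

def scrape(source):
    # Map C++ names like ext__add_two_numbers to symbols like ext:add-two-numbers.
    bindings = []
    for pkg, name in CL_DEFUN_RE.findall(source):
        lisp_name = f"{pkg}:{name.replace('_', '-')}"
        bindings.append((f"{pkg}__{name}", lisp_name))
    return bindings

# Emit the registration calls that would run at startup.
for cpp_name, lisp_name in scrape(CPP_SOURCE):
    print(f'expose("{lisp_name}", &{cpp_name});')
```

The real scraper additionally records source locations, lambda lists, and docstrings, but the scan-and-generate shape is the same.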

There are several benefits to this: (1) adding functions, classes, methods, etc. to Clasp Common Lisp is easier now, because you just add the entity and no longer need to worry about where to add the startup call that exposes it. (2) The scraper builds a database of source locations for all of these entities. So now, from Slime, you can hit M-. on a Clasp symbol and emacs will jump to its definition, whether that is in the C++ source or the Common Lisp source.

At the same time I've added code to extract and generate lambda-lists and docstrings for C++ functions and methods and they are also available now within the Common Lisp environment.

The scraper is written in standard Common Lisp, and SBCL has been added as a build dependency for Clasp. The Python scrapers Clasp used up to this point have all been retired, and Python is no longer a dependency for Clasp.


29 Jan 2016 4:46pm GMT

28 Jan 2016


Joe Marshall: Latest amusement

Here's a procedure that CDRs down a list until it hits the end:

(define (last-cdr list)
  (let loop ((tail list))
    (if (pair? tail)
        (loop (cdr tail))
        tail)))

The inner loop of the procedure compiles to this:

loop-3:
 (cmp q (r 7) (@r 6))         ;; Interrupt check
 (jge (@pcr interrupt-12))

 (mov q (r 0) (@r 4))         ;; Fetch 'tail'

 (mov q (r 1) (r 0))          ;; Extract the tag
 (shr q (r 1) (&u #x3a))

 (cmp b (r 1) (&u 1))          ;; Check for pair?, jump if not
 (jne (@pcr label-14))

 (and q (r 0) (r 5))          ;; Mask the tag

 (mov q (r 1) (@ro 0 8))      ;; Fetch the CDR

 (mov q (@r 4) (r 1))         ;; Assign 'tail'

 (jmp (@pcr loop-3))          ;; Do it again.

On my laptop, each iteration of this loop takes a few nanoseconds.

I noticed that there seems to be a lot more going on than CDRing down a list. The interrupt check is responsible for a lot of the overhead. MIT/GNU Scheme polls for interrupts. The compiler inserts an interrupt check at every procedure entry and backwards branch. This ensures that only a small, bounded number of instructions can run between interrupt polls. The interrupt poll and the heap check are simultaneous because of a hack. To interrupt Scheme, you save the "Top of Heap" pointer and set it to zero. This causes the heap check to fail as if there were an out of memory condition. The out of memory handler first checks to see if this was actually an interrupt masquerading as an out of memory, and restores the heap pointer if so.
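
That masquerade can be modeled in a few lines (a toy model in Python, not MIT Scheme's actual runtime; in the real system the interrupted operation is retried via a pushed continuation rather than simply skipped):

```python
class ToyRuntime:
    # Toy model of the trick: to interrupt the mutator, save the real
    # heap limit and set it to zero so the next heap check "fails".
    def __init__(self, heap_size):
        self.free = 0
        self.memtop = heap_size
        self.saved_memtop = None
        self.pending_interrupt = None

    def request_interrupt(self, name):
        self.saved_memtop = self.memtop
        self.memtop = 0          # force the next heap check to fail
        self.pending_interrupt = name

    def allocate(self, n):
        # The single compare doubles as heap check and interrupt poll.
        if self.free + n >= self.memtop:
            return self.out_of_memory_handler(n)
        self.free += n
        return "allocated"

    def out_of_memory_handler(self, n):
        if self.pending_interrupt is not None:
            # Not really out of memory: restore the heap limit
            # and handle the interrupt instead.
            self.memtop = self.saved_memtop
            handled, self.pending_interrupt = self.pending_interrupt, None
            return f"handled interrupt: {handled}"
        return "really out of memory"

rt = ToyRuntime(heap_size=1000)
print(rt.allocate(10))           # allocated
rt.request_interrupt("timer")
print(rt.allocate(10))           # handled interrupt: timer
print(rt.allocate(10))           # allocated
```

The payoff is that the hot path pays for only one compare-and-branch, not separate heap and interrupt checks.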

The two instructions,

(cmp q (r 7) (@r 6))         ;; Interrupt check
 (jge (@pcr interrupt-12))

perform the check. The first compares register 7 with the contents of memory that register 6 points at. The convention for the compiler is that register 7 contains the free pointer and register 6 holds the address of MEMTOP. The interrupt check does a memory cycle.

The reason you poll for interrupts is that you need the virtual machine to be in a coherent state, because the interrupt could attempt to run arbitrary code. The garbage collector in particular has to be able to parse the machine state at an interrupt. Interrupts are polled at apply time. Before application, the interrupts are checked. If an interrupt is to be processed, a continuation is pushed that will do the original application, and an interrupt handler is applied instead.

In MIT-Scheme, the virtual machine is a stack machine, and at apply time the entire state of the virtual machine is pointed to by the stack pointer. This is a good state to be in when you handle interrupts or garbage collect. But this means that arguments are passed on the stack. This instruction:

(mov q (r 0) (@r 4))

loads register 0 with the contents of memory at register 4. (The compiler convention is that register 4 is the stack pointer.) In other words, it is fetching the value of the argument tail from the stack. This is the second memory cycle.

Between interrupt polls, the compiled code can freely manipulate Scheme objects without worrying about being embarrassed by the garbage collector. But at apply time, when a possible interrupt poll could happen, the compiled code must put the virtual machine back into a coherent state. In particular, modified values are stored back on the stack. This instruction just before the jump puts the value of tail back on the stack before jumping to the top of the loop.

(mov q (@r 4) (r 1)) 

That's three memory cycles in addition to the actual fetching of the CDR in this instruction:

(mov q (r 1) (@ro 0 8))

Of course these are quite inexpensive memory cycles, because the top of stack is cached, but there is at least the time it takes to validate the cache entry.

The interrupt poll occurs every time around the loop, so we copy the arguments back and forth between the stack and registers on each iteration. If we unroll the loop we can avoid some of the copying:

(define (last-cdr list)
  (let loop ((tail list))
    (if (pair? tail)
        (loop (cdr tail))
        tail)))

(define (last-cdr2 list)
  (let loop ((tail list))
    (if (not (pair? tail))
        tail
        (let ((tail (cdr tail)))
          (if (not (pair? tail))
              tail
              (loop (cdr tail)))))))

The loop in last-cdr2 compiles to this:

loop-6:
 (cmp q (r 7) (@r 6))           ;; Interrupt check
 (jge (@pcr interrupt-15))  

 (mov q (r 0) (@r 4))           ;; Fetch 'tail'

 (mov q (r 1) (r 0))            ;; Extract the tag
 (shr q (r 1) (&u #x3a))

 (cmp b (r 1) (&u 1))            ;; Check for pair?
 (jne (@pcr label-17))

 (and q (r 0) (r 5))            ;; Mask the tag

 (mov q (r 1) (@ro 0 8))        ;; Fetch the CDR

 (mov q (@r 4) (r 1))           ;; Assign 'tail'

 (mov q (r 0) (r 1))
 (shr q (r 0) (&u #x3a))         ;; Extract the tag

 (cmp b (r 0) (&u 1))            ;; Check for pair?
 (jne (@pcr label-19))

 (and q (r 1) (r 5))            ;; Mask the tag

 (mov q (r 0) (@ro 1 8))        ;; Fetch the CDR

 (mov q (@r 4) (r 0))           ;; Assign 'tail'

 (jmp (@pcr loop-6))            ;; Do it again

If I count correctly, there are six memory cycles per iteration, two of which are CDRs. We also avoid a single jmp instruction per iteration.

Here's the timing on my machine:

(test-last-cdr)
3.643 nanoseconds per cdr
3.643 nanoseconds per cdr
3.711 nanoseconds per cdr
3.643 nanoseconds per cdr
3.576 nanoseconds per cdr

(test-last-cdr2)
2.766 nanoseconds per cdr
2.699 nanoseconds per cdr
2.834 nanoseconds per cdr
2.699 nanoseconds per cdr



28 Jan 2016 5:29am GMT

26 Jan 2016


Michael Forster: Hunchentools

I've published Hunchentools, my utility library for Hunchentoot. Hunchentools is not yet available for download via Quicklisp.

Hunchentools includes functions to streamline aborting of request handling in common situations, functions to create dispatchers contingent upon request methods, functions for escaping strings, and various functions related to session security and authentication.

External documentation and examples are lacking at this point, but all exported symbols have associated documentation strings.

26 Jan 2016 11:30pm GMT