16 Apr 2018

feedPlanet Lisp

Lispjobs: Senior Lisp Developer, RavenPack, Marabella, Spain

See: https://www.ravenpack.com/careers/common-lisp-developer

As a Common Lisp Developer, you will be part of the Analytics team which is in charge of the development and maintenance of applications that, among other things, extract data from incoming news and deliver user and machine-friendly analytics to customers.

You will be reporting directly to the Analytics Manager and will work with an international team of developers skilled in Common Lisp, Java, Python and SQL.

The ability to communicate effectively in English both in writing and verbally is a must. Knowledge of Spanish is not a business requirement. European Union legal working status is required. Competitive compensation and a fun working environment. Relocation assistance is available, but remote working is not a possibility for this position.

16 Apr 2018 4:57pm GMT

Lispjobs: Junior Lisp Developer, RavenPack, Marbella, Spain

See: https://www.ravenpack.com/careers/junior-common-lisp-developer

At RavenPack we are searching for a Junior Common Lisp Developer to join RavenPack's Development Team.

As a Junior Common Lisp Developer, you will be part of the Analytics team which is in charge of the development and maintenance of applications that, among other things, extract data from incoming news and delivers machine-friendly analytics to customers.

You will be reporting directly to the Analytics Manager and will work with an international team of developers skilled in Common Lisp, Java, Python and SQL.

The ability to communicate effectively in English both in writing and verbally is a must. Knowledge of Spanish is not a business requirement. European Union legal working status is required. Competitive compensation and a fun working environment. Relocation assistance available, but working remotely is not possible for this position.

16 Apr 2018 4:56pm GMT

02 Apr 2018

feedPlanet Lisp

Luís Oliveira: Reddit 1.0

As many Lisp programmers may remember, Reddit was initially written in Common Lisp. Then, in late 2005, it was rewritten in Python. Much discussion ensued. Steve Huffman (aka spez, current CEO of Reddit) wrote about why they switched to Python. The late Aaron Swartz also wrote Rewriting Reddit, focusing on the Python side of things.

Last week, that original, Common Lisp version of Reddit was published on GitHub: reddit-archive/reddit1.0. There's no documentation, but we know from the articles above that it ran on CMUCL. reddit.asd is perhaps a good entry point, where we can see it used TBNL, CL-WHO and CLSQL, a common toolset at the time.

It's missing various bits of static HTML, CSS and JS, so I don't think you can actually run this website without a fair bit of scrapping and stitching from the Wayback Machine. The code is quite readable and clean. It's not terribly interesting, since it's a fairly simple website. Still, it's a website that grew to become the 6th most visited worldwide (with 542 million monthly visitors) and now has 230 employees, says Wikipedia, so it's nice to see what it looked like at the very beginning.

02 Apr 2018 1:03pm GMT

28 Mar 2018

feedPlanet Lisp

Quicklisp news: Quicklisp dist update for March 2018 now available

New projects:

Updated projects: array-operations, cells, cepl, cepl.spaces, checkl, chipz, cl+ssl, cl-algebraic-data-type, cl-bplustree, cl-colors, cl-dbi, cl-feedparser, cl-fond, cl-hamcrest, cl-ledger, cl-libuv, cl-marshal, cl-mpi, cl-online-learning, cl-opengl, cl-openstack-client, cl-protobufs, cl-random, cl-random-forest, cl-rmath, cl-scsu, cl-selenium-webdriver, cl-store, cl-toml, cl-unicode, clack, closer-mop, computable-reals, croatoan, dbd-oracle, dexador, easy-audio, eazy-gnuplot, esrap, extended-reals, fare-memoization, femlisp, flexi-streams, folio2, harmony, humbler, lass, latex-table, lispbuilder, listopia, lla, lucerne, maiden, matlisp, mcclim, mito, nineveh, omer-count, opticl, overlord, parachute, parser.common-rules, parser.ini, postmodern, qlot, random-state, rtg-math, scalpl, serapeum, shadow, staple, stumpwm, trivial-gray-streams, trivialib.bdd, utils-kt, varjo.

Removed projects: cl-cudd, lisp-interface-library, plain-odbc, quux-hunchentoot.

The removed projects no longer build.

To get this update, use (ql:update-dist "quicklisp").


28 Mar 2018 2:25pm GMT

11 Mar 2018

feedPlanet Lisp

Nicolas Hafner: Geometry Clipmaps - Gamedev

A while back I attempted to implement a technique called Geometry Clipmaps in Trial. This technique is used to stream in terrain data for scenery where the terrain is far too large to be stored in memory all at once, let alone be rendered at full resolution. In this entry I'll go over the idea of the technique and the problems I had implementing it.

First, the technique is described in two papers: Geometry Clipmaps: Terrain Rendering Using Nested Regular Grids by Hoppe and Losasso [1], and Terrain Rendering Using GPU-Based Geometry Clipmaps by Hoppe and Asirvatham [2]. Neither of the papers offer any public source code, or much of any code at all to describe the details of the technique, especially in relation to the update mechanics. This is a problem, because a lot of the details of the papers are very strange indeed.

Before I get into that though, let me summarise the idea of the technique, which is fairly simple: since we cannot draw the full terrain, we need to reduce the detail somehow. Luckily for us, how much detail we can actually see in a rendered scene decreases with distance thanks to perspective skewing. Thus, the idea is that, the farther away the terrain geometry is from the camera, the coarser we render it.

To do this we create a square ring, whose inner sides are half the size of its outer sides. This gives us a power of two scaling, where we can simply render the same ring within itself over and over, scaling it down by two every time. Once we have sufficiently many rings, we simply fill the hole with a remaining square grid. Here's an image to illustrate the idea:


As you can see, the detail gets exponentially more coarse the farther away from the centre you go. If we look at this from the perspective of a camera that is properly centred, we can see the effect even better:


Despite the geometry becoming more coarse, it takes up a similar amount of actual pixels on screen. Now, the geometry I used here is rather simple. In the papers, they use a much more complex layout in order to save on vertices even more aggressively:


This, especially the interior trim, makes things a lot more painful to deal with. I'm quite sure the trim is a necessary evil in their scheme, as the ring is no longer a power of two reduction, but they still need the vertices to align properly. I can see no other reason for the trim to exist. Worse than that though, I can absolutely not understand why they specify four variants of the trim, one for each corner, when the largest trim is going to offset the rest of the geometry so much that even if every other ring uses the opposite trim, you cannot correct the shift enough to make things exactly centre again. If someone understands this, please let me know.

Anyway, now that we have these rings established, we need to look at how to actually render geometry with them. The idea here is to use height maps and dynamically shift the vertices' height in the vertex shader according to the brightness of the corresponding pixel in the height map. Since each ring has the same vertex resolution, we can use a fixed resolution texture for each level as well, saving on texture memory.

Let's look at an example for that. In order to generate the textures, we need a base heightmap at a resolution of (n*2^(r-1))^2, where n is the resolution of a clipmap, and r is the number of rings. In this example I used a clipmap resolution of 64 and 4 levels, meaning I needed a base heightmap resolution of 512^2. For the innermost level you simply crop out the center at the clipmap resolution. For the next level you crop out twice as much, and then scale to half, and so on.


The above are scaled up again using nearest-neighbour to make it better visible to the human eye. Displacing the rings by each corresponding map results in the following image. I've increased the clipmap resolution to 64 to match that of the textures.


Looks pretty good, eh? You might notice some issues arising on the edges between clipmap levels though. Namely, some vertices don't match up quite exactly. This is due to smoothing. Since each clipmap level has the same resolution, some of the pixels at the border are going to be too precise. In order to fix this we need to blend vertices as they get closer to the border. To do this we look at the distance of the vertex to the center and, depending on its magnitude, linearly blend between the current level and the next outer one. This means that for each clipmap level we draw, we need to have access to both the current level's heightmap and the next level's.

In the original paper they do this with an insane trick: they encode each heightmap value as a float and stuff the current level's height into the float's integer part, and the next level's into the fractional part. I don't know how this could ever be a good idea given floating point imprecision. I could understand stuffing both numbers into a single int, but this is crazy.

The way I do this in my current approach is different. I represent all the clipmap levels as a single 2D array texture, where each level in the texture represents one clipmap level. We can then easily access two levels, at the cost of two texture lookups of course. I could also see the encoding strategy working by limiting the depth to 16 bit and using 32 bit integer textures to stuff both levels into one. For now though, I'm not that concerned about performance, so optimisations like that seem folly.

Now, keep in mind that in order to figure out the proper smoothing, we need to get the exact position of the vertex of the inner clipmap in the texture of the outer clipmap. This is not that big of a deal if your clipmaps are perfectly aligned and centred, as you can simply scale by 0.5. However, if you implement the clipmap geometry like they suggest in the paper, with the weird trim and all, now suddenly your geometry is slightly shifted. This caused me no end of grief and I was never able to get things to match up perfectly. I don't know why, but I suspect it's something about not doing the calculation quite exactly right, or something about the interpolation causing slight errors that nevertheless remain visible.

I might rewrite the clipmaps using perfectly centred rings as I outlined above, just to see whether I can get it working perfectly, as I really cannot see that huge of an advantage in their approach. To reiterate, the reason why they do it they way they do is, according to my understanding, to find a good balance between number of vertices to keep in memory, and number of draw calls to issue to draw a single geometry layer. If I used perfectly centred rings I would have to draw either 12 smallish square blocks, or one big ring. This is opposed to their technique, which requires 14 draw calls and three different sets of geometry.             Wait what? If that makes you confused, I'm with you. I don't understand it either.

Either way, once you have the interpolation between levels going you're almost there. You can now draw a static terrain with very low computational effort using geometry clipmaps. However, usually you want to have things move around, which means you need to update the clipmaps. And this is where I stopped, as I could not figure out what exactly to do, and could not find any satisfactory information about the topic anywhere else either.

First of all, in the papers they only use a map for the coarsest level layer, and predict the finer levels based on interpolation with a residual. This has two consequences: one, you need to perform a costly ahead of time calculation to compute the residual for the entire terrain. Two, I do not believe that this interpolatory scheme will give you sufficient amount of control over details in the map. It will allow you to save texture space, by only requiring the coarsest level plus a single residual layer, but again I do not believe that such extreme optimisations are necessary.

Second, they use a format called PCT (a precursor of JPEG, I believe) which allows region of interest decoding from a file, meaning you can only read out the parts you really need in order to update the clipmap textures. This is nice, but I'm not aware of any CL library that supports this format, or supports JPEG and allows region of interest decoding.

In order to explain the third problem, let's first look at what happens when we need to update the terrain. Say we have a clipmap resolution of 1cm, meaning each grid cell is 1cm long at the finest level. The player now moves by +3cm in x. This means we need to shift the innermost layer by -3cm in x direction and load in the new data for that into the texture. We also need to update the next layer, as its resolution is 2cm, and we have thus stepped into the next grid cell for it as well. We don't need to step any of the other layers, as they are 4cm, and 8cm resolution respectively, and are too coarse for an update yet. This is fine because they are farther and farther away anyhow and we don't notice the lack of detail change. The edge interpolation helps us out here too as it bridges over the gaps between the layers even when they don't match up exactly.

Now, the way they do this updating in the papers is by drawing into the clipmap texture. They keep a lookup position for the texture which keeps up with the camera position in the game world. When the camera moves they fill in the new detail ahead of the camera's movement direction by drawing two quads. The magic of this idea is that texture lookup happens with the repeat border constraint, meaning that if you go beyond the standard texture boundary, you wrap around to the other end and continue on. Thus you only ever need to update a tiny region of the texture map for the data that actually changed, without having to shift the rest of the data around in the texture. Since this is probably quite hard to imagine, here's another picture from their paper that illustrates the idea:


The region "behind" the camera is updated, but because how the map is addressed happens through wraparound, it appears in front of it. Still, this is quite complicated, especially when you keep in mind that each level is, according to the paper's geometry, shifted by the trim and all. I'm also wondering whether, with my array texture technique, it would be faster to simply replace an entire clipmap texture level and save yourself the cost of having to first upload the two squares to textures, and then perform a draw call into the proper clipmap texture to update it. Since you know that each level has a fixed size, you could even allocate all the storage ahead of time and use a file format that could be directly memory mapped in or something like that, and then just use glSubImage2D to upload it all at once. This is just speculation and musing on my part, so I don't know whether that would be a feasible simplification on current hardware.

So all in all I have my problems with how the update is described in the paper. There is another source for geometry clipmaps, namely this talk by Marcin Gollent about how terrain rendering works in REDEngine 3 for the Witcher 3. He only very, very briefly talks about the clipmaps (5:55-6:55), but it still confuses me to this day, especially the way the clipmaps are illustrated. Let me try to outline what I understand from it:

Each clipmap level has a resolution of 1024² vertices, and we have five levels in total. The map is divided up into tiles of a uniform size, each tile representing 512² vertices on the finest level, meaning you need four tiles to render the innermost level fully. So far so good, but my question is now: what happens if you need to shift, say, the fourth level clipmap by one vertex in x and y? If each tile maps to a separate file on disk, it would mean we need to load in, at worst due to border conditions, 64 files. That seems like a whole lot to me. It would make more sense if you keep the resolution of a tile the same for each level, meaning you always need at worst 8 tiles to update a level. Either way, you end up with a tonne of files. Even just representing their map at the finest level would require 46²=2116 files.

I'm not sure if that's just me being flabbergasted at AAA games, or if there's something I truly don't understand about their technique. Either way, this approach, too, looks like a whole boatload of work to get going, though it does look more feasible than that of the original GPU clipmap paper. So far I haven't implemented either, as I've just been stuck on not knowing what to do. If there's something I'm legitimately not understanding about these techniques I'd rather not dive head first into implementing a strategy that just turns out to be stupid in the end.

I'd love to ask some people in the industry that actually worked on this sort of stuff to clear up my misunderstandings, but I couldn't for the life of me find a way to contact Marcin Gollent. I'm not surprised they don't put their email addresses up publicly, though.

If you have some ideas or questions of your own about this stuff, please do let me know, I'd be very interested in it. I definitely want to finish my implementation some day, and perhaps even develop some tooling around it to create my own heightmaps and things in Trial.

11 Mar 2018 4:30pm GMT

05 Mar 2018

feedPlanet Lisp

McCLIM: Sheets as ideal forms

CLIM operates on various kinds of objects. Some are connected by an inheritance, other by a composition and some are similar in a different sense.

As programmers, we often deal with the inheritance and the composition, especially since OOP is a dominating paradigm of programming (no matter if the central concept is the object or the function). Not so often we deal with the third type of connection, that is the Form and the phenomena which are merely a shadow mimicking it[1].

Let us talk about sheets. The sheet is a region[2] with an infinite resolution and potentially infinite extent on which we can draw. Sheets may be arranged into hierarchies creating a windowing system with a child-parent relation. Sheet is the Form with no visual appearance. What we observe is an approximation of the sheet which may be hard to recognize when compared to other approximations.

[1]: Theory of forms.

[2]: And many other things.

Physical devices

Sheet hierarchies may be manipulated without a physical medium but to make them visible we need to draw them. CLIM defines ports and grafts to allow rooting sheets to a display server. medium is defined to draw sheet approximation on the screen[3].

How should look a square with side length 10 when drawn on a sheet? Since it doesn't have a unit we can't tell. Before we decide on a unit we choose we need to take into consideration at least the following scenarios:

  1. If we assume device-specific unit results may drastically differ (density may vary from say 1200dpi on a printer down to 160dpi on a desktop[4]!. The very same square could have 1cm on a paper sheet, 7cm and 10cm on different displays and 200cm on a terminal. From the perspective of a person who uses the application, this may be confusing because physical objects don't change size without a reason.

  2. Another possible approach is to assume the physical world distances measured in millimeters (or inches if you must). Thanks to that object will have a similar size on different mediums. This is better but still not good. We have to acknowledge that most computer displays are pixel based. Specifying distances in millimeters will inevitably lead to worse drawing quality[5] (compared to the situation when we use pixels as the base unit). Moreover, conversion from millimeter values to the pixel will most inevitably lead to work on floats.

  3. Some applications may have specific demands. For instance, application is meant to run on a bus stop showing arrival times. Space of the display is very limited and we can't afford approximation from the high-density specification (in pixels or millimeters) to 80x40 screen (5 lines of 8 characters with 5x8 dots each). We need to be very precise and the ability to request a particular unit size for a sheet is essential. Of course, we want to test such application on a computer screen.

I will try to answer this question in a moment. First, we have to talk about the CLIM specification and limitations imposed by McCLIM's implementation of grafts.

[3]: See some general recommendations.

[4]: Technically it should be PPI, not DPI (pixels per inch).

[5]: Given programmer specifies sheet size in integers (like 100x100).

Ports, grafts, and McCLIM limitations

If a port is a physical connection to a display server then graft is its screen representation. The following picture illustrates how the same physical screen may be perceived depending on its settings and the way we look at it.

Graft drawing

As we can see graft has an orientation (:default starts at the top-left corner like a paper sheet and :graphics should start at the bottom left corner like on the chart). Moreover, it has units. Currently, McCLIM will recognize :device, :inches, :millimeters and :screen-sized [11].

That said McCLIM doesn't implement graft in a useful way. Everything is measured in pixels (which :device units are assumed to be) and only the :default orientation is implemented. By now we should already know that pixels are a not a good choice for the default unit. Also, programmer doesn't have means to request unit for a sheet (this is the API problem).

[11]: Screen size unit means that the coordinate [1/2, 1/2] is exactly in the middle of the screens.

Physical size and pixel size compromise

We will skip the third situation, for now, to decide what unit should we default to. There are cognitive arguments for units based on a real-world distance but there are also and practical reasons against using millimeters - poor mapping to pixels and the fact that CLIM software which is already written is defined with the assumption that we operate on something comparable to a pixel.

Having all that in mind the default unit should be device-independent pixel. One of McCLIM long-term goals is to adhere to Material Design guidelines - that's why we will use dip[6] unit. 100dp has absolute value 15.875mm. On 160dpi display it is 100px, on 96dpi display it is 60px, on 240dpi display it is 150px etc. This unit gives us some nice characteristics: we are rather compatible with the existing applications and we preserving absolute distances across different screens.

[6]: Density-independent pixel.

How to draw a rectangle on the medium

Application medium may be a pixel-based screen, paper sheet or even a text terminal. When the programmer writes the application he operates on dip units which have absolute value 0.15875mm. It is the McCLIM responsibility to map these units onto the device. To be precise each graft needs to hold an extra transformation which is applied before sending content to the display server.

Now we will go through a few example mappings of two rectangle borders[7] drawn on a sheet. The violet rectangle coordinates are [5,5], [22,35] and the cyan rectangle coordinates are [25,10], [30,15].

Graft drawing

Windows Presentation Foundation declared 96PPI screen's pixel being the device-independent pixel because such displays were pretty common on desktops. Almost all coordinates map perfectly on this screen. Notice the approximation of the right side of the violet rectangle.

Lower DPI



HQ Printer

Character terminal

It is time to deal with graphics orientation (Y-axis grows towards the top). An imaginary plotter with 80DPI resolution will be used to illustrate two solutions (the first one is wrong!). Knowing the plotter screen height is important to know where we start the drawing.

Plotter (bad transformation)

80DPI and MDPI plotters

[7]: the Ideal border is composed of lines which are 1-dimensional objects which doesn't occupy any space. Red border in drawings marks "ideal" object boundaries. Points are labeled in device units (with possible fractions).

Sheets written with a special device in mind

There is still an unanswered question - how can we program applications with a specific device limitations in mind? As we have discussed earlier default sheet unit should be dip and default sheet orientation is the same as a paper sheet's[8].

Writing an application for a terminal is different than writing an application for a web browser. The number of characters which fit on the screen is limited, drawing shapes is not practical etc. To ensure that the application is rendered correctly we need a special kind of sheet which will operate on units being characters. Take a look at the following example.

Sheets with different units.

The first sheet base unit is a character of a certain physical size. We print some information on it in various colors. Nothing prevents us from drawing a circle[9] but that would miss the point. We use a character as a unit because we don't want rounding and approximation. Background and foreground colors are inherited.

The second sheet base unit is dip. It has two circles, solid grey background and a few strings (one is written in italic).

Ideally, we want to be able to render both sheets on the same physical screen. The graft and the medium will take care of the sheet approximation. The effect should look something like this.

Different kinds of screens.

The first screen is a terminal. Circles from the second sheet are approximated with characters and look ugly but the overall effect resembles the original application. Proportions are preserved (to some degree). We see also the terminal-specialized sheet which looks exactly as we have defined it.

The second screen is mDPI display. The second sheet looks very much like something we have defined. What is more interesting though is our first sheet - it looks exactly the same as on the terminal.

[8]: Providing means to change defaults requires additional thought and is a material for a separate chapter. McCLIM doesn't allow it yet.

[9]: As we have noted sheets are ideal planar spaces where line thickness is 0 and there is nothing preventing us from using fractions as coordinates.

Port and graft protocols

Now we know what we want. Time to think about how to achieve it. Let me remind you what kind of objects we are dealing with:

In the next post I will show how to implement the port and the graft (and a bit of the medium) for the charming backend. I will cover only bare minimum for mediums important to verify that graft works as expected. Complete medium implementation is the material for a separate post.

[10]: Sometimes we deal with devices which we can't take input from - for instance a printer, a PDF render or a display without other peripherals.

05 Mar 2018 12:00am GMT

04 Mar 2018

feedPlanet Lisp

Quicklisp news: Download stats for February 2018

Here are the raw download stats for projects in February.

20308  alexandria
16702 closer-mop
15724 babel
15563 cl-ppcre
15531 split-sequence
15002 trivial-features
14598 iterate
13926 trivial-gray-streams
13822 bordeaux-threads
13786 anaphora
12948 let-plus
12777 cffi
12329 trivial-garbage
11742 flexi-streams
10768 nibbles
10530 puri
10329 usocket
9602 trivial-backtrace
9316 more-conditions
9051 chunga
8828 cl-fad
8794 cl+ssl
8641 cl-base64
8295 esrap
8228 chipz
8011 utilities.print-items
7860 named-readtables
7510 drakma
6918 local-time
6803 cl-yacc
6634 ironclad
5743 asdf-flv
5588 parse-number
5555 fiveam
5208 cl-json
5189 closure-common
5171 cxml
5151 log4cl
4556 optima
4476 parser.common-rules
4372 architecture.hooks
4204 cl-unicode
4101 plexippus-xpath
3920 cl-interpol
3779 lparallel
3551 lift
3520 cl-clon
3459 cl-dot
3299 trivial-types
3257 slime
3155 cl-syntax
3143 cl-annot
3136 cxml-stp
3098 cl-store
3000 fare-utils
2997 fare-quasiquote
2973 xml.location
2971 cl-utilities
2853 utilities.print-tree
2853 trivial-utf-8
2850 static-vectors
2814 fare-mop
2812 inferior-shell
2741 quri
2718 metabang-bind
2686 trivial-indent
2664 md5
2638 documentation-utils
2549 fast-io
2545 uuid
2542 array-utils
2378 plump
2275 djula
2275 symbol-munger
2271 cl-slice
2232 collectors
2225 gettext
2190 arnesi
2189 hunchentoot
2162 access
2151 cl-parser-combinators
2146 cl-locale
2028 simple-date-time
2019 ieee-floats
1889 prove
1831 yason
1603 asdf-system-connections
1588 metatilities-base
1574 cl-containers
1525 rfc2388
1471 postmodern
1460 fast-http
1441 trivial-mimes
1358 salza2
1338 monkeylib-binary-data
1334 cl-colors
1305 jonathan
1292 proc-parse
1277 xsubseq
1253 cl-ansi-text

04 Mar 2018 10:06pm GMT

03 Mar 2018

feedPlanet Lisp

Nicolas Hafner: Pipeline Allocation - Gamedev

This is a follow-up to my previous entry, where I talked about the shader pipeline system. However, I specifically excluded the way resources for shaders are allocated in that article. We'll get to that now, since I finally got around to rewriting that part of the engine to be more complete.

As you may recall, each rendering step is segregated into a shader stage. This stage may either render all on its own, render a scene directly, or influence the render behaviour of the scene. However, each stage will need buffers to draw to that can act as the inputs to future stages. This means that we have an allocation problem on our hands: ideally we'd like to minimise the number of buffers needed to render the entire pipeline, without having any of the stages accidentally draw into a buffer it shouldn't.

Let's look at the following pipeline, which consists of five stages:


The first stage renders the scene as normal, and produces a colour and depth buffer as a result. The second stage renders the scene but every object is drawn pitch black. It only produces a colour buffer, as depth is irrelevant for this. We then blend the two colour buffers together while radially blurring the black one. This results in a "god ray" effect. The next stage only renders transparent objects. This is usually done in order to avoid having to perform expensive transparency calculations on opaque objects. Finally the two stages are rendered together, using the depth and alpha information to blend. This produces the final image.

In a naive implementation, this would require seven buffers, five colour buffers and two depth buffers. However, you can do with just three colour buffers instead, by re-using the colour buffers from the first two stages for the fourth and fifth stages. This kind of allocation problem, where a graph consists of nodes with discrete input and output ports, has cropped up in multiple places for me. It is a form of a graph colouring problem, though I've had to invent my own algorithm as I couldn't find anything suitable. It works as follows:

  1. The nodes in the DAG are topologically sorted. We need to do this anyway to figure out the execution order, so this step is free.
  2. Count the number of ports in the whole graph and create a list of colours, where each colour is marked as being available.
  3. Iterate through the nodes in reverse order, meaning we start with the last node in the execution order.
  4. For each already coloured port in the node, if the port is noted as performing in-place updates, mark the colour as available. This is never the case for OpenGL textures, but it is useful for other allocation problems.
  5. For each input port in the node:
    1. If the port on the other side of the connection isn't coloured yet, pick the next available colour from the colour list.
    2. Set this colour as the colour of the other port.
    3. Mark the colour as unavailable.
  6. For each output port in the node:
    1. If the port isn't coloured yet, pick the next available colour from the colour list.
    2. Set this colour as the colour of the port.
    3. Mark the colour as unavailable.
  7. For each port in the node:
    1. If the node is coloured, mark the colour as available again.

This algorithm needs to be repeated separately for each incompatible buffer type. In our case, we need to run it once for the depth buffers, and once for the colour buffers. Since the algorithm might be hard to understand in text, here's a small animation showing it in action:


As far as I can tell, this algorithm performs the optimal allocation strategy in every case. I don't think it's amazingly fast, but since our graphs will be relatively small, and won't change very often, this is a negligible cost.

Unfortunately this allocation is not everything yet. In order to be able to figure out which buffers are compatible, we need to perform another analysis step beforehand. Namely, we need to figure out which texture specifications are "joinable". Some shader passes might have special requirements about their buffers, such as specific resolutions, internal formats, or interpolation behaviour. Thus, in order to know whether two buffers can actually potentially be shared, we first need to figure out whether the requirements can be joined together. This turns out to be a tricky problem.

Before we can get into this, I think it's worth explaining a bit more about how this stuff works in OpenGL. Two components are essential to this, namely textures and frame buffers. Any OpenGL context has a default frame buffer, which is typically the window you draw to. This frame buffer has a colour buffer, and a depth and stencil buffer if you activate those features. However, since the output of this frame buffer is internal to OpenGL, you can't capture it. If you need to re-process the contents, you'll have to use custom frame buffers. Each frame buffer has a number of attachments, to which you need to bind fitting textures. When the frame buffer is bound, a rendering call will then draw into the bound textures, rather than to the window.

Previously Trial simply automatically determined what kind of texture to use based on the window dimensions and the attachment the output was supposed to be for. Thus allocation was trivial as I just needed to run it once for each kind of attachment. However, this is not optimal. Certain effects, such as HDR require other features in textures, such as a specific internal format. Once you give the user the ability to completely customise all texture options things become difficult. Textures have the following attributes:

That's quite a few properties! Fortunately most properties are either trivially joinable, or simply incompatible. For instance, the following are incompatible:

target, pixel format, pixel type, magnification filter, minification filter, wrapping, storage

And the following are trivially joinable by just choosing the maximum:

samples, anisotropy

The biggest problems are the dimensions, and the internal format. I'm as of yet undecided what to do with the dimension constraints, since in some cases the constraint the user writes down might just be a lower bound advice and could be upgraded to a bigger size, but in others an exact size might be expected.

The internal format however is the real kick in the dick. OpenGL has a plethora of different internal formats that may or may not be joinable. Here's a list of some:

red, r8i, r16-snorm, rgb32ui, rgba16f

Not... too bad, right? Well, unfortunately that's not all. How about these formats?

rgba2, rgb5-a1, r3-g3-b2, rgb9-e5, rgb10-a2ui, srgb8-alpha8

Or how about these?

compressed-srgb-alpha-bptc-unorm, compressed-signed-rg-rgtc2

That's not even all of them, and even worse, the complete list does not cover all possible combinations! For instance, there is no rgb2 even though there's an rgba2. There's also no r6-g6-b5, and no rgb16-e9, etc. Some of these formats are also simply insane to consider. I mean, how is a r11f-g11f-b10f format supposed to work? That's supposedly 11 bits for red and green each, and 10 bits for blue, but as floats. What??

Putting the insanity of the formats aside, the task of joining them is, fortunately, somewhat clear: a format can be joined with another of equal or greater number of colour components. A format can be joined with another of equal or greater bits for each shared component. A format can be joined with another if the required colour type matches, and the required features such as compression and srgb match. To put this into examples:

You could argue that a non-compressed format could be joined with a compressed one, but for now I've taken the stance that, if the user really requests such a thing, the use is probably pretty special-purpose, so I'd rather not muck with it.

In order to do this in a somewhat sane manner that isn't explicitly listing all possible upgrades, I've written a machinery that can decompose the internal formats into a sensible plist that documents the precise requirements of the format. Based on this, a joining can then be performed much more easily. Finally, the resulting joined format is then recomposed into a matching internal format again. This last step is currently not optimal, as sometimes a join might result in an inexistent format, in which case further upgrading of the join might be necessary. However, for my current needs it is good enough, and I honestly doubt I'll ever reach the point where I need to touch this again.

Alright, so now that we know how to join internal formats, we can join texture specifications. Thus, as a step before the allocations, the system gathers all texture specifications from the ports, normalises them, and then repeatedly joins every spec with every other spec, replacing the two inputs with the result if it succeeds, or keeping it if it doesn't. At some point the set of specs won't change anymore, at which point we have the minimal texture spec set.

We then run the algorithm for each spec in this set, only considering ports whose spec can be joined with it. This results in a perfect allocation that respects the texture constraints each pass might have. A last, additional step could be performed here that I do not do (yet): since some ports might not get shared due to the allocation constraints, their texture specs could be relaxed again, so we would need to recompute the join for all ports in the same colouring group.

I've rewritten this system as part of the asset rewrite, since I thought that if I break things I might as well break them hard. Trial is not quite yet back on its feet right now, but I can at least compile a minimal pipeline and run a window, so it's very close to being functional again.

As a next step from here, I think I'll work some more on UI stuff. I know I promised an article on geometry clipmaps, and that is coming, but it'll be another while. Things have been surprisingly busy recently, and I've also been surprisingly lazy as of late. Hopefully both of those will improve soon!

03 Mar 2018 11:52am GMT

02 Mar 2018

feedPlanet Lisp

Lispjobs: Software Engineer (Clojure), Chatterbox Labs, UK

From this post: http://blog.chatterbox.co/recruiting-software-engineer/

Machine Learning; Artificial Intelligence; Natural Language Processing; Functional Programming; Java; Clojure; Lisp

Chatterbox Labs are looking for a Software Engineer to join our, UK-based team to help both implement our cognitive technologies. You should hold an MSc in the field of Computer Science (or a related/similar field) and have experience in a commercial software environment.

You will join our technical team who focus on the research and development of our multi-lingual Cognitive Engine for short form and long form text classification & our Synthetic Data Generator, alongside our Data Scientists.

This diverse role will allow you to both collaborate on machine learning research and to also implement commercial software.

Required Qualifications
Desirable Qualifications
Minimum Education Level

MSc in Computer Science or related field

Language Requirements
To apply

Please send an email to careers@chatterbox.co with a cover note introducing yourself and a bit about you, and attach a comprehensive CV in pdf format.

02 Mar 2018 11:34am GMT

28 Feb 2018

feedPlanet Lisp

Quicklisp news: Quicklisp dist update for February 2018 now available

New projects:

Updated projects: 3bgl-shader, ahungry-fleece, architecture.service-provider, asd-generator, bodge-blobs-support, bodge-chipmunk, cells, cepl, cepl.glop, cepl.sdl2, cepl.spaces, cl-algebraic-data-type, cl-ana, cl-ansi-term, cl-charms, cl-conllu, cl-csv, cl-emoji, cl-gobject-introspection, cl-mongo-id, cl-oclapi, cl-opengl, cl-pcg, cl-permutation, cl-readline, cl-skkserv, cl-soil, cl-str, cl-svg, claw, clml, closer-mop, clweb, clx, common-lisp-stat, configuration.options, croatoan, deflate, documentation-utils, doubly-linked-list, dufy, fare-scripts, fiasco, fiveam, flac-metadata, flac-parser, for, gamebox-dgen, gamebox-ecs, gamebox-frame-manager, gamebox-math, gamebox-sprite-packer, genie, glsl-spec, inquisitor, ironclad, iterate, jonathan, jsonrpc, kenzo, lisp-chat, lispbuilder, local-time, maiden, mcclim, md5, mgl, mito, nineveh, oclcl, osicat, overlord, paiprolog, parse-number, parsley, plump, pngload, postmodern, pp-toml, psychiq, puri, qtools-ui, quux-hunchentoot, remote-js, rtg-math, rutils, sanitized-params, serapeum, sha3, shadow, shorty, simple-logger, skitter, specialization-store, split-sequence, stumpwm, type-r, ubiquitous, varjo, verbose, with-setf.

Removed projects: algebraic-data-library, cl-curlex, cl-forest, cl-gpu, cl-indeterminism, cl-larval, cl-read-macro-tokens, cl-secure-read, cl-stm, defmacro-enhance, fs-utils, funds, lol-re, stump-touchy-mode-line, yaclanapht.

There is some extra churn this month. It's mostly because of the failure of cl-curlex, a library that uses SBCL internals and which no longer works as of the latest SBCL. Many projects (all by the same author) depend on cl-curlex, and they all broke. There hasn't been any response to my bug report yet.

The other removals are due to author request, neglect, or other breakage. If you'd like to restore a removed project to Quicklisp, let me know - I'm happy to help.

To get this update, use (ql:update-dist "quicklisp").


28 Feb 2018 6:31pm GMT