11 Mar 2018

Planet Lisp

Nicolas Hafner: Geometry Clipmaps - Gamedev

A while back I attempted to implement a technique called Geometry Clipmaps in Trial. This technique is used to stream in terrain data for scenery where the terrain is far too large to be stored in memory all at once, let alone be rendered at full resolution. In this entry I'll go over the idea of the technique and the problems I had implementing it.

First, the technique is described in two papers: Geometry Clipmaps: Terrain Rendering Using Nested Regular Grids by Hoppe and Losasso [1], and Terrain Rendering Using GPU-Based Geometry Clipmaps by Hoppe and Asirvatham [2]. Neither of the papers offers any public source code, or much code at all to describe the details of the technique, especially in relation to the update mechanics. This is a problem, because a lot of the details of the papers are very strange indeed.

Before I get into that though, let me summarise the idea of the technique, which is fairly simple: since we cannot draw the full terrain, we need to reduce the detail somehow. Luckily for us, how much detail we can actually see in a rendered scene decreases with distance thanks to perspective skewing. Thus, the idea is that the farther away the terrain geometry is from the camera, the coarser we render it.

To do this we create a square ring, whose inner sides are half the size of its outer sides. This gives us a power of two scaling, where we can simply render the same ring within itself over and over, scaling it down by two every time. Once we have sufficiently many rings, we simply fill the hole with a remaining square grid. Here's an image to illustrate the idea:


As you can see, the detail gets exponentially more coarse the farther away from the centre you go. If we look at this from the perspective of a camera that is properly centred, we can see the effect even better:


Despite the geometry becoming more coarse, it takes up a similar number of actual pixels on screen. Now, the geometry I used here is rather simple. In the papers, they use a much more complex layout in order to save on vertices even more aggressively:


This, especially the interior trim, makes things a lot more painful to deal with. I'm quite sure the trim is a necessary evil in their scheme, as the ring is no longer a power of two reduction, but they still need the vertices to align properly. I can see no other reason for the trim to exist. Worse than that though, I absolutely cannot understand why they specify four variants of the trim, one for each corner, when the largest trim is going to offset the rest of the geometry so much that even if every other ring uses the opposite trim, you cannot correct the shift enough to make things exactly centred again. If someone understands this, please let me know.
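
Going back to the simple centred scheme, the nesting is really just a power-of-two scale per ring. Here's a minimal sketch of that relationship in plain Common Lisp (illustration only, not Trial code):

    ;; Each level reuses the same ring, whose outer side is twice its inner
    ;; side, scaled up by two for every level further out.
    (defun ring-extents (levels &optional (inner 1))
      (loop for level from 0 below levels
            for scale = (expt 2 level)
            collect (list :level level
                          :inner-side (* inner scale)
                          :outer-side (* inner scale 2))))

    ;; (ring-extents 3) shows that each ring's inner side equals the previous
    ;; ring's outer side, so the rings nest without gaps or overlap.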

Anyway, now that we have these rings established, we need to look at how to actually render geometry with them. The idea here is to use height maps and dynamically shift the vertices' height in the vertex shader according to the brightness of the corresponding pixel in the height map. Since each ring has the same vertex resolution, we can use a fixed resolution texture for each level as well, saving on texture memory.

Let's look at an example for that. In order to generate the textures, we need a base heightmap at a resolution of (n*2^(r-1))^2, where n is the resolution of a clipmap, and r is the number of rings. In this example I used a clipmap resolution of 64 and 4 levels, meaning I needed a base heightmap resolution of 512^2. For the innermost level you simply crop out the center at the clipmap resolution. For the next level you crop out twice as much, and then scale to half, and so on.
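
As a quick sanity check of that arithmetic, here is a minimal sketch (not part of Trial) computing the crop size per level:

    ;; Level 0 crops n^2 texels from the centre of the base map; every further
    ;; level crops twice as much before being scaled back down to n^2.
    (defun clipmap-level-sizes (n r)
      (loop for level from 0 below r
            collect (* n (expt 2 level))))

    ;; (clipmap-level-sizes 64 4) => (64 128 256 512)
    ;; so the base heightmap has to be 512^2, matching the example.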


The level textures above are scaled up again using nearest-neighbour to make them more visible to the human eye. Displacing the rings by each corresponding map results in the following image. I've increased the clipmap resolution to 64 to match that of the textures.


Looks pretty good, eh? You might notice some issues arising on the edges between clipmap levels though. Namely, some vertices don't match up quite exactly. This is due to the smoothing from downscaling: since each clipmap level has the same texture resolution, the vertices at a level's border carry more detail than the coarser level they meet. In order to fix this we need to blend vertices as they get closer to the border. To do this we look at the distance of the vertex to the center and, depending on its magnitude, linearly blend between the current level and the next outer one. This means that for each clipmap level we draw, we need to have access to both the current level's heightmap and the next level's.
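
In rough pseudo-Lisp the per-vertex blend might look like the following sketch. The real thing happens in the vertex shader, and the level-local coordinate range and blend-start fraction here are assumptions, not Trial's actual values:

    ;; Assume level-local coordinates in [-1, 1], with the ring's outer edge at 1.
    (defun blended-height (x z height-fine height-coarse &optional (blend-start 0.8))
      (let* ((distance (max (abs x) (abs z)))  ; distance from the centre
             ;; 0 until the blend region starts, 1 at the outer edge.
             (alpha (min 1.0 (max 0.0 (/ (- distance blend-start)
                                         (- 1.0 blend-start))))))
        ;; Fade linearly towards the coarser level's height near the border.
        (+ (* (- 1.0 alpha) height-fine)
           (* alpha height-coarse))))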

In the original paper they do this with an insane trick: they encode each heightmap value as a float and stuff the current level's height into the float's integer part, and the next level's into the fractional part. I don't know how this could ever be a good idea given floating point imprecision. I could understand stuffing both numbers into a single int, but this is crazy.

The way I do this in my current approach is different. I represent all the clipmap levels as a single 2D array texture, where each level in the texture represents one clipmap level. We can then easily access two levels, at the cost of two texture lookups of course. I could also see the encoding strategy working by limiting the depth to 16 bit and using 32 bit integer textures to stuff both levels into one. For now though, I'm not that concerned about performance, so optimisations like that seem folly.
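
For reference, the 16-bit packing I mean is trivial to express; a sketch of the idea (speculation on my part, not something Trial does):

    ;; Mask heights to 16 bits and stuff the current and the next coarser
    ;; level's value into a single 32-bit integer.
    (defun pack-heights (current next)
      (logior (ash (ldb (byte 16 0) current) 16)
              (ldb (byte 16 0) next)))

    (defun unpack-heights (packed)
      (values (ldb (byte 16 16) packed)   ; current level's height
              (ldb (byte 16 0) packed)))  ; next level's height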

Now, keep in mind that in order to figure out the proper smoothing, we need to get the exact position of the vertex of the inner clipmap in the texture of the outer clipmap. This is not that big of a deal if your clipmaps are perfectly aligned and centred, as you can simply scale by 0.5. However, if you implement the clipmap geometry like they suggest in the paper, with the weird trim and all, now suddenly your geometry is slightly shifted. This caused me no end of grief and I was never able to get things to match up perfectly. I don't know why, but I suspect it's something about not doing the calculation quite exactly right, or something about the interpolation causing slight errors that nevertheless remain visible.
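
For perfectly centred levels the mapping really is just a halving around the texture centre; a minimal sketch, assuming texture coordinates in [0,1]:

    ;; The outer level covers twice the extent of the inner one and shares its
    ;; centre, so the same position lies half as far from 0.5 in its texture.
    (defun outer-texcoord (u v)
      (values (+ 0.5 (* 0.5 (- u 0.5)))
              (+ 0.5 (* 0.5 (- v 0.5)))))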

I might rewrite the clipmaps using perfectly centred rings as I outlined above, just to see whether I can get it working perfectly, as I really cannot see that huge of an advantage in their approach. To reiterate, the reason why they do it the way they do is, according to my understanding, to find a good balance between the number of vertices to keep in memory and the number of draw calls to issue to draw a single geometry layer. If I used perfectly centred rings I would have to draw either 12 smallish square blocks, or one big ring. This is opposed to their technique, which requires 14 draw calls and three different sets of geometry. Wait, what? If that makes you confused, I'm with you. I don't understand it either.

Either way, once you have the interpolation between levels going you're almost there. You can now draw a static terrain with very low computational effort using geometry clipmaps. However, usually you want to have things move around, which means you need to update the clipmaps. And this is where I stopped, as I could not figure out what exactly to do, and could not find any satisfactory information about the topic anywhere else either.

First of all, in the papers they only use a map for the coarsest level layer, and predict the finer levels based on interpolation with a residual. This has two consequences: one, you need to perform a costly ahead-of-time calculation to compute the residual for the entire terrain. Two, I do not believe that this interpolatory scheme will give you a sufficient amount of control over details in the map. It will allow you to save texture space, by only requiring the coarsest level plus a single residual layer, but again I do not believe that such extreme optimisations are necessary.

Second, they use a format called PCT (a precursor of JPEG, I believe) which allows region of interest decoding from a file, meaning you can only read out the parts you really need in order to update the clipmap textures. This is nice, but I'm not aware of any CL library that supports this format, or supports JPEG and allows region of interest decoding.

In order to explain the third problem, let's first look at what happens when we need to update the terrain. Say we have a clipmap resolution of 1cm, meaning each grid cell is 1cm long at the finest level. The player now moves by +3cm in x. This means we need to shift the innermost layer by -3cm in the x direction and load the new data for that into the texture. We also need to update the next layer, as its resolution is 2cm, and we have thus stepped into the next grid cell for it as well. We don't need to shift any of the other layers, as they are 4cm and 8cm resolution respectively, and are too coarse for an update yet. This is fine because they are farther and farther away anyhow and we don't notice the lack of detail change. The edge interpolation helps us out here too, as it bridges over the gaps between the layers even when they don't match up exactly.
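
The per-level decision boils down to a cell-boundary test. A small sketch of it, assuming positions and the base cell size share a unit (1cm in the example):

    ;; A level needs new data whenever the camera crosses into a new grid cell
    ;; at that level's resolution, which doubles with every level.
    (defun levels-to-update (old-x new-x base-cell-size level-count)
      (loop for level from 0 below level-count
            for cell = (* base-cell-size (expt 2 level))
            when (/= (floor old-x cell) (floor new-x cell))
              collect level))

    ;; (levels-to-update 0 3 1 4) => (0 1)
    ;; matching the example: the 1cm and 2cm layers step, the 4cm and 8cm ones don't.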

Now, the way they do this updating in the papers is by drawing into the clipmap texture. They keep a lookup position for the texture which keeps up with the camera position in the game world. When the camera moves they fill in the new detail ahead of the camera's movement direction by drawing two quads. The magic of this idea is that texture lookup happens with the repeat border constraint, meaning that if you go beyond the standard texture boundary, you wrap around to the other end and continue on. Thus you only ever need to update a tiny region of the texture map for the data that actually changed, without having to shift the rest of the data around in the texture. Since this is probably quite hard to imagine, here's another picture from their paper that illustrates the idea:


The region "behind" the camera is updated, but because how the map is addressed happens through wraparound, it appears in front of it. Still, this is quite complicated, especially when you keep in mind that each level is, according to the paper's geometry, shifted by the trim and all. I'm also wondering whether, with my array texture technique, it would be faster to simply replace an entire clipmap texture level and save yourself the cost of having to first upload the two squares to textures, and then perform a draw call into the proper clipmap texture to update it. Since you know that each level has a fixed size, you could even allocate all the storage ahead of time and use a file format that could be directly memory mapped in or something like that, and then just use glSubImage2D to upload it all at once. This is just speculation and musing on my part, so I don't know whether that would be a feasible simplification on current hardware.

So all in all I have my problems with how the update is described in the paper. There is another source for geometry clipmaps, namely this talk by Marcin Gollent about how terrain rendering works in REDEngine 3 for the Witcher 3. He only very, very briefly talks about the clipmaps (5:55-6:55), but it still confuses me to this day, especially the way the clipmaps are illustrated. Let me try to outline what I understand from it:

Each clipmap level has a resolution of 1024² vertices, and we have five levels in total. The map is divided up into tiles of a uniform size, each tile representing 512² vertices on the finest level, meaning you need four tiles to render the innermost level fully. So far so good, but my question is now: what happens if you need to shift, say, the fourth level clipmap by one vertex in x and y? If each tile maps to a separate file on disk, it would mean we need to load in, at worst due to border conditions, 64 files. That seems like a whole lot to me. It would make more sense if you keep the resolution of a tile the same for each level, meaning you always need at worst 8 tiles to update a level. Either way, you end up with a tonne of files. Even just representing their map at the finest level would require 46²=2116 files.

I'm not sure if that's just me being flabbergasted at AAA games, or if there's something I truly don't understand about their technique. Either way, this approach, too, looks like a whole boatload of work to get going, though it does look more feasible than that of the original GPU clipmap paper. So far I haven't implemented either, as I've just been stuck on not knowing what to do. If there's something I'm legitimately not understanding about these techniques I'd rather not dive head first into implementing a strategy that just turns out to be stupid in the end.

I'd love to ask some people in the industry that actually worked on this sort of stuff to clear up my misunderstandings, but I couldn't for the life of me find a way to contact Marcin Gollent. I'm not surprised they don't put their email addresses up publicly, though.

If you have some ideas or questions of your own about this stuff, please do let me know, I'd be very interested in it. I definitely want to finish my implementation some day, and perhaps even develop some tooling around it to create my own heightmaps and things in Trial.

11 Mar 2018 4:30pm GMT

05 Mar 2018

Planet Lisp

McCLIM: Sheets as ideal forms

CLIM operates on various kinds of objects. Some are connected by inheritance, others by composition, and some are similar in a different sense.

As programmers, we often deal with inheritance and composition, especially since OOP is a dominant programming paradigm (no matter whether the central concept is the object or the function). Less often do we deal with the third type of connection, that is, the Form and the phenomena which are merely shadows mimicking it[1].

Let us talk about sheets. The sheet is a region[2] with an infinite resolution and potentially infinite extent on which we can draw. Sheets may be arranged into hierarchies, creating a windowing system with a child-parent relation. The sheet is the Form with no visual appearance. What we observe is an approximation of the sheet, which may be hard to recognize when compared to other approximations.

[1]: Theory of forms.

[2]: And many other things.

Physical devices

Sheet hierarchies may be manipulated without a physical medium, but to make them visible we need to draw them. CLIM defines ports and grafts to allow rooting sheets to a display server. A medium is defined to draw the sheet approximation on the screen[3].

How should a square with a side length of 10 look when drawn on a sheet? Since it doesn't have a unit, we can't tell. Before we decide on a unit, we need to take into consideration at least the following scenarios:

  1. If we assume a device-specific unit, results may drastically differ (density may vary from, say, 1200dpi on a printer down to 160dpi on a desktop[4]!). The very same square could measure 1cm on a paper sheet, 7cm and 10cm on different displays, and 200cm on a terminal. From the perspective of a person who uses the application this may be confusing, because physical objects don't change size without a reason.

  2. Another possible approach is to assume physical-world distances measured in millimeters (or inches if you must). Thanks to that, objects will have a similar size on different mediums. This is better, but still not good. We have to acknowledge that most computer displays are pixel based. Specifying distances in millimeters will inevitably lead to worse drawing quality[5] (compared to the situation when we use pixels as the base unit). Moreover, conversion from millimeter values to pixels will almost inevitably lead to working with floats.

  3. Some applications may have specific demands. For instance, an application meant to run on a bus stop display showing arrival times. Space on the display is very limited and we can't afford the approximation from a high-density specification (in pixels or millimeters) to an 80x40 screen (5 lines of 8 characters with 5x8 dots each). We need to be very precise, and the ability to request a particular unit size for a sheet is essential. Of course, we want to test such an application on a computer screen.

I will try to answer this question in a moment. First, we have to talk about the CLIM specification and limitations imposed by McCLIM's implementation of grafts.

[3]: See some general recommendations.

[4]: Technically it should be PPI, not DPI (pixels per inch).

[5]: Given programmer specifies sheet size in integers (like 100x100).

Ports, grafts, and McCLIM limitations

If a port is a physical connection to a display server, then a graft is its screen representation. The following picture illustrates how the same physical screen may be perceived depending on its settings and the way we look at it.

Graft drawing

As we can see, a graft has an orientation (:default starts at the top-left corner like a paper sheet and :graphics should start at the bottom-left corner like on a chart). Moreover, it has units. Currently, McCLIM will recognize :device, :inches, :millimeters and :screen-sized [11].

That said, McCLIM doesn't implement grafts in a useful way. Everything is measured in pixels (which :device units are assumed to be) and only the :default orientation is implemented. By now we should already know that pixels are not a good choice for the default unit. Also, the programmer doesn't have any means to request a unit for a sheet (this is an API problem).

[11]: The screen-sized unit means that the coordinate [1/2, 1/2] is exactly in the middle of the screen.

Physical size and pixel size compromise

We will skip the third situation for now, to decide what unit we should default to. There are cognitive arguments for units based on a real-world distance, but there are also practical reasons against using millimeters - poor mapping to pixels, and the fact that already-written CLIM software is defined with the assumption that we operate on something comparable to a pixel.

Having all that in mind, the default unit should be a device-independent pixel. One of McCLIM's long-term goals is to adhere to the Material Design guidelines - that's why we will use the dip[6] unit. 100dp has an absolute value of 15.875mm. On a 160dpi display it is 100px, on a 96dpi display it is 60px, on a 240dpi display it is 150px, etc. This unit gives us some nice characteristics: we are rather compatible with existing applications and we preserve absolute distances across different screens.

[6]: Density-independent pixel.
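
The arithmetic behind those numbers as a minimal sketch, assuming the Material Design definition of 1dip = 1/160 inch (this is not McCLIM API):

    (defun dip->pixels (dip dpi)
      ;; 160dpi is the baseline density at which 1dip is exactly 1px.
      (* dip (/ dpi 160)))

    ;; (dip->pixels 100 160) => 100
    ;; (dip->pixels 100  96) => 60
    ;; (dip->pixels 100 240) => 150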

How to draw a rectangle on the medium

The application medium may be a pixel-based screen, a paper sheet or even a text terminal. When the programmer writes the application, he operates on dip units, which have an absolute value of 0.15875mm. It is McCLIM's responsibility to map these units onto the device. To be precise, each graft needs to hold an extra transformation which is applied before sending content to the display server.

Now we will go through a few example mappings of two rectangle borders[7] drawn on a sheet. The violet rectangle coordinates are [5,5], [22,35] and the cyan rectangle coordinates are [25,10], [30,15].

Graft drawing

Windows Presentation Foundation declared the 96PPI screen's pixel to be the device-independent pixel, because such displays were pretty common on desktops. Almost all coordinates map perfectly on this screen. Notice the approximation of the right side of the violet rectangle.
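
A sketch of that mapping, assuming the 160-based dip from before, so every coordinate scales by 96/160 = 3/5:

    (defun dip-point->96ppi (x y)
      (values (* x 3/5) (* y 3/5)))

    ;; (dip-point->96ppi  5  5) => 3, 3
    ;; (dip-point->96ppi 25 10) => 15, 6
    ;; (dip-point->96ppi 30 15) => 18, 9
    ;; (dip-point->96ppi 22 35) => 66/5, 21   ; x = 13.2px - hence the approximation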

Lower DPI

HQ Printer

Character terminal

It is time to deal with the graphics orientation (Y axis growing towards the top). An imaginary plotter with 80DPI resolution will be used to illustrate two solutions (the first one is wrong!). Knowing the plotter's screen height is important for knowing where to start the drawing.

Plotter (bad transformation)

80DPI and MDPI plotters
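
A sketch of the orientation handling with hypothetical names, assuming device Y grows downwards while :graphics Y grows upwards:

    (defun graft-point->device (x y scale device-height orientation)
      (let ((dx (* x scale))
            (dy (* y scale)))
        (ecase orientation
          (:default  (values dx dy))
          ;; For :graphics the Y axis is mirrored over the device height,
          ;; which is why the height must be known before drawing starts.
          (:graphics (values dx (- device-height dy))))))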

[7]: The ideal border is composed of lines, which are 1-dimensional objects that don't occupy any space. The red border in the drawings marks the "ideal" object boundaries. Points are labeled in device units (with possible fractions).

Sheets written with a special device in mind

There is still an unanswered question - how can we program applications with specific device limitations in mind? As we have discussed earlier, the default sheet unit should be dip and the default sheet orientation is the same as a paper sheet's[8].

Writing an application for a terminal is different from writing an application for a web browser. The number of characters which fit on the screen is limited, drawing shapes is not practical, etc. To ensure that the application is rendered correctly we need a special kind of sheet which operates on characters as its units. Take a look at the following example.

Sheets with different units.

The first sheet's base unit is a character of a certain physical size. We print some information on it in various colors. Nothing prevents us from drawing a circle[9], but that would miss the point. We use a character as the unit because we don't want rounding and approximation. Background and foreground colors are inherited.

The second sheet's base unit is dip. It has two circles, a solid grey background and a few strings (one is written in italics).

Ideally, we want to be able to render both sheets on the same physical screen. The graft and the medium will take care of the sheet approximation. The effect should look something like this.

Different kinds of screens.

The first screen is a terminal. Circles from the second sheet are approximated with characters and look ugly, but the overall effect resembles the original application. Proportions are preserved (to some degree). We also see the terminal-specialized sheet, which looks exactly as we have defined it.

The second screen is an MDPI display. The second sheet looks very much like what we have defined. What is more interesting, though, is our first sheet - it looks exactly the same as on the terminal.

[8]: Providing means to change defaults requires additional thought and is a material for a separate chapter. McCLIM doesn't allow it yet.

[9]: As we have noted sheets are ideal planar spaces where line thickness is 0 and there is nothing preventing us from using fractions as coordinates.

Port and graft protocols

Now we know what we want. Time to think about how to achieve it. Let me remind you what kind of objects we are dealing with:

In the next post I will show how to implement the port and the graft (and a bit of the medium) for the charming backend. I will cover only the bare minimum for mediums, important to verify that the graft works as expected. A complete medium implementation is material for a separate post.

[10]: Sometimes we deal with devices which we can't take input from - for instance a printer, a PDF render or a display without other peripherals.

05 Mar 2018 12:00am GMT

04 Mar 2018

Planet Lisp

Quicklisp news: Download stats for February 2018

Here are the raw download stats for projects in February.

20308  alexandria
16702 closer-mop
15724 babel
15563 cl-ppcre
15531 split-sequence
15002 trivial-features
14598 iterate
13926 trivial-gray-streams
13822 bordeaux-threads
13786 anaphora
12948 let-plus
12777 cffi
12329 trivial-garbage
11742 flexi-streams
10768 nibbles
10530 puri
10329 usocket
9602 trivial-backtrace
9316 more-conditions
9051 chunga
8828 cl-fad
8794 cl+ssl
8641 cl-base64
8295 esrap
8228 chipz
8011 utilities.print-items
7860 named-readtables
7510 drakma
6918 local-time
6803 cl-yacc
6634 ironclad
5743 asdf-flv
5588 parse-number
5555 fiveam
5208 cl-json
5189 closure-common
5171 cxml
5151 log4cl
4556 optima
4476 parser.common-rules
4372 architecture.hooks
4204 cl-unicode
4101 plexippus-xpath
3920 cl-interpol
3779 lparallel
3551 lift
3520 cl-clon
3459 cl-dot
3299 trivial-types
3257 slime
3155 cl-syntax
3143 cl-annot
3136 cxml-stp
3098 cl-store
3000 fare-utils
2997 fare-quasiquote
2973 xml.location
2971 cl-utilities
2853 utilities.print-tree
2853 trivial-utf-8
2850 static-vectors
2814 fare-mop
2812 inferior-shell
2741 quri
2718 metabang-bind
2686 trivial-indent
2664 md5
2638 documentation-utils
2549 fast-io
2545 uuid
2542 array-utils
2378 plump
2275 djula
2275 symbol-munger
2271 cl-slice
2232 collectors
2225 gettext
2190 arnesi
2189 hunchentoot
2162 access
2151 cl-parser-combinators
2146 cl-locale
2028 simple-date-time
2019 ieee-floats
1889 prove
1831 yason
1603 asdf-system-connections
1588 metatilities-base
1574 cl-containers
1525 rfc2388
1471 postmodern
1460 fast-http
1441 trivial-mimes
1358 salza2
1338 monkeylib-binary-data
1334 cl-colors
1305 jonathan
1292 proc-parse
1277 xsubseq
1253 cl-ansi-text

04 Mar 2018 10:06pm GMT