Sunday, August 26, 2012

Reflective conspiracy theories

A small step for man, but a giant leap for mankind
Before moving on, let's go zero-G for a moment for Neil Armstrong. The first man on the moon (and hopefully not the last) died at the age of 82 on 25 August 2012. If that was a real step on a real moon, Neil has reserved a well-deserved page in the history books for a long, long time. Something mankind as a whole should be proud of. Although we kill each other every day for various reasons, we should realize we all share this tiny globe. Zooming out puts things in perspective, and Neil literally took that perspective when viewing our little planet while standing on another floating rock in this endless cosmos. It's such a huge achievement that it's hard to believe we really did it...

Moon-landing hoax? Who can say. John F. Kennedy, chemtrails, 9/11 inside job, Saddam & biological weapons, Area 51, New World Order? Both the present and history are full of mysteries, and the more you think about it, the more questions arise. Things aren't always what they seem. Having a Moon landing sure came at a convenient time, with those crazy Russians trying to outperform the USA as well. And it surprises me that modern space missions -with 50 years of technology evolution since the sixties- seem so extremely vulnerable (the control room getting overexcited because Curiosity drove a few centimeters on Mars?!) that it puts the much more ambitious/dangerous Moon landing in a weird contrast.

But before just following the naysayers... Being skeptical is a natural, psychological phenomenon. And not taking everything the media says for granted is healthy. But consider other huge achievements. Didn't we laugh at the Wright brothers? Would Napoleon even dare to dream about the awesome power of a nuclear bomb? Huge pyramids built with manpower only? Even got a slight idea of how CERN works? Would you run with a thousand other soldiers onto Utah Beach while German bunkers are mowing down everything that moves? Men can do crazy stuff when being pushed! But the bottom line is that you or I will never know what really happened, because we weren't there, nor do we have thorough inside knowledge of the matter. All we do is pick sides based on arguments we like to believe. And for that reason, here is an *easy-to-consume* series of the Mythbusters testing some infamous Moon-landing conspiracy theories, including the footprints (on dry "sand"?), impossible light & shadows (multiple projector lights?), and the waving flag (in vacuum?). So before copying others, get your facts right and check out this must-see:
Mythbusters & Moonlanding

Let me spoil one thing already. Something we graphics programmers *should* know. Why is that astronaut climbing off the ladder not completely black due to the shadow? Exactly: because the moon surface partially reflects light. A perfect example of indirect lighting, ambient light, Global Illumination, or whatever you like to call it. Neil, rest in peace. And for future astronauts, don't forget to draw a giant middle finger on the Moon/Mars so we have better evidence next time. Saves a lot of discussion.

As mentioned above, light bounces off surfaces. Not just to confuse conspiracy thinkers with illuminated astronauts, but also simply to make things visible. If not directly, then indirectly eventually. Reflections are an example of that, and they play an important role in creating realistic computer graphics. Unfortunately, everything with the word "indirect" in it seems to be hard to accomplish, even on modern powerful GPUs. But it's not impossible. Duke Nukem 3D already had mirrors, so did Mario 64, and Far Cry was one of the first games to have spectacular water (for the time) that both refracted and reflected light.

Well, a GPU doesn't really reflect/refract light rays. Unless you are making graphics based on raytracing, but the standard for games is still rasterization, combined with a lot of (fake) tricks to simulate realistic light physics. Reflections are one of those hard-to-simulate things. Not that the tricks so far are super hard to implement, but they all have limitations. Let's walk through the gallery of reflection effects and conclude with a relatively new one: RLR (Realtime Local Reflections), which I recently implemented for Tower22. If you already know the ups and downs of cubeMaps and planar reflections, you can skip there right away.

Planar reflections
One of the oldest, and accurate, tricks is planar reflections. It "simply" works by rendering the scene (the part that needs to be reflected) again, but mirrored. The picture below has 2 "mirror planes". The ultra-realistic water effect, for example, renders everything above(!) the plane, flipped on the Y axis. That's pretty much it, although it's common to render this mirrored scene into a texture (render-target) first. Because with textures, we can do cool shader effects such as colorizing, distortions, Fresnel, and so on.

Planar reflections are accurate but have two major problems: performance impact & complex (curvy) surfaces. The performance hit is easy to explain; you'll have to render the scene again for each plane. This is the reason why games usually only have a single mirror or water plane. Ironically, the increasing GPU power didn't help either. Sure, you can re-render a scene much faster these days, but don't forget it also takes a lot more effects to do so. Redoing a deferred-rendering pipeline, SSAO, soft shadows, G.I., parallax mapping and all other effects for a secondary pass would be too much. If you look carefully at the water (pools) in the T22 Radar movie, you'll notice the reflected scene being a bit different… uglier. This is because lots of effects are disabled while rendering the mirrored scene for planar reflections. Just simple diffuse-mapping with a fixed set of lights.

The second problem is complex surfaces. The mirror planes on the image above are flat. That's good enough for a marble floor, and even for water with waves (due to all the distortions and dynamics, you won't quickly notice the error). But how to cover a reflective ball? A sphere has an infinite amount of normals (pieces of flat surface pointing in some direction). Ok, game spheres have a limited amount of triangles, but still, a 100-sided sphere would require 100 mirror planes = reflecting the scene 100 times to make a correct reflection. To put it simply, it's WAY too much work. That's why you won't see correct reflections on curvy surfaces.

Conspiracy people! Notice the reflected scene in the waterpool being a bit different from the actual scene?

CubeMaps are the answer to the typical problems with planar reflections… sort of. The idea is to sample the environment from all directions and store it in a texture. Compare it with snapping a panorama photo. It's called a cubeMap because we take 6 snapshots and "fold" them into a cube. Now we can both reflect and refract light simply by calculating a vector and sampling from that location in the cubeMap texture. The crappy sample below tries to show how a cubeMap is built and how it can be used. The bottom-right image represents the scene from a top view, the eye is the camera, and the red line a mirror. So if the eye looks into that mirror, it creates the green vectors for sampling from the cubeMap. In this situation the house would be visible in the mirror.

• Paraboloid maps are a variation on cubeMaps that only require 2 snapshots to fold a sphere. PMs are faster to update in realtime, but lack some quality and require the environment to be sufficiently tessellated though.

Since cubeMaps sample the environment in 360 degrees, they can be used on complex objects as well. Cars, spheres, glass statues, chrome guns, and so on. Problem solved? Well, not really. First of all, cubeMaps are only accurate for 1 point in space. In this example, the environment was sampled around the red dot. Stuff located at the red dot will correctly reflect (or refract) the environment, but the further it moves away from the sample point, the less accurate it gets. Does that mean we should sample cubeMaps for each possible location? No, that would be overkill. The advantage of curvy surfaces is that it's really hard for an average viewer to tell whether the reflection is physically correct.

But at the same time, you can't use a single cubeMap for a large reflective water plane, because you will notice the inaccuracy at some point. What games often do is let the map artists place cubeMap "probes" manually at key locations. At the center of each room, for example, or at places where you expect shiny objects. Reflective objects then pick the most useful (nearby) cubeMap. In Half-Life 2 you can see this happening. Take a good look at the scope glass on your crossbow… you'll see the reflection suddenly change while walking. This is because the crossbow switches over to another cubeMap probe to sample from.
• Tower22 updates a cubeMap near the camera each cycle and uses it for many surfaces. This means pretty correct (& dynamic!) reflections for nearby objects. Distant surfaces will sometimes show visible artifacts though.

A cubeMap requires 6 snapshots, thus rendering the scene 6 times. This is quite a lot, so cubeMaps are usually pre-rendered. Since we don't have to render the scene again from that point on, cubeMaps provide a much faster solution than planar reflections. However, not being updated in realtime, you won't see changes in the environment either. Ever wondered why soldiers didn't get reflected in some of the glass windows or waterpools in Crysis 2? That's why. All in all, cubeMaps are only useful for (smaller) local objects, and/or stuff that only vaguely reflects, such as a wet brick wall or dusty wood floor.

Other methods?
I don't know them all, but Crytek introduced an interesting side quest on their LPV (Light Propagation Volumes) technique. To accomplish indirect lighting, one of the things they do is create a set of 3D textures that contain the reflected light fluxes globally. Aside from G.I., this can also be used to get glossy (blurry) reflections by ray-marching through those 3D textures. I sort of tried this technique (a different approach, but also having a 3D texture with a global/blurry representation of the surroundings). And did it work? Well, judge for yourself.

Personally, I found it too slow for practical usage, although I must say I've only tried it on an aging computer so far. But the real problem was the maximum ray length. Since 3D textures quickly grow into very memory-consuming textures, their sizes are limited. That means they only cover a small part of the scene (surrounding the camera), and/or a very low-quality representation in case the pixels cover relatively large areas. In the picture above, each cell in the 3D texture covered 20^3 centimeters. Which is quite accurate (for glossy reflections), but since the texture is only 64x64x64 pixels, a ray cannot travel further than 64 x 20 cm = 12.8 meters. In practice it was even less due to performance issues and the camera being in the middle. Only a few meters. So the wall behind the camera would be too far away for the wall in front to reflect. This was fixed by using a second 3D texture with larger cells. You can see the room pixels suddenly get bigger in the bottom-left buffer picture. However, raymarching through 2 textures makes it even slower, and the ray length is still limited. All in all, reflections by raymarching through a 3D texture are sort of accurate, but very expensive, and useful for very blurry stuff only. I also wonder if Crysis 2 really used reflections via LPV in the end, btw… guess not.

RLR (Realtime Local Reflections)
In case you expect super advanced stuff now, nah, got to disappoint you then. If you expect a magical potion that fixes all the typical planar & cubeMap reflection problems, I have to disappoint you as well. Nevertheless, RLR is a useful technique to use additionally. It gives accurate reflections at a surprisingly good performance, and implementing this (post-)screen effect is pretty easy. And no need to re-render the scene.

How does it work? Simple. Just render the scene as you always do, in HDR if you like. Also store the normal, and depth or position of each pixel, but likely you already have such buffers for other effects, certainly if you're running a Deferred Rendering pipeline. Now it's MC-Reflector time. Render a screen-filling quad, and for each pixel, send out a ray depending on its normal and the eye vector. Yep, we're raymarching again, but in screen space this time. Push the ray forwards until it intersects elsewhere in the image. This can be checked by comparing the camera-to-pixel distance with the camera-to-ray distance. In other words, if the ray intersects or gets behind a pixel, we break the loop and sample at that point. Now we have the reflected color. Multiply it by the source pixel specularity to get a result. The code could look like this:
float3 pixNormal     = tex2D( deferredNormalTex,   screenQuadUV ).xyz;
float4 pix3DPosition = tex2D( deferredPositionTex, screenQuadUV );

int    steps    = 0;
float3 rayPos   = pix3DPosition.xyz;               // Start position (in 3D world space)
float3 rayDir   = reflect( eyeVector, pixNormal ); // Travel direction (in 3D)
bool   collided = false;
float4 screenUV;

while ( steps++ < MAX_STEPS  &&  !collided )
{
 // Move the ray
 rayPos += rayDir * STEP_SIZE;

 // Convert the 3D position to a 2D screen-space position
 screenUV     = mul( glstate.matrix.mvp, float4( rayPos, 1.f ) );
 screenUV    /= screenUV.w;
 screenUV.z  *= -1.f;
 screenUV.xy  = (screenUV.xy + 1.f) * 0.5f;

 // Sample the pixel depth (camera distance) at the ray location
 float enviDepth = tex2D( deferredPositionTex, screenUV.xy ).w;

 // Check if it hits: did the ray reach or pass behind the environment pixel?
 collided = length( rayPos - cameraPos ) > enviDepth + SMALLMARGIN;
}

// Sample at the ray target
float3 result = tex2D( sceneHDRtex, screenUV.xy ).rgb;
The nice thing about RLR is that it works on any surface. The green spot gets reflected on the low table, but also on the closet door. Also notice the books being reflected a bit, and the floor, and the wall. No matter how complex the scene is, the load stays the same.

Perfect! But wait, there are a few catches. How many steps do we have to take, and wouldn't all those texture reads hurt the performance? Well, RLR does not come for free of course, but since rays take small steps and usually travel in parallel, it allows good caching on the GPU. Second, you can reduce the number of cycles quite drastically by:
A: Do this on a smaller buffer (half the screensize for example)
B: Do not send rays at all for non-reflective pixels (such as the sky or very diffuse materials)
C: Let the ray travel bigger distances after a while
Or, instead of letting the ray travel x centimeters in 3D space, you could also calculate a 2D direction vector and travel 1 pixel each loop cycle. If your screen is 1200 x 800 pixels, the maximum distance a ray could possibly travel would be 1442 pixels (the screen diagonal). To complement, make good use of the power of love, I mean blur. A wood floor has a more glossy reflection than a glass plate. What I did was store the original output texture, plus a heavily blurred variant of it. The end result interpolates between the two textures based on the pixel "glossiness" value.
float4 pixSpecularity = tex2D( deferredTexSpecular, screenQuadUV );
float  pixGloss       = pixSpecularity.w;

float3 reflection     = tex2D( reflectionTex,  screenQuadUV ).rgb;
float3 reflectionBlur = tex2D( reflectionTex2, screenQuadUV ).rgb;
float3 endResult      = lerp( reflection, reflectionBlur, pixGloss ) * pixSpecularity.rgb;
// Use additive blending to add the end result on top of the previous rendering work
Of course, there are ways to jitter as well; use your imagination. However, the deadliest catch of them all, giving RLR a C+ score instead of an A+, is the fact that this screen-space effect can only gather reflections from stuff… that is rendered on the screen. Imagine the wallpaper wall in the screenshot being reflective. It should reflect something behind the camera then. But since we never rendered that part, we can't gather it either. In other words, pixels that face towards the camera, or towards something else outside the screen boundaries, cannot get their reflections. That makes RLR useless for mirrors, although some women may prefer an RLR-technology mirror. Also be careful with pixels around the screen edges. Your code should detect this so you can (smoothly!) blend over to a black color (= no reflection).

As said, RLR is not a substitute for cubeMaps or planar reflections. Be a ninja and know your tools. Planar reflections for large mirrors / water. RLR for surfaces that only reflect at steeper view angles, (pre-rendered?) cubeMaps for the other cases.


  1. Hi spek (I wonder whether you still remember me?),

    Crytek used screen-space reflections (i.e. realtime local reflections) for reflective surfaces in Crysis 2. Although they have shown some specular reflections through LPV in their paper.

    I would also like to point you at Unreal Engine and their Samaritan demonstration, which shows ray-traced reflections of planar surfaces in a pixel shader.

    Also, you mention that LPV is quite slow. I second this and I'd like to add:
    LPV is quite a huge waste of resources. You can use a simplified (or even the actual) scene, with either a Kd-tree or BVH (either pre-computed, or 2 levels - static and dynamic) and any decent CPU or GPU ray tracer to compute correct reflections.

    Anyways, after seeing the huge work you have done here, I've got to say "Keep up the good work!".

  2. Got some holes in the head, but I certainly remember you Vilem :)

    The UDK demo looks awesome, but I wonder how the simplified version of the scene is stored. The raytracer probably uses a depth texture and an (HDR) color texture to sample from, right? The same thing as RLR, but how/where are those images captured? "ImageReflectionSceneCapture" is used for that according to a paper, but I have no idea what it actually captures.

    I'm not a big fan of LPV either. However, Crysis2 simply looks good, and it runs at acceptable speeds. Then again, I wonder how much or if they use LPV at all... When moving a light or closing a door, it does not seem to affect the G.I. in a room. Either LPV only has a limited role and does GI on a very global level, or maybe its not used at all.

    It seems to be the main problem with GI. In order to make these techniques run in realtime, you have to compensate a lot, resulting in low-quality techniques. And at that point, you'd wonder why not just use faster & nicer pre-baked solutions... No, we're not there yet when it comes to GI.

  3. Hi, and sorry for a bit of a late answer. As for UDK reflections, I wrote an article + demo application, plus put the paper online now again (it was hanging for some time on, but I think it's hidden somewhere in the archives now) -

    Feel free to read & try it. Although I never got to work on other articles in this series, it can be extended with any kind of optimization scheme (either BVHs or BSP trees, or even grids) for quite good results. Reflection "shadowing" is quite tricky to do (you can either use a simplified voxel representation of the scene or cleverly placed "black" reflection planes). It gets even more interesting in that you can do quite awesome area lighting with this technique (of course, doing good shadows for area lights is another challenge - quite a hard one).

    In the demo application (I hope it's still online) - you gotta hold space to look with the mouse & move around with "WASD". It was put together in a hurry - but it's a proof of concept.

    As for GI - I'm currently running a VPL solution (diffuse reflections only for now; caustics are another, trickier topic) at some 100 fps on a Radeon HD 5870 for simple levels (but it's still around 40 fps for Sponza), with 2 or 3 lights (generating hundreds of VPLs) ... all with shadows. Although I'm still optimizing this solution, it could actually be used in games. The bad thing about this is generating VPL positions - RSM (Reflective Shadow Maps) with filtering are good for small light sources, but for larger and more distant light sources it's a lot worse. Artist-placed VPL positions seem to be best (because that way you can get nice and fast GI for a door opening, etc.), but that's quite a lot of work (and impossible for open-world stuff). Pre-computed weights for where to place VPLs are also a solution, but they might not be as good as artist placement, and there will be cases where pre-computing fails (either placing 100 useless VPLs somewhere, or placing zero of them where they should be)... I.e. everything has pros and cons; it's all about trade-offs.

  4. Thanks Vilem, downloaded the paper. I'll have a look when there is time. 40 fps for Sponza sounds reasonable (I assume you don't have super-hardware to reach that), although 2 or 3 lights isn't that much for an indoor situation. Is the load mainly on the CPU or GPU, btw? From what I remember from earlier talks, you do a lot of CPU raytracing, right?

    Implementing a CPU raytracer only for a couple of visual effects such as GI or reflections goes a bit far for me. Making a raytracer isn't super difficult, but making a FAST one is...

    I'd say the perfect G.I. solution is still far off, and until that time, using pre-baked solutions for realtime game purposes might still be the best decision. After all, most gamers won't notice the difference between a well-baked map and a realtime GI pipeline, except for a big performance difference probably.

  5. Regarding performance, I actually use different code paths - one for the CPU(s) (clever optimizations for CPUs, written in C/C++) and one for the GPU(s) (written in OpenCL, i.e. with special optimizations for GPUs - although those kernels are quite HUGE compared to most OpenCL stuff out on the net).

    The screen is divided between all the devices in the PC in such a way that one part goes to the CPU (all the available cores (or fewer, if their number is defined) will be working on this one) and the other N parts go to N GPUs. The parts' dimensions are equal at the start, but adjusted at runtime (so that all GPUs and all CPU cores are busy all the time).

    Of course if I do only ray tracing (or path tracing), I use all the "iron" available. Otherwise it depends... (I can use just one CPU core, or single available GPU, or just all GPUs, etc.)

    Anyway, on to an example we did. We wanted to achieve realtime reflections for large-scale open worlds (like e.g. Skyrim has). The world is stored in cells, each of which has a high- and a low-detail version; ray tracing is performed only against the low-detail version, and thus we can achieve realtime performance even for large worlds (+ this also reduces other problems, like passing textures to "compute shaders" (OpenCL) - because we use just a single texture atlas for the low-detail version of a cell (cells are in a grid -> 1st level of hierarchy, and each low-detail cell has a KdTree -> 2nd level of hierarchy -> ray tracing in realtime is possible)). Of course, reflecting animals & characters is another topic (we don't reflect them now), but with some extensions it could be done.

    As for a perfect GI solution - path tracing is possible to do (progressively), but it's not fast enough for games (ehm... hardware is just too slow now). You can drop lots of things out and make it a bit faster ... trading off physical correctness for speed. Just take a look at the Brigade 2 project - it looks awesome, but hardware is not there yet (but it WILL be) ... plus the guys are doing that stuff on kind of beast hardware (double or quad GPUs + 8-core CPUs ... how many people have these at home... not many, at least not now and not in the next few years, imo).