Sunday, January 9, 2011

Take me to Screen-Space land

- As promised last week, a techno-post here. Next week it will be less technical again. -

While working on a new real-time GI technique, I ran into SSAO & SSGI again. Sounds like a sexual disease and a new crime series, but actually they are screen effects to enhance (fake) ambient lighting / occlusion:
- SSAO = Screen-Space-Ambient-Occlusion
- SSGI = Screen-Space-Global-Illumination

Programmers here probably know SSAO already. Crytek showed this technique in their first Crysis game, a couple of years ago. Hell, time flies. In short, SSAO is a (post)effect that approximates the occlusion per pixel by using other pixel data from the rendered screen. The idea is as follows: the more nearby/surrounding objects, the less (ambient) light a pixel catches. This is somewhat true. Look around your room and you may notice that the corners or the space below cabinets are darker. Because less light reaches those spots, of course.

Well, it all depends on where the light is coming from. But as graphics programmers probably know, it's next to impossible to compute where indirect light comes from in real-time. SSAO is another cheap hack to approximate this effect without actually knowing shit about light. Although cheap... even today it takes quite a lot of horsepower to compute decent looking SSAO, while the effect isn't that big. Which is why I had to have another look at my existing SSAO shader. And as some people pointed out on YouTube, the SSAO effect caused an ugly black halo around objects.

The nice thing about Screen-Space techniques is that the complexity of the scene doesn't matter at all. 1 box, 600 planets, the pixel count stays the same.

How does it work? Pretty simple, although the shader requires a few error-sensitive tricks. For each pixel on your screen:

- Calculate where it is (reconstruct position via depth, or store positions in another deferred buffer)
- Take n samples in a circle around the source pixel. Check if these neighbor pixels are occluding the source pixel by comparing depth/positions, and possibly normals.
- Take the average of all samples
- You can use a (Gaussian) blur pass at the end to smooth the results
- Finally, multiply the grayscale SSAO texture with the (ambient) scene.

Since the number of samples is limited, 16 or something, you only have a relatively low number of references. Don't make the sample circle too large (which is why SSAO only works locally, in corners and such), and use a "dither" or noise texture to vary the sample coordinates for each pixel. Some pixels sample nearby, others use a somewhat bigger range. This leads to varying pixel-results, but a blur can smooth that away.
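To make the "vary the sample coordinates per pixel" idea a bit more concrete, here is a rough CPU-side sketch in Python (not shader code, and not my actual implementation). The names sample_offsets, noise_angle and noise_scale are made up for illustration; in a shader those two noise values would simply be read from a small tiled noise texture.

```python
import math

def sample_offsets(num_samples, radius, noise_angle, noise_scale):
    """Return (dx, dy) texcoord offsets on a circle around the source pixel.

    noise_angle (radians) and noise_scale (0..1) are hypothetical per-pixel
    values, standing in for a lookup in a small tiled noise texture.
    """
    # Some pixels sample nearby (scale 0 -> half radius),
    # others use a somewhat bigger range (scale 1 -> full radius).
    r = radius * (0.5 + 0.5 * noise_scale)
    offsets = []
    for i in range(num_samples):
        # Spread the samples evenly over the circle, rotated per pixel.
        angle = noise_angle + 2.0 * math.pi * i / num_samples
        offsets.append((math.cos(angle) * r, math.sin(angle) * r))
    return offsets

# Two neighbouring pixels with different noise values get different patterns.
pixel_a = sample_offsets(16, 0.01, noise_angle=0.3, noise_scale=0.0)
pixel_b = sample_offsets(16, 0.01, noise_angle=1.7, noise_scale=1.0)
```

Because every pixel ends up with a slightly different pattern, the raw result is noisy, which is exactly what the blur pass is for.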

Besides taking the proper sample coordinates, the difficult part is to decide which neighbor pixels occlude. I've seen several implementations, but in my case they always led to weird results. Half-grayish walls when rotating the camera, or an either darkened or highlighted edge everywhere. With the wrong comparisons, SSAO quickly looks like an ordinary edge-detector effect, which it shouldn't. So instead of lazily copying shaders from others, I gave it a try myself this time:

// srcPix (position + depth), srcNrm, aoSum and ao are set up before the loop
for (int i = 0; i < 16; i++) // a few less samples is possible
{
    // Create neighbour sample texcoords
    // w and h depend on a variable sampleRadius, screenSize and distance from camera
    half2 tx = half2( iTex.x + sampleDir[i].x * w, iTex.y + sampleDir[i].y * h );
    // Get neighbour data
    half4 nbPix = tex2D( posTex, tx ).xyzw; // get WORLD position & depth(w)
    half3 nbNrm = tex2D( nrmTex, tx ).xyz; // get WORLD normal

    // Occlude if:
    // - Neighbour pixel is not too far away
    // - Direction between the 2 pixels can affect the sourcePixel normal
    half3 dir = nbPix.xyz - srcPix.xyz; // from source pixel towards neighbour
    half dist = length( dir );
    dir = normalize( dir );
    half shineFac = saturate( dot( dir, srcNrm ) ); // Prevents self occlusion, compare with source normal

    // Discard pixels that are too far away
    half occ = shineFac * saturate( (nbPix.w - srcPix.w + maxPixDist) * 10000 );
    aoSum += occ;
} // for i
ao = 1 - (aoSum / 16);
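
For those who want to poke at the per-sample test outside a shader, below is a small Python re-creation of it. Treat it as a sketch under assumptions: it assumes the direction runs from the source pixel to the neighbour, uses plain lists for the world positions and normals, and the function names are invented for this example.

```python
import math

def saturate(x):
    """Clamp to 0..1, like the shader intrinsic."""
    return max(0.0, min(1.0, x))

def sample_occlusion(src_pos, src_depth, src_nrm, nb_pos, nb_depth, max_pix_dist):
    """Occlusion (0..1) that one neighbour pixel adds to the source pixel."""
    # Assumed: direction from the source pixel towards the neighbour (world space).
    d = [n - s for n, s in zip(nb_pos, src_pos)]
    dist = math.sqrt(sum(c * c for c in d))
    if dist == 0.0:
        return 0.0  # same position, nothing to occlude
    d = [c / dist for c in d]
    # Neighbours behind or in the plane of the source surface do not occlude.
    shine_fac = saturate(sum(a * b for a, b in zip(d, src_nrm)))
    # Hard cut-off: neighbours much closer to the camera are ignored.
    depth_ok = saturate((nb_depth - src_depth + max_pix_dist) * 10000.0)
    return shine_fac * depth_ok

floor_pos, floor_nrm = [0.0, 0.0, 0.0], [0.0, 1.0, 0.0]
above = sample_occlusion(floor_pos, 5.0, floor_nrm, [0.0, 1.0, 0.0], 5.0, 0.5)
beside = sample_occlusion(floor_pos, 5.0, floor_nrm, [1.0, 0.0, 0.0], 5.0, 0.5)
```

Here a point straight above a floor pixel occludes fully, while a point lying in the same plane contributes nothing, which is what keeps flat walls from turning gray.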

It uses the deferred render buffers as input instead of depth reconstructions. Simple, just like my brains are. Probably not the fastest way around, but it works pretty well. Also on background buildings. It prevents self-occlusion and keeps foreground objects from mixing with background stuff. Surfaces can still self-occlude through their normalMaps though, although the effect is barely visible in the end result. But if you get it for free, why not. Bullet hole decals that affect the normalMap will create darkening for example, pretty neat.

Not implemented here yet, but I always have fights with the skybox, as that area doesn’t have a position, normal, or depth by default. By clearing the position or depth buffer to an extremely high depth (glClearColor( much ) ), it will be skipped here now though. You can abort the shader right away when the sourcePixel depth is also high, as you don’t have to process the skybox. In outdoor areas that can save up to 50% of the calculations!
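
A quick Python sketch of that early-out idea, purely as an illustration. SKY_DEPTH, is_sky and ao_for_pixel are hypothetical names, and the "very far" value is just a number I picked; the per-sample occlusion test is passed in as a function so the sketch stays small.

```python
SKY_DEPTH = 1.0e6  # hypothetical "very far" value the buffer is cleared to

def is_sky(depth, threshold=SKY_DEPTH * 0.5):
    """Pixels that still hold the clear value were never drawn: skybox."""
    return depth >= threshold

def ao_for_pixel(src_depth, neighbour_depths, occlusion_fn):
    """Return the AO factor (1 = unoccluded) for one pixel.

    occlusion_fn(nb_depth) -> 0..1 stands in for the per-sample test.
    """
    if is_sky(src_depth) or not neighbour_depths:
        return 1.0  # abort right away, the skybox needs no AO
    total = 0.0
    for nb_depth in neighbour_depths:
        if is_sky(nb_depth):
            continue  # sky neighbours never occlude
        total += occlusion_fn(nb_depth)
    return 1.0 - total / len(neighbour_depths)

calls = []
def always_occludes(nb_depth):
    calls.append(nb_depth)
    return 1.0

sky_result = ao_for_pixel(SKY_DEPTH, [4.0, 5.0], always_occludes)
wall_result = ao_for_pixel(5.0, [SKY_DEPTH, 4.0], always_occludes)
```

For the sky pixel the occlusion test is never called at all, which is where the savings in outdoor scenes would come from.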

Anyway, what I really wanted to share was that other technique: Screen-Space-Global-Illumination. Used to spread light to create "color bleeding". No, that won't be my top-secret next take on real-time G.I. but it *might* be useful to complete it. Just like SSAO. Due to their limitations, the few "realtime G.I." solutions available so far, including the Crytek LPV one, still compute the indirect light distribution on a rough, inaccurate scale. To deal with the small details, SSAO and SSGI can be used. Crytek for example uses SSGI to approximate G.I. for background scenery that falls outside the LPV workspace (3D volume textures around the camera).

So... what is SSGI then? If you can compute occlusion by looking at neighbor pixels, then why not use it to reflect (direct) light? Hey, another nice use of the Inferred Rendering pipeline approach, where we produce a diffuse & specular light screen texture. Just copy the SSAO shader, and in addition read the diffuse value from the neighbor pixels. I also read the reflectance & emissive value from a second texture. Those colors roughly represent the outgoing light from a neighbor pixel. Now we only have to test if it reaches the source pixel... yep, same stuff as the formula I did above, but with a small addition:

half3 giCol = tex2D( diffuseTex , tx ).rgb * 0.5f + tex2D( additiveTex, tx ).rgb;
giCol *= shineFac * saturate( dot(nbNrm, -dir) );
giSum += giCol * saturate( (nbPix.w - srcPix.w + maxPixDist )*10000 );

You can simply add these lines to the loop so you calculate AO and SSGI at the same time. Oh, and the brighter the G.I., the less Ambient Occlusion should occur of course. You can simply lerp between the two, based on the luminance of the G.I. result.
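
One possible reading of that lerp, sketched in Python: fade the AO factor towards "no occlusion" as the gathered G.I. gets brighter. The function names and the luminance weights (the common Rec. 601 ones) are my assumptions for this example; any reasonable brightness measure would do.

```python
def luminance(rgb):
    """Perceived brightness of an RGB colour (Rec. 601 weights, an assumption)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def lerp(a, b, t):
    return a + (b - a) * t

def combine_ao_gi(ao, gi_rgb):
    """ao: 1 = unoccluded, 0 = fully occluded.

    The brighter the gathered G.I., the more the darkening is faded out.
    """
    t = max(0.0, min(1.0, luminance(gi_rgb)))
    return lerp(ao, 1.0, t)

dark_corner = combine_ao_gi(0.4, (0.0, 0.0, 0.0))     # no G.I.: AO untouched
next_to_monitor = combine_ao_gi(0.4, (0.5, 0.5, 0.5)) # some G.I.: less darkening
```

The result would then be multiplied with the ambient term as before, with the G.I. color added on top.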

Direct light falls on the ground here, then the surrounding walls / objects pick it up again. Without any G.I., the backsides of the boxes would be pitch black. Also, the emissive monitor creates a blur.

Life can be so simple. But does it really work? Hmmm... well... Three problems. First of all, it makes the already expensive SSAO shader even nastier. Second, the SSGI effect is, just like SSAO, only very local. Again, you barely see it unless applied on really bright colored objects such as a computer monitor or bright green plastic wall.

The third problem is the product of problems one and two. To make the effect more noticeable (worth the additional cost), SSAO and SSGI should use different sample radii. The bigger the circle, the wider the light spreads (or actually, is gathered) of course. But that doesn't work too well with SSAO, unless you like blurry crap. So, the only proper solution I could think of was to put the SSGI in a separate loop that uses a wider sampling range around the source pixel. And thus requires even more horsepower, for just a small effect. Is it worth it? Mehh, if you target somewhat older hardware, NO.

In a scenario like this SSGI helps (though a cubeMap could do as well). But when the hell do you see things like this?

Just when I pushed the speed to 70 FPS (30 on my older card), SSGI is wreaking havoc again. Currently SSAO & SSGI are done in the same pass, on a buffer half the size of the screen. What I could do is move SSGI to a separate pass on a 1/4 sized buffer. Less quality, but then again, I think SSGI allows more blurring & smearing than SSAO does. Didn't try it yet though.

And that boys & girls, was probably the most technical piece of text I ever wrote.


  1. the 3d engine become really neat :)

  2. Thank you :)
    The real problem is keeping it up-to-date probably. One day after catching up with all modern techniques, it will start aging again.

  3. Thank you! I wasn't sure it was going to be worth it at first. The extra work I had to put into it was insane but I am loving the finished product. I'm happy everyone else seems to love it too.
    bullet stickers

  4. A bullet in your ass. Go spam your mother.

  5. Have you tried a bilateral filter instead of Gaussian blur? We got much better results with the bilateral filter because it preserves objects edges.

  6. Not sure how a Bilateral filter would be implemented exactly, but the current Gaussian blur already checks whether a neighbor pixel is suitable for "mixing" by comparing its normal and distance. If it is too different from the origin pixel, then it won't count. Don't know if the screenshots here already used that though... Those are old!
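
    For the curious, a rough Python sketch of such an edge-aware blur on a single row of AO values. This is an illustration of the idea only, not the actual shader: the function names and the two thresholds (min_nrm_dot, max_depth_diff) are made up, and the normals are assumed to be unit length.

```python
def blur_weight(base_weight, src_nrm, src_depth, nb_nrm, nb_depth,
                min_nrm_dot=0.8, max_depth_diff=0.5):
    """Kernel weight for a neighbour, or 0 if it is too different to mix."""
    nrm_dot = sum(a * b for a, b in zip(src_nrm, nb_nrm))
    if nrm_dot < min_nrm_dot or abs(nb_depth - src_depth) > max_depth_diff:
        return 0.0  # other surface or other distance: it won't count
    return base_weight

def edge_aware_blur(values, normals, depths, kernel):
    """1D blur over a row of AO values; kernel has an odd length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(values)):
        total, weight_sum = 0.0, 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if j < 0 or j >= len(values):
                continue  # outside the row
            w = blur_weight(w, normals[i], depths[i], normals[j], depths[j])
            total += values[j] * w
            weight_sum += w
        # Renormalise by the weights that actually counted.
        out.append(total / weight_sum if weight_sum > 0.0 else values[i])
    return out

kernel = [0.25, 0.5, 0.25]
facing = [(0.0, 0.0, 1.0)] * 6
# A depth jump between the 3rd and 4th pixel: the edge survives the blur.
edge_row = edge_aware_blur([0.0, 0.0, 0.0, 1.0, 1.0, 1.0], facing,
                           [1.0, 1.0, 1.0, 10.0, 10.0, 10.0], kernel)
```

    On a flat surface this behaves like a normal blur; across a depth or normal jump the neighbours are rejected, so object edges stay sharp.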