Ambient Occlusive Crease Shading
(Fox & Compton, 2007)
Published in Game Developer Magazine, March 2008
An algorithm for calculating a fake/artistic approximation to ambient occlusion, lending a typical scene an otherwise missing depth and warmth. This approach uses absolutely no precalculation, works on absolutely any geometry or mesh, and has a fixed overhead dependent only on resolution (it's entirely screen-based). Below are two example shots, and at the bottom of the page are example shots of the AO coefficient buffer itself:
(mouse-over to see how the scene looks with AO disabled, though ignore the fireflies in the far-off background)
Pay particular attention to the way in which the algorithm deepens the shadows about mostly-obscured creases, and especially the way in which it adds proximity shadows. For instance, note the shading around the branch in front of the door - when you remove the AO, it removes every visual cue that the door and branch are actually quite close to each other. You might also notice how the AO tends to darken the scene overall, but this isn't by any means a consistent/disagreeable modulation - pay attention to how it shades the plant leaves, for instance, or how bark and wood take on a softer appearance. It only appears to be a consistent dampening; there's a lot of surface-based noise that really makes the material properties "pop."
The basic approach is fairly simple:
- All the work is done in view-space, so that we can treat individual pixels as approximations of occluding surfaces and avoid expensive raycasting or similar passes not otherwise useful in rendering. (This algorithm is, thus, best suited to a deferred shader.)
- We assume that we only want to shade creases that face us. That is, the normals on either side of the crease should face us.
- Furthermore, we want only "inner" creases that block light, not "outer" creases / points that would tend to receive more light. To tell one from the other, we simply check the facing of the normals - inner creases have normals that face each other / intersect if taken as rays, while "outer" creases have normals that face away from each other / have no intersection.
- The amount of occlusion should be relative to the degree to which the two crease surfaces are facing each other.
- The amount of occlusion should also be relative to the distance between the occluding surfaces.
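The inner/outer test from the list above can be tried out on the CPU. The following is a minimal Python sketch, not the shader itself: the function names and the "both dot products positive" rule are assumptions made for illustration, and positions and normals are plain view-space tuples.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def crease_facing(pos_a, normal_a, pos_b, normal_b):
    """Return (a_faces_b, b_faces_a): the cosine between each surface's
    normal and the direction towards the other surface point."""
    a_to_b = normalize(tuple(q - p for p, q in zip(pos_a, pos_b)))
    b_to_a = tuple(-c for c in a_to_b)
    return dot(normal_a, a_to_b), dot(normal_b, b_to_a)

def is_inner_crease(pos_a, normal_a, pos_b, normal_b):
    """Assumed rule: a crease is 'inner' when each normal points at least
    partly towards the other point, i.e. the normals face each other."""
    a_faces_b, b_faces_a = crease_facing(pos_a, normal_a, pos_b, normal_b)
    return a_faces_b > 0.0 and b_faces_a > 0.0

# Floor meeting a wall: the normals face each other, so this is an inner crease.
floor_wall = is_inner_crease((0, 0, 0), (0, 1, 0), (1, 1, 0), (-1, 0, 0))
# Top and side of a box: the normals face away from each other (outer crease).
box_edge = is_inner_crease((0, 0, 0), (0, 1, 0), (1, -1, 0), (1, 0, 0))
```

The two facing values double as a measure of how strongly the surfaces face each other, which is the quantity the occlusion amount is meant to follow.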
Thus, per pixel you simply select a set of neighbors, extract their normals, determine to what degree the center pixel faces the neighbor (dot the normal against the vector from the center pixel to the neighbor pixel), use the dot and a scaling factor to compute the occlusion, scale the value relative to the distance between occluders, and add up all the results per pixel. Then you're done - just apply it to lighting. You'll want an artist-tweakable factor that applies the occlusion coefficient per light component, as this is very much an artistic approach - the more tweakable it is, the better. I give the artist 7 values total per scene: the 3 light+AO component scalars, a bias value (that can bias the dot product to produce a more or less shadowed appearance), the range attenuation value (falloff coefficient dependent on the distance between center and neighbor), and an averager (which is the divisor when averaging the results of all neighbor pixels). While it's tempting to assume that you want a straight average of all of the contributions to a given pixel, or that range attenuation can handle all averaging, in practice neither alone is sufficient for pleasing results.
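How the coefficient is applied to lighting is left open above. As a rough Python sketch, assuming the three light components are ambient, diffuse and specular and that each is darkened linearly by its own scalar (both assumptions for illustration, as are all the names), it could look like this:

```python
def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic of the same name."""
    return max(0.0, min(1.0, x))

def apply_crease_ao(ambient, diffuse, specular, ao, ao_scalars):
    """Darken each light component by the AO coefficient.

    ao_scalars holds one hypothetical artist-set weight per component:
    0 leaves that component untouched, 1 applies the AO at full strength.
    """
    k_ambient, k_diffuse, k_specular = ao_scalars
    return (
        ambient * (1.0 - saturate(ao * k_ambient)),
        diffuse * (1.0 - saturate(ao * k_diffuse)),
        specular * (1.0 - saturate(ao * k_specular)),
    )

# Full-strength AO on ambient, half strength on diffuse, none on specular.
shaded = apply_crease_ao(1.0, 1.0, 1.0, ao=0.5, ao_scalars=(1.0, 0.5, 0.0))
```

Keeping the weights separate lets an artist, for example, crush ambient light in creases while leaving specular highlights alone.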
To get you on your way, here's the pixel shader code I use to produce the ambient occlusion map, rendered once per frame. GBuffer1 contains per-pixel positions in its XYZ, and GBuffer2's XYZ contains the per-pixel normals:
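// Shaders involved in crease shading
// Uses: DiffMap1 (GBuffer 1)
//       DiffMap2 (GBuffer 2)
//       Color0 (AOCreaseValues - Range, Bias, Averager, Unused)
float4 psCreaseShadeStippled11x11(const vsOutput psIn) : COLOR
{
    // One-texel steps in UV space.
    float2 UVLeft;
    float2 UVDown;
    UVLeft.x = 1.0f / DisplayResolution.x;
    UVLeft.y = 0.0f;
    UVDown.x = 0.0f;
    UVDown.y = 1.0f / DisplayResolution.y;

    // Offset the lookup by half a texel on each axis.
    float2 centeredUV = psIn.uv0 + (UVLeft / 2.0f) + (UVDown / 2.0f);
    float3 centerPos = tex2D(GBuffer1Sampler, centeredUV).xyz;
    float3 centerNormal = tex2D(GBuffer2Sampler, centeredUV).xyz;

    // Helper not shown here; judging by its name and arguments, it fills
    // the neighbor offset table for the current resolution.
    UpdateSamplesStippled11x11(DisplayResolution.x, DisplayResolution.y, SampleOffsetsWeightsStippled11x11);

    float4 totalGI = 0.0f;
    for (int i = 0; i < 24; i++)
    {
        float2 sampleUV = centeredUV + SampleOffsetsWeightsStippled11x11[i].xy;
        float3 samplePos = tex2D(GBuffer1Sampler, sampleUV).xyz;

        // Despite its name, toCenter points from the center pixel to the neighbor.
        float3 toCenter = samplePos - centerPos;
        float distance = length(toCenter);
        toCenter /= distance;

        // How much the center faces the neighbor, less the minimum crease, scaled by Bias.
        float centerContrib = saturate((dot(toCenter, centerNormal) - AOMinimumCrease) * Color0.y);
        // Linear falloff that reaches zero at Range.
        float rangeAttenuation = 1.0f - saturate(distance / Color0.x);
        totalGI.r += centerContrib * rangeAttenuation;
    }

    // Divide by the artist-set Averager rather than by the sample count.
    totalGI.r /= Color0.z;
    return totalGI;
}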
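To get a feel for the numbers, the per-sample arithmetic of the shader can be transcribed to the CPU. This Python sketch is only a reference for experimenting with parameter values, not part of the renderer. The function names and the zero-distance guard are additions for illustration, and the defaults are the Range, Bias, Averager and AOMinimumCrease values quoted for the example shots.

```python
import math

def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic of the same name."""
    return max(0.0, min(1.0, x))

def sample_contribution(center_pos, center_normal, sample_pos,
                        ao_range, bias, min_crease):
    """Occlusion that one neighbor adds to the center pixel."""
    to_sample = [s - c for s, c in zip(sample_pos, center_pos)]
    dist = math.sqrt(sum(c * c for c in to_sample))
    if dist == 0.0:
        return 0.0  # guard added for this sketch; the shader has none
    direction = [c / dist for c in to_sample]
    facing = sum(d * n for d, n in zip(direction, center_normal))
    center_contrib = saturate((facing - min_crease) * bias)
    range_attenuation = 1.0 - saturate(dist / ao_range)
    return center_contrib * range_attenuation

def crease_ao(center_pos, center_normal, sample_positions,
              ao_range=1.0, bias=0.25, averager=10.0, min_crease=0.3):
    """Sum every neighbor's contribution, then divide by the averager."""
    total = sum(
        sample_contribution(center_pos, center_normal, pos,
                            ao_range, bias, min_crease)
        for pos in sample_positions
    )
    return total / averager

# A neighbor half a unit directly "above" a surface whose normal points up.
one_sample = crease_ao((0, 0, 0), (0, 1, 0), [(0, 0.5, 0)])
```

With these values a neighbor contributes at most 0.175 before attenuation, so 24 fully occluding neighbors divided by an averager of 10 top out at 0.42, well short of full occlusion.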
The implemented algorithm does have a few differences, and we hit a few snags:
- Most notably, we found that using the neighbor normals (that is, trying to determine not only how much the center faces the neighbor, but how much the neighbor faces the center - which would be necessary to determine a true crease) was overkill. It produced dramatic normal-crease shadows, but almost completely obliterated the proximity shadow without severe overamping (the proximity shadows originate almost entirely from the center term). Additionally, it creates an interference pattern, as explained below. However, the addition of this second term does an incredible job of picking out surface detail in the normal map - so if you want to emphasize that rather than proximity shadows (and you likely do in a "realistic" game), try it out and just tune your falloffs to prevent the appearance of the aliased proximity shadows.
Noise explanation: The uniform sampling of neighbors means that the "neighbor facing" term follows edge aliasing, while the center-based term generally does not. The result when you combine this aliased sample with the less-aliased center sample is a beat pattern (interference), which is an interesting effect in and of itself. If you've never seen it, go into Photoshop and take a hard-edged black-and-white sketch and blur it heavily (resize it to ~12.5% for instance), then superimpose the blurred over the original with a subtractive or multiplicative filter. You'll find the interference between the blur and original is inconsistent, and has this knack for picking out the "shadowed" areas of the original image. This effect is actually the core of the AO algorithm in use by Little Big Planet, but in our algorithm it makes the proximity shadows fairly awful and useless.
- The visual impact of this approach is directly related to the size of the neighborhood of pixels sampled. In my own implementation, I determined that an 11x11 diamond with stippled (every other pixel) sampling produced excellent results, but this isn't a hard and fast rule. We considered using a noisy sampling of neighbors, but the result was, again, detrimental to the quality of proximity shadowing. You'll still want to try it for yourself while establishing the desired "look," as the results weren't broken so much as just not what we wanted.
- For the example shots above, we were using the following parameters: Range(1.0), Bias(0.25), Averager(10.0), and AOMinimumCrease(0.3). Range/Bias/Averager are stored in Color0 and customized on a scene-by-scene basis, and AOMinimumCrease must be present and adjusted to avoid artifacts caused by limited numerical accuracy - a poorly tuned AOMinimumCrease will cause radial banding screen-wide, most visible on flat-on surfaces.
- Unless your distance falloff is well-balanced, you will get distracting results as the distance from the camera increases, because the number of pixels in the shadowed borders remains roughly the same. Given that an unbalanced falloff is often desirable for consistency of appearance, I suggest that you additionally fade the crease shading out based on the distance of the pixel from the camera. This effect is primarily the fault of the blur approach to enhancing the AO - a larger sample range and less bloom would likely reduce the need for the extra falloff.
- (Extremely Important) The results as produced by the basic algorithm are very minor and very resolution dependent. Especially at high resolutions, it will pick out only the line along which occlusion is greatest and its few closest neighbors. To fix this, we apply a very heavy set of bloom passes. One blooms an especially large distance but reduces AO intensity overall, the other blooms a more reasonable distance while amping the AO coeffs up. The two are then blurred together into our final AO buffer. The amount of blurring applied directly influences the appearance of the AO, with it appearing "dreamier" as more blur is applied. You could also make the sample region significantly larger and avoid the blur, but the cost may or may not be significantly worse (though the results would then be less smooth / more accurate in appearance).
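The table of neighbor offsets is not shown in the listing. As a sketch only, here is one way to generate a stippled diamond in Python: keep pixels within a Manhattan radius of 5 (an 11x11 window) that fall on the same checkerboard color as the center. The function names and the checkerboard rule are assumptions. This particular choice happens to yield 24 offsets, the count the shader loop reads, but it only reaches 4 pixels from the center along each axis and other patterns give the same count, so treat the match as suggestive at best.

```python
def stippled_diamond_offsets(radius=5):
    """Pixel offsets inside a diamond, keeping every other pixel
    (checkerboard) and skipping the center itself."""
    offsets = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            inside_diamond = abs(dx) + abs(dy) <= radius
            on_stipple = (dx + dy) % 2 == 0
            is_center = dx == 0 and dy == 0
            if inside_diamond and on_stipple and not is_center:
                offsets.append((dx, dy))
    return offsets

def to_uv_offsets(offsets, width, height):
    """Convert pixel offsets to UV offsets for a given resolution."""
    return [(dx / width, dy / height) for dx, dy in offsets]

pixel_offsets = stippled_diamond_offsets()
uv_offsets = to_uv_offsets(pixel_offsets, 1280, 720)
```

Swapping the checkerboard rule for a random subset is an easy way to try the noisy sampling mentioned above and judge its effect on proximity shadows for yourself.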
Here are (roughly, the camera is in a slightly different position) the shots of what the pre-blur AO buffers look like for each of the above example shots. Mouse-over to see the post-blur version that is used in rendering.
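The step from the pre-blur buffer to the post-blur one, together with the camera-distance fade suggested earlier, can be mocked up on a single scanline. This Python sketch rests on several assumptions: a box blur stands in for the bloom passes, the two passes are combined by a plain average, and every radius, gain and fade distance is a made-up placeholder rather than a value from our implementation.

```python
def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic of the same name."""
    return max(0.0, min(1.0, x))

def box_blur(values, radius):
    """1D box blur with the window clamped at the scanline ends."""
    blurred = []
    for i in range(len(values)):
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        window = values[lo:hi]
        blurred.append(sum(window) / len(window))
    return blurred

def widen_ao(raw_ao, wide_radius=8, wide_gain=0.5,
             tight_radius=2, tight_gain=2.0):
    """One wide, dimmed pass plus one tighter, amplified pass, averaged."""
    wide = [v * wide_gain for v in box_blur(raw_ao, wide_radius)]
    tight = [v * tight_gain for v in box_blur(raw_ao, tight_radius)]
    return [saturate(0.5 * (w + t)) for w, t in zip(wide, tight)]

def distance_fade(view_depth, fade_start=20.0, fade_end=60.0):
    """1 up to fade_start, falling linearly to 0 at fade_end."""
    return 1.0 - saturate((view_depth - fade_start) / (fade_end - fade_start))

def final_ao(raw_ao, view_depths):
    """Widen the raw AO, then fade it out with distance from the camera."""
    widened = widen_ao(raw_ao)
    return [ao * distance_fade(d) for ao, d in zip(widened, view_depths)]

# A single fully occluded pixel in the middle of a 21-pixel scanline.
raw = [0.0] * 10 + [1.0] + [0.0] * 10
widened = widen_ao(raw)
```

The single occluded pixel spreads into a soft band whose core comes from the tight pass and whose fringe comes from the wide one, which is the widening effect the blur is there to provide.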
... and that's about it. With a bit of tweaking, you should be able to get results equivalent to the above screenshots. The shots are all from our product prototype, with AO being key in its art style. There is a framerate hit, and I seriously doubt you could usefully include this in anything below PS3.0, but our framerates are fine / it operates beautifully in the real world. Our prototype with all the bells and whistles (AO included) enabled runs extremely well on an 8800GTX, and reasonably well on a 7600GT.
As far as I know, this is a new approach, and my primary purpose for putting it into the public sphere is to protect it against possible copyright ugliness - but if it's old news, if someone owns the copyright, by all means let me know. (Crysis developers, if you happen to read this, I'd be curious to know how similar our approach is to yours? We're both working in screen space, but your approach in general seems to be more accurate/expensive and use a depth buffer rather than a normal buffer - or at least, that's what I would guess from the brief you released)
(Work was primarily done in August of 2007, with further refinement through November 2007)