Modern graphics are not just about rendering geometry accurately - they are about creating images that feel cinematic, atmospheric, and emotionally resonant. The difference between a technically correct render and a visually stunning one often comes down to [[post-processing]] effects: blurs, glows, color grading, depth of field, and dozens of other techniques applied after the 3D scene has been rendered. These effects transform raw renders into polished images that mimic the imperfections and characteristics of physical cameras and film.
Post-processing operates on the 2D image after 3D rendering is complete, treating it as a grid of pixels rather than a scene of geometry. This makes effects computationally efficient (processing a fixed number of pixels regardless of scene complexity) but also limits them to information present in the image (without additional buffers, we cannot see behind objects or know their true 3D positions). Understanding both the power and limitations of screen-space effects is key to using them effectively.
Convolution: the foundation of image effects
Many image effects are built on [[convolution]] - a mathematical operation that computes each output pixel as a weighted sum of nearby input pixels. The weights are specified by a [[kernel]] (or filter matrix), typically a small grid like 3×3, 5×5, or larger. Different kernels produce different effects: blur kernels average neighbors, edge detection kernels emphasize differences, sharpen kernels enhance local contrast.
Consider a simple box blur with a 3×3 kernel where all weights are 1/9. For each output pixel, we sample the input pixel and its 8 neighbors, multiply each by 1/9, and sum the results. This averages colors across a small area, softening edges and reducing detail. Larger kernels blur more but require more samples - a 9×9 kernel needs 81 texture samples per output pixel, which becomes expensive.
```glsl
// Simple 3x3 box blur fragment shader
uniform sampler2D uImage;
uniform vec2 uTexelSize; // 1.0 / textureResolution

in vec2 vTexCoord;
out vec4 fragColor;

void main() {
    vec3 color = vec3(0.0);
    // Sample the 3x3 neighborhood
    for (int x = -1; x <= 1; x++) {
        for (int y = -1; y <= 1; y++) {
            vec2 offset = vec2(float(x), float(y)) * uTexelSize;
            color += texture(uImage, vTexCoord + offset).rgb;
        }
    }
    // Average all 9 samples
    fragColor = vec4(color / 9.0, 1.0);
}
```

The Gaussian blur is more sophisticated, using weights that follow a bell curve - pixels closer to the center contribute more than distant pixels. This produces a more natural blur without the slight boxiness of a uniform average. Even better, the Gaussian blur is separable: a 2D Gaussian can be decomposed into two 1D passes (horizontal then vertical), reducing an N×N kernel from N² samples to 2N samples - a huge optimization.
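The separability claim is easy to verify numerically. Here is a minimal pure-Python sketch (function names and the 5-tap kernel size are illustrative): the 2D Gaussian kernel is the outer product of a 1D kernel with itself, so blurring rows and then columns with the 1D kernel reproduces the full 2D blur.

```python
import math

def gaussian_1d(radius, sigma):
    # 1D Gaussian weights, normalized so they sum to 1
    w = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    s = sum(w)
    return [x / s for x in w]

def blur_rows(img, k):
    # convolve each row with the 1D kernel, clamping at the borders
    r = len(k) // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for i, wt in enumerate(k):
                xs = min(max(x + i - r, 0), w - 1)
                out[y][x] += wt * img[y][xs]
    return out

def transpose(img):
    return [list(row) for row in zip(*img)]

def separable_blur(img, k):
    # horizontal pass, then vertical pass (implemented via transpose)
    return transpose(blur_rows(transpose(blur_rows(img, k)), k))

kernel = gaussian_1d(2, 1.0)           # 5-tap kernel: 2*5 = 10 samples per pixel
img = [[0.0] * 7 for _ in range(7)]
img[3][3] = 1.0                        # single bright pixel (an impulse)
blurred = separable_blur(img, kernel)  # center value equals kernel_center^2
```

Because the kernel is normalized, the blurred impulse's total energy stays 1.0, and the center pixel ends up at exactly the square of the 1D center weight - the outer-product structure made visible.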
Bloom: making bright things glow
[[Bloom]] simulates how very bright light sources appear to glow in cameras and human vision. When light exceeds a sensor's capacity, it bleeds into neighboring pixels. Our eyes experience a similar effect from scattered light within the lens and vitreous humor. Bloom adds that 'cinematic' quality where lights feel genuinely bright rather than just being white pixels.
The classic bloom algorithm has four stages: extract bright regions from the image by thresholding pixels above a brightness cutoff; blur this brightness map substantially, since large blurs create wide glows; combine the blurred brightness back with the original image; and apply tone mapping to bring the [[HDR]] result back to displayable range. In practice, implementations typically break this down further:
- Extract bright pixels: threshold the image to keep only pixels above a brightness value (like 1.0 in HDR)
- Downsample: reduce the bright image to smaller resolutions (1/2, 1/4, 1/8 of original)
- Blur: apply Gaussian blur at each resolution level
- Upsample and combine: blend the blurred levels back together
- Add to original: combine the glow with the original image
Using multiple resolution levels (a technique called mip chain bloom or progressive downsampling) creates a natural-looking glow that spreads far without requiring enormous blur kernels. Each level captures a different scale of glow - sharp near the source, diffuse further away. This is why modern bloom looks so much better than early implementations that used a single blur pass.
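The threshold → blur → combine flow above can be sketched on a 1D "scanline" of HDR values. This is pure Python with illustrative numbers (cutoff, blur weights, strength); real implementations operate on 2D mip chains as described.

```python
def threshold(pixels, cutoff=1.0):
    # bright-pass: keep only the energy above the cutoff
    return [max(p - cutoff, 0.0) for p in pixels]

def blur(pixels, passes=3):
    # repeated [1/4, 1/2, 1/4] blur approximates a Gaussian
    for _ in range(passes):
        padded = [pixels[0]] + pixels + [pixels[-1]]
        pixels = [0.25 * padded[i] + 0.5 * padded[i + 1] + 0.25 * padded[i + 2]
                  for i in range(len(pixels))]
    return pixels

def bloom(pixels, strength=0.8):
    glow = blur(threshold(pixels))
    return [p + strength * g for p, g in zip(pixels, glow)]

hdr = [0.2, 0.2, 0.2, 5.0, 0.2, 0.2, 0.2]  # one very bright pixel
out = bloom(hdr)
# neighbors of the bright pixel are lifted; the dim pixels glow only faintly
```

Only the pixel above the cutoff contributes glow - the dim 0.2 pixels pass through the threshold untouched, which is exactly why bloom needs HDR values to know what is "truly bright".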
HDR rendering and tone mapping
Real scenes have enormous dynamic range - the sun is billions of times brighter than a shadow, spanning 20 or more stops. A typical SDR monitor covers far less, on the order of a 1000:1 contrast ratio (roughly 10 stops). [[HDR]] rendering represents colors with values far beyond the 0-1 range during rendering, then uses [[tone mapping]] to compress this range into displayable values while preserving detail and artistic intent.
Without HDR and tone mapping, bloom cannot work correctly. If pixel values are clamped to 1.0, a white wall and the sun both register as (1, 1, 1) - we have lost information about which is actually brighter. Rendering in HDR (using floating-point framebuffers) preserves the fact that the sun might be at (50, 45, 40) while the wall is at (0.8, 0.8, 0.8). Bloom extraction can then correctly glow only the truly bright areas.
Tone mapping operators vary in complexity. Simple approaches like Reinhard tone mapping apply a curve that compresses highlights while preserving shadows. More sophisticated approaches like ACES (Academy Color Encoding System) simulate film response curves and provide a cinematic look that has become standard in games. The choice of tone mapper significantly affects the visual style of the final image.
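The basic Reinhard operator mentioned above is a one-liner; a minimal sketch (the sample pixel values are illustrative):

```python
def reinhard(x):
    # compresses unbounded HDR values into [0, 1) while
    # leaving dark values nearly unchanged
    return x / (1.0 + x)

# a dim wall vs. the sun: both become displayable, ordering is preserved
wall = reinhard(0.8)   # ~0.44, still clearly dimmer
sun = reinhard(50.0)   # ~0.98, compressed to just below white
```

Note that the sun, 60× brighter in linear light, lands only about twice as bright on screen - that aggressive highlight compression while shadows stay roughly linear is the whole point of the curve.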
Ambient occlusion: contact shadows
[[Ambient occlusion]] (AO) darkens creases, corners, and areas where surfaces come close together. It approximates how ambient light is partially blocked in concave regions - the inside of a box receives less environmental light than the outside. AO adds tremendous depth and realism, making objects feel grounded and three-dimensional even under flat lighting.
Screen-Space Ambient Occlusion (SSAO) computes AO from the depth buffer without access to full scene geometry. For each pixel, it samples nearby depth values and estimates how much of the surrounding hemisphere is blocked by geometry. More occlusion means darker shading. Various SSAO techniques differ in their sampling patterns and how they interpret depth differences.
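A toy depth-buffer sketch of that idea (pure Python; the sampling pattern, bias value, and depth data are illustrative, not from any particular engine):

```python
def ssao(depth, x, y, radius=1, bias=0.02):
    """Estimate occlusion at (x, y) from neighboring depth samples.

    A neighbor counts as occluding if it is closer to the camera
    (smaller depth) than the center pixel by more than `bias`.
    Returns 1.0 for fully lit, lower values for darker shading.
    """
    center = depth[y][x]
    occluded = 0
    total = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue
            ny, nx = y + dy, x + dx
            if 0 <= ny < len(depth) and 0 <= nx < len(depth[0]):
                total += 1
                if depth[ny][nx] < center - bias:
                    occluded += 1
    return 1.0 - occluded / total

# A depth step edge: left half near (0.3), right half far (0.9)
depth = [[0.3] * 4 + [0.9] * 4 for _ in range(8)]
open_ao = ssao(depth, 1, 4)  # flat region: no occlusion, stays fully lit
edge_ao = ssao(depth, 4, 4)  # next to nearer geometry: partially darkened
```

Real SSAO uses randomized hemisphere samples and reconstructs view-space positions from depth, but the core logic is the same comparison: how many nearby samples sit in front of this pixel.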
HBAO (Horizon-Based Ambient Occlusion) traces rays along the depth buffer to find the horizon angle - how far above the surface you can look before hitting something. This gives more accurate occlusion in complex geometry. GTAO (Ground Truth Ambient Occlusion) uses a mathematical formulation that more closely approximates true ambient occlusion, producing cleaner results.
Depth of field: cinematic focus
[[Depth of field]] (DOF) simulates how camera lenses focus on one distance while blurring objects nearer or farther. This effect is both technically grounding (mimicking real photography) and artistically powerful (directing viewer attention). Games use it for dramatic cutscenes, aiming-down-sights focus, and environmental storytelling.
Basic DOF uses the depth buffer to determine each pixel's distance from camera, computes how far that is from the focal plane, and blurs accordingly. Pixels at the focal distance remain sharp; pixels far from it blur more. The blur amount follows the 'circle of confusion' - a function of distance from focus, lens aperture, and focal length.
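One common thin-lens formulation of the circle of confusion can be sketched as follows (the camera parameters are illustrative; all distances share the same unit, here meters):

```python
def circle_of_confusion(d, focus_dist=5.0, focal_len=0.05, aperture=0.025):
    # thin-lens CoC diameter:
    #   c = A * |d - S1| / d * f / (S1 - f)
    # where A = aperture diameter, f = focal length,
    # S1 = focus distance, d = subject distance
    return aperture * abs(d - focus_dist) / d * focal_len / (focus_dist - focal_len)

in_focus = circle_of_confusion(5.0)   # exactly at the focal plane: no blur
near = circle_of_confusion(1.0)       # close foreground: large blur
far = circle_of_confusion(100.0)      # distant background: blurs too, less steeply
```

A DOF pass maps this diameter (in world units, then projected to pixels) to a blur radius per pixel - zero at the focal plane, growing for everything nearer or farther.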
The challenge is that naive DOF looks artificial. Real out-of-focus areas have a distinctive 'bokeh' - bright points become characteristic shapes (circles for circular apertures, hexagons for 6-blade apertures). Foreground and background blur differently. Edges between sharp and blurry regions should transition naturally. High-quality DOF effects simulate these physical behaviors, sometimes by actually simulating lens optics rather than applying simple blur.
Motion blur: conveying speed
Motion blur smears fast-moving objects to convey speed and smooth temporal aliasing (judder at low framerates). Like DOF, it mimics a camera behavior - longer exposure times capture more motion. Games use it to enhance the feeling of speed in racing games, combat, and camera movements.
Per-object motion blur uses velocity buffers that store how much each pixel moved since the last frame. This velocity comes from comparing current and previous frame positions of each vertex, transformed and interpolated to each pixel. The blur then stretches pixels along their motion vectors. Camera motion blur is similar but uses camera movement rather than object movement.
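A minimal gather-style sketch of blurring along per-pixel velocities, on a 1D scanline (the velocities and pixel data are made up; note that this naive gather is not energy-conserving, one reason production implementations are more involved):

```python
def motion_blur(scanline, velocity, samples=5):
    """Average samples stepped backward along each pixel's velocity (in pixels)."""
    n = len(scanline)
    out = []
    for x in range(n):
        total = 0.0
        for i in range(samples):
            t = i / (samples - 1)
            # step from the current position back along the motion vector
            sx = min(max(int(round(x - velocity[x] * t)), 0), n - 1)
            total += scanline[sx]
        out.append(total / samples)
    return out

line = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
vel = [0, 0, 0, 2, 2, 2, 0]   # the bright pixel moved 2 pixels to the right
blurred = motion_blur(line, vel)
# the highlight is smeared backward along its motion vector;
# static pixels (velocity 0) are untouched
```

The velocity buffer here is just a list; in a renderer it is a screen-sized texture written during the geometry pass from current-minus-previous clip-space positions.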
Motion blur is divisive among players - some find it essential for immersion, others find it nauseating or dislike the loss of clarity. Most games offer it as an option with adjustable intensity. From a technical standpoint, it is also one of the trickier effects to implement well, as naive approaches create artifacts around occluding edges where foreground and background move differently.
Color grading and LUTs
Color grading adjusts the colors of the final image to achieve an artistic look - the teal-and-orange palette of action movies, the desaturated grit of war games, the vibrant oversaturation of cartoon aesthetics. Rather than applying simple brightness/contrast/saturation adjustments, modern games use Look-Up Tables (LUTs) that can implement any color transformation.
A 3D LUT is a cube of color values: for any input RGB color, look up the corresponding output color in the cube. A 32×32×32 LUT can represent complex color transformations that would be impossible with simple curves - cross-channel effects (input red affecting output blue), localized adjustments (affecting only certain color ranges), and film-like response curves. Artists create LUTs in tools like DaVinci Resolve or Photoshop, and games apply them in real-time with a single texture lookup.
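The "single texture lookup" is a trilinear interpolation into the cube; here is a sketch in pure Python (the LUT is a procedurally built identity cube, so the lookup should return its input; a real grading LUT would hold artist-authored colors):

```python
def make_identity_lut(size):
    # lut[r][g][b] -> (r, g, b) normalized to [0, 1]
    return [[[(r / (size - 1), g / (size - 1), b / (size - 1))
              for b in range(size)] for g in range(size)] for r in range(size)]

def lut_lookup(lut, color):
    size = len(lut)

    def axis(c):
        # map [0,1] to a lower index, upper index, and blend fraction
        f = min(max(c, 0.0), 1.0) * (size - 1)
        i0 = min(int(f), size - 2)
        return i0, i0 + 1, f - i0

    (r0, r1, fr) = axis(color[0])
    (g0, g1, fg) = axis(color[1])
    (b0, b1, fb) = axis(color[2])
    out = [0.0, 0.0, 0.0]
    # blend the 8 surrounding LUT entries (trilinear interpolation)
    for ri, wr in ((r0, 1 - fr), (r1, fr)):
        for gi, wg in ((g0, 1 - fg), (g1, fg)):
            for bi, wb in ((b0, 1 - fb), (b1, fb)):
                entry = lut[ri][gi][bi]
                for c in range(3):
                    out[c] += wr * wg * wb * entry[c]
    return tuple(out)

lut = make_identity_lut(4)
graded = lut_lookup(lut, (0.25, 0.6, 0.9))  # identity LUT: output equals input
```

On the GPU this entire function collapses into one hardware-filtered `texture()` fetch from a 3D texture, which is why LUT grading is nearly free at runtime.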
Putting it all together
A typical game's post-processing pipeline chains many effects in a specific order. First, screen-space effects that need accurate scene data: SSAO, screen-space reflections, screen-space shadows. Then, HDR effects: bloom extraction, blur, and combination. Then, tone mapping to compress HDR to displayable range. Then, depth of field and motion blur that work best after tone mapping. Finally, color grading, vignette, film grain, and other artistic adjustments.
The order matters because effects interact. SSAO must happen before tone mapping (it needs linear light values). Bloom needs HDR values to know what is truly bright. Color grading is last because it is artistic adjustment to the final image. Depth of field can happen at different points with different tradeoffs. Understanding these dependencies is crucial for adding new effects or debugging unexpected interactions.
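The ordering above amounts to function composition; a minimal sketch with stand-in passes (every pass implementation here is an illustrative placeholder operating on a flat list of HDR pixel values):

```python
def apply_pipeline(image, passes):
    # run each post-processing pass in order, feeding its output forward
    for p in passes:
        image = p(image)
    return image

ssao_pass = lambda img: [p * 0.9 for p in img]              # darken in linear light
bloom_pass = lambda img: [p + 0.1 * max(img) for p in img]  # needs HDR values
tonemap_pass = lambda img: [p / (1 + p) for p in img]       # HDR -> displayable
grade_pass = lambda img: [min(p * 1.05, 1.0) for p in img]  # artistic, applied last

ldr = apply_pipeline([0.5, 4.0, 0.1],
                     [ssao_pass, bloom_pass, tonemap_pass, grade_pass])
# every output value now lies in the displayable [0, 1] range
```

Swapping the order breaks the invariants the text describes: tone mapping before bloom would clamp away the brightness information the bloom pass depends on.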
Post-processing transforms competent rendering into compelling visuals. The effects themselves are not physically accurate - they simulate camera imperfections and artistic processing rather than reality. But that is precisely the point. We do not want perfect clarity; we want images that feel cinematic, atmospheric, and emotionally resonant. Post-processing bridges the gap between technically correct and aesthetically powerful.