Monday, November 16, 2015

Basic principles of game-graphics, 2015

How does Engine22 put pixels on your screen? How does a game in general draw its graphics? For me as an unofficial graphics-programmer, it all makes pretty good sense. But when other people ask about it –including programmers–, it seems to be a pretty mysterious area. Also, for those who didn’t touch “graphics” in the last, say, 10 years, a lot might have changed.

Quite some years ago, an old friend without any deeper computer background thought I really programmed every pixel you could possibly see. Not in the sense of so-called shaders, but really plotting the colours of a monster-model on the screen, pixel-by-pixel, using code-lines only. Well, thank the Lord it doesn’t work like that exactly. But then, HOW does it work?

Have a Sprite

Graphics is a very complex subject, with multiple approaches and several layers. There is no single perfect way to draw something. Though most games use the same basic principles and help-libraries more or less. On a global level, we could divide computer-graphics into 2D and 3D to begin with. Although technically 3D techniques overlap 2D (you can draw Super Mario using a 3D engine – and many modern 2D games are actually semi-3D), old 2D games you saw on a nineties Nintendo, used sprite-based engines.

A sprite is basically a 2D image. Like the ones you can draw in Paint (or used to draw when Paint was still a good tool for pixel-artists, modern Paint is useless). In addition, sprites often have a transparent area. For example, all pink pixels would become invisible so you could see a background layer through the image. Also, sprites could be animated, by playing multiple images quickly enough after each other. Obviously the more “frames”, the smoother the animation. But, and this is typical for the old sprite-era, computer memory was like Brontosaurus brains. Very little. Thus small resolution sprites, just a few colours (typically 16 or 256), and just a few frames and/or little animation in general.
Goro, the 4-armed sprite dude from Mortal Kombat.
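
To make the "pink pixels become invisible" idea concrete, here is a minimal sketch of colour-key drawing. The exact pink value, the function name blit and the tiny list-of-lists "screen" are assumptions for illustration only; real consoles did this in hardware.

```python
# Hypothetical colour-key value: pure "magic pink" marks transparent pixels.
TRANSPARENT = (255, 0, 255)

def blit(screen, sprite, left, top):
    """Copy sprite pixels onto screen, skipping colour-keyed and off-screen pixels."""
    for y, row in enumerate(sprite):
        for x, colour in enumerate(row):
            sx, sy = left + x, top + y
            if colour == TRANSPARENT:
                continue  # let the background show through
            if 0 <= sy < len(screen) and 0 <= sx < len(screen[0]):
                screen[sy][sx] = colour

BLUE, RED = (0, 0, 255), (255, 0, 0)
screen = [[BLUE] * 4 for _ in range(3)]   # 4x3 background layer
sprite = [[RED, TRANSPARENT],
          [TRANSPARENT, RED]]             # 2x2 sprite with two see-through pixels
blit(screen, sprite, 1, 1)
```

After the call, only the two red pixels of the sprite have overwritten the blue background; the pink ones left it untouched.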

When we think about sprites, we usually think about Pac-Man, Street Fighter puppets or Donkey Kong throwing barrels. But also the environment was made of sprites. The reason why Super Mario is so… blocky, is because the world was simply a (2D) raster. Via a map-editor program, you could assign a value for each raster-cell. A cell was either unoccupied (passable), a brick-block, a question-mark block, or maybe water. And again, the background was made of a raster –but usually having larger cells. Later Mario games would allow sloped (thus partially transparent) cells by the way.

So typically an old fashioned 2D “platform-game” engine gave us a few layers (sky, background, foreground you walk/jump on) for the environment, and (animated) sprites for our characters, bullet projectiles, explosions, or whatever it was. The engine would figure out which cells are currently visible on the screen, and then draw them cell-by-cell, sprite-by-sprite. In the right order; background sprites first, foreground sprites last. And of course, hardware of the Sega, Nintendo or PC provided special ways to do this as fast as possible, without flickering. Which is terribly slow and primitive by today’s standards, but pretty awesome back then.
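
The "figure out which cells are visible, then draw them back to front" step could look roughly like the sketch below. The 16-pixel cell size, the cell values and the function names are made up for the example, and "drawing" is reduced to collecting a list of draw calls.

```python
TILE = 16  # assumed cell size in pixels

# 0 = empty (passable), 1 = brick, 2 = question-mark block
level = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 2, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

def visible_cells(grid, cam_x, cam_y, view_w, view_h, tile=TILE):
    """Yield (col, row, value) for every non-empty cell overlapping the camera view."""
    first_col = max(cam_x // tile, 0)
    first_row = max(cam_y // tile, 0)
    last_col = min((cam_x + view_w - 1) // tile, len(grid[0]) - 1)
    last_row = min((cam_y + view_h - 1) // tile, len(grid) - 1)
    for row in range(first_row, last_row + 1):
        for col in range(first_col, last_col + 1):
            if grid[row][col] != 0:
                yield col, row, grid[row][col]

def draw_order(layers, cam_x, cam_y, view_w, view_h):
    """Back-to-front list of (layer_name, col, row, value) draw calls."""
    calls = []
    for name, grid in layers:  # layers are listed background first
        for col, row, value in visible_cells(grid, cam_x, cam_y, view_w, view_h):
            calls.append((name, col, row, value))
    return calls

# A 48x48 pixel view, scrolled 32 pixels to the right: only columns 2..4 are on screen.
calls = draw_order([("foreground", level)], cam_x=32, cam_y=0, view_w=48, view_h=48)
```

A real engine would pass several layers (sky, background, foreground) in that order, so later layers paint over earlier ones.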

Next station, 3D

2D worlds made out of flat images have one little problem; you can move and even zoom the camera, but you can’t rotate. There is no depth data whatsoever.

3D engines made in the last years of our beloved nineties, took a whole different approach (and I’m skipping SNES Mode7-graphics for Mario Kart or 2.5D engines like the ones used for Wolfenstein or Duke Nukem 3D). Whereas 2D “sprites” were the main resources to build a 2D game, artists now had to learn how to model 3D objects. You know, those wireframe things. To make a box in a 2D game, you would just draw a rectangle, store the bitmap, and load it back into your game engine. But now, we had to plot 8 corner coordinates called “vertices”, and connect them by “drawing” triangles. Paint-like programs got extended with (more complicated) 3D modelling programs, like Maya, Max, Lightwave, Milkshape, Blender, TrueSpace, et cetera.

A bit like drawing lines, but now in a 3D space. A (game) 3D model is made out of triangles. Like the name says, a flat surface with 3 corners. Why is that? Because (even to this day) we made hardware specialized in drawing these triangle-primitives. Polygons with 4 or more corners instead would also be possible in theory, but give a lot of complications, mainly mathematically wise. Anyway, Lara Croft is made out of many small connected triangles. Though 15 years ago, Lara wouldn’t have that many triangles, resulting in less rounded boobs.
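
As an illustration of "8 vertices connected by triangles", a unit cube can be written down as two small lists. This is just a sketch in plain Python, not the file format of any particular modelling program; the surface-area check is only there to show that 12 triangles really do cover the 6 faces.

```python
# The 8 corner vertices of a unit cube: (x, y, z)
vertices = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),   # back face  (z = 0)
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),   # front face (z = 1)
]

# Each of the 6 square faces is split into 2 triangles; entries index into vertices.
triangles = [
    (0, 1, 2), (0, 2, 3),   # back
    (4, 6, 5), (4, 7, 6),   # front
    (0, 4, 5), (0, 5, 1),   # bottom
    (3, 2, 6), (3, 6, 7),   # top
    (0, 3, 7), (0, 7, 4),   # left
    (1, 5, 6), (1, 6, 2),   # right
]

def triangle_area(a, b, c):
    """Half the length of the cross product of two triangle edges."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * (cx * cx + cy * cy + cz * cz) ** 0.5

surface = sum(triangle_area(*(vertices[i] for i in tri)) for tri in triangles)
```

Sharing vertices through indices is why a cube needs only 8 coordinates, even though its 12 triangles have 36 corners in total.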

How the hell does an artist make so many tiny triangles, in such a way that it actually looks like a building, soldier or car? Sounds like an impossible job. Yeah, it is difficult. But fortunately those 3D modelling programs I just mentioned have a lot of special tools. There are even programs like Z-Brush that sort of “feel” (but then without the actual feel) like claying or sculpting. You have a massive blob made of millions of triangles (or actually voxels) and you can push, pull, cut, slice, stamp, split, et cetera. But nevertheless, 3D modelling is an art on its own. But, unlike my friend thought, 3D modelling is not a matter of coding thousands of lines that define a model. Thank God – though there is this exception of insane programmers who make "64k programs" that actually do everything code-wise. But I’ll spare you the details.

We didn’t ditch Paint (or probably Photoshop or Paint shop by then) though. A 3D wireframe model doesn’t have a texture yet. To give our 3D Mario block a yellow colour and a question-mark logo, we still need to put a 2D image on our 3D object. But how? In technical terms; “UV mapping”. To put it simple; it’s like wrapping (2D) paper around a box, putting a decal-sticker on a car, or tattooing “I miss you Mom” on your curvy arm. UV Mapping is the process of letting each vertex know where to grab from a 2D image.
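
A rough sketch of what "letting each vertex know where to grab from a 2D image" means in data terms. The 2x2 texture, the dictionary layout and the choice that v runs from the first image row downwards are all assumptions for illustration; real APIs differ on that last convention.

```python
# A tiny 2x2 "texture": Y = yellow, B = brown (stand-ins for real pixel colours)
texture = [
    ["Y", "B"],
    ["B", "Y"],
]

def sample(tex, u, v):
    """Nearest-neighbour lookup; u and v run from 0.0 to 1.0 across the image."""
    height, width = len(tex), len(tex[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return tex[y][x]

# One square face of the block: each vertex stores a 3D position plus a UV coordinate.
face = [
    {"pos": (0, 0, 0), "uv": (0.0, 0.0)},
    {"pos": (1, 0, 0), "uv": (1.0, 0.0)},
    {"pos": (1, 1, 0), "uv": (1.0, 1.0)},
    {"pos": (0, 1, 0), "uv": (0.0, 1.0)},
]

def lerp_uv(uv_a, uv_b, t):
    """Interpolate between two UV coordinates, as happens across a surface."""
    return (uv_a[0] + (uv_b[0] - uv_a[0]) * t, uv_a[1] + (uv_b[1] - uv_a[1]) * t)

corner_colours = [sample(texture, *v["uv"]) for v in face]
quarter_way = sample(texture, *lerp_uv(face[0]["uv"], face[2]["uv"], 0.25))
```

Points between the vertices get their UV by interpolation, which is how a whole image ends up stretched over a surface that only has a handful of corners.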

3D techniques – Voxels

So far we explained the art-part; feeding a 3D engine with 3D models (a file with a huge array of coordinates) and 2D images we can “wrap” around them. But how about the technical, programming part? How do we draw that box on the screen?

Again, we can split paths here. Voxel engines, Raytracing and Rasterizing are the roads to Rome. The paved roads at least. I’ll be short about the first one. Voxelizing means we make the world out of tiny square… ehm… voxels? They are like square patches. If you render enough of them together, they can form a volumetric shape. Like a cloud. Or this terrain in the 1998 “Delta Force” game series:

The terrain makes me think about corn-flakes, though this "furry" look had a nice-side effect when it comes to grass simulation (something quite impossible with traditional techniques on a larger scale back then).

Although I think it’s technically not a Voxel-based engine, Minecraft also kinda reminds me of it; volumetric (3D) shapes getting simplified into squares or cubes. Obviously, the more voxels we use, the more natural shapes we get. Only downside is… we need freaking millions of them to avoid that ”furry carpet” look. Though Voxels are making their re-entrance for special (background) techniques, they never became a common standard really.

3D techniques – Raytracing / Photon Mapping

Raytracing, or variants like Photon mapping, are semi-photo realistic approaches. They follow the rules of light-physics, as Fresnel, Young, Einstein, Fraunhofer or God intended them to be. You see shit because light photons bounce off on shit and happen to reach your lucky eye. The reason shit looks like shit is because of its material structure. Slimy, brownish, smudgy – well anyway. Light photons launched by the sun or artificial sources like a lightbulb bounce their way into your eye (and don’t worry, they don’t actually carry shit molecules).

A lot of physical phenomena happen during this exciting journey. Places that are hard to reach because of an obstacle, will appear “in shade”, as fewer photons reach them. Though they often still manage to reach the place indirectly after a few bounces (and this is a very important aspect for realistic graphics btw). Every time a photon bounces, it either reflects or refracts (think about water or glass), plus it loses some energy. Stuff appears coloured because certain regions of the colour spectrum are lost. A red wall means it reflects the red portion of the light, but absorbs the other colours. White reflects “everything” (or at least in equal portions), black absorbs all or most of the energy. Dark = little energy bounced.

Well, I didn’t pay much attention during physics classes so I’m a bad teacher, but just remember that Raytracing tries to simulate this process as accurately as possible. There is only one little problem though… A real-life situation has an (almost) infinite number of photons that bounce around. Since graphics are a continuous process (we want to redraw the screen 30 or more times per second), it would mean we have to simulate billions of photons EACH cycle. Impossible. Not only are the numbers too big, also the actual math –and mainly testing if & where a photon collided with your world- is absolutely dazzling. If the world was rendered with a computer, it would be one ultra-giga-mega-Godlike PC! We’re not even a little bit close.

BUT! Like magicians, we graphics-programmers are masters of fooling you with cheap hacks and other fakery. Frauds! That’s what we are. Raytracing doesn’t actually launch billions of photons. We do a reverse process; for each pixel on the screen (a resolution of 800 x 600 would give us 480,000 pixels to do), we try to figure out where it came from. Hence the name ray*tracing*. Still a big number (and actually still too slow to do it real-time with complex worlds), but a lot more manageable than billions. Though it’s incomplete… By tracing a ray, we know which object bounced it off to us. But where did it come from before that? We have to travel further to a potential lightsource… or multiples. And don’t forget yet another obstacle might be between that object and a lightsource, blocking the direct light so only indirect light gets there. You see, it quickly branches into millions and billions of possible paths. And all of that just to render shit. Shit.
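
The "one ray per pixel, find out what it hits" idea, stripped to its first step, might look like this. It is only a sketch: a single sphere, a camera at the origin, ASCII output, and no follow-up rays towards lightsources. The scene and all names are invented for the example.

```python
import math

def hit_sphere(origin, direction, centre, radius):
    """Return the ray parameter t of the nearest hit in front of the origin, or None."""
    oc = [origin[i] - centre[i] for i in range(3)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(oc[i] * direction[i] for i in range(3))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # the ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t > 0.0 else None

def render(width, height):
    """Trace one ray per pixel from a camera at the origin looking down -z."""
    rows = []
    for py in range(height):
        row = ""
        for px in range(width):
            # Map the pixel centre to the range -1..1 on an image plane at z = -1
            x = (px + 0.5) / width * 2.0 - 1.0
            y = 1.0 - (py + 0.5) / height * 2.0
            t = hit_sphere((0, 0, 0), (x, y, -1.0), (0, 0, -3.0), 1.0)
            row += "#" if t is not None else "."
        rows.append(row)
    return rows

image = render(16, 16)
rays_at_800x600 = 800 * 600   # primary rays for one frame, before any bounces
```

Printing the rows of image shows a disc of # characters in the middle. Every extra bounce or shadow test multiplies the work from there, which is where the branching mentioned above comes from.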

Well, there you have the reason why games don’t use Raytracing or Photon mapping. And I was about to put “(yet)”, but it’s not even a “yet”. We’re underpowered. It might be there one day, but currently we have much smarter fake tricks that can do almost the same (though I must say some engines do use raytracing for very specific cases, as hybrids that support special techniques).

But it might be useful to mention how (older?) 3D movies were rendered. If you remember game-cinematics like those pretty-cool-ugly movies I mentioned in my previous ”Red Alert” review, you may have noticed the “gritty-spray” look. Now first of all, movies are different than games, as they are NOT real-time. Games have to refresh graphics 30 or more times per second to stay fluent. Movies also have a high framerate, but we can render these frames “offline”. It doesn’t matter if it takes 1 second, 1 hour, or 1 week to draw a single frame. If you have two production years, you have plenty of rendering-time. And of course, studios like Pixar have what they call “Render-Farms”. Many computers, each doing a single frame or even just a small portion of a single frame. All those separated image-results are put together in the end, just like in the old days where handmade drawings of Bambi were put in line.

Toy Story must have been one of the first (if not first) successful, fully computer-animated movies.

So that allows us to sit back, relax, and actually launch a billion photons. Well… sort of. Of course Westwood didn’t have years and thousands of computers for their Red Alert movies, nor were the computers any good back then. So, reduce “billions” to “millions” or something. It’s never enough really, but the more photons we can launch, the better the results. Due to limitations or time constraints, especially older (game) movies appear “under-sampled”, giving that gritty-pixel-noisy-spray look. What you see here, is just not enough photons being fired. Surface pixels missed important rays, and blur-like filters are used afterwards to remove some of the noise.
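
Why too few photons look gritty can be shown with a toy experiment. The sketch assumes a flat surface whose every pixel should end up at brightness 0.3, and pretends each random ray reaches the light with that probability; the numbers and function names are made up, and a real renderer obviously does far more per ray.

```python
import random

def estimate_pixel(samples, true_brightness, rng):
    """Fire `samples` random rays; each reaches the light with probability true_brightness."""
    hits = sum(1 for _ in range(samples) if rng.random() < true_brightness)
    return hits / samples

def spread(samples, pixels=64, true_brightness=0.3, seed=1):
    """Max-minus-min brightness over pixels that should all look identical."""
    rng = random.Random(seed)
    values = [estimate_pixel(samples, true_brightness, rng) for _ in range(pixels)]
    return max(values) - min(values)

noisy = spread(4)       # few rays per pixel: neighbouring pixels disagree a lot
clean = spread(4096)    # many rays per pixel: the surface looks even
```

With 4 rays a pixel can only land on 0, 0.25, 0.5, 0.75 or 1, so neighbours jump around; with thousands of rays they settle close to the same value, which is the difference between spray-noise and a smooth wall.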

3D techniques – Radiosity & LightMaps & Baking

A less accurate, but actually much faster and (nowadays) maybe even nicer technique when taking the time/quality ratio into account, is baking radiosity lightmaps. Sounds like something North Korea would do in a reactor, but what we actually refer to, is putting our camera on a small piece of surface (say a patch of brick-wall) and rendering the surrounding world from its perspective. Everything it can “see”, is also the light it receives. If we do that for “all” patches in our world, and repeat that whole process multiple times, accumulating previous results, we achieve indirect light.
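
The "repeat the whole process and accumulate" part can be sketched with three patches. Instead of actually rendering from each patch, the example assumes we already know how much each patch "sees" of every other one (the form_factors table, with invented values), which is the usual way radiosity is written down.

```python
def bounce(emission, reflectance, form_factors, passes):
    """Iteratively gather light: each pass adds one more bounce of indirect light."""
    n = len(emission)
    radiosity = list(emission)              # pass 0: only the light sources glow
    for _ in range(passes):
        gathered = [
            sum(form_factors[i][j] * radiosity[j] for j in range(n))
            for i in range(n)
        ]
        radiosity = [emission[i] + reflectance[i] * gathered[i] for i in range(n)]
    return radiosity

# Patch 0 is a lamp, patch 1 a wall that sees the lamp,
# patch 2 a wall around the corner that only sees patch 1.
emission = [1.0, 0.0, 0.0]
reflectance = [0.0, 0.5, 0.5]
form_factors = [            # hypothetical values: how much patch i sees of patch j
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]

after_one = bounce(emission, reflectance, form_factors, passes=1)
after_two = bounce(emission, reflectance, form_factors, passes=2)
```

After one pass the hidden wall is still black; after the second pass it picks up light that bounced off the first wall. That second-hand light is the indirect part.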

But again, it’s expensive. Not as expensive as photon mapping or raytracing maybe, but too expensive for real-time games nevertheless. To avoid long initial processing times, we just store our results to good old 2D images, and “wrap” them on our 3D geometry later on. Which is why we call these techniques “pre-baked”. An offline tool, typically a Map Editor, has a bake-button that does this for you. This is also what Engine22 offers by the way.

Only problem is that these pre-baked maps can’t be changed afterwards (during the game). So it only works for static environments. Walls / floors / furniture that can’t move or break. And with static lightsources, that don’t move or switch on/off (though we have tricks for that).

3D techniques - Rasterizing

Now this is where I initially wanted to be with this Blog post. But as usual, it took me 4 pages to finally get there. Sorry. What most 3D games did and still do, is “Rasterizing”. And we have some graphical APIs for that; libraries that do the hard work, and utilize special graphics hardware (nVidia, AMD, …). Even if you never programmed, you probably heard of DirectX or OpenGL. Well these are such APIs. Though DirectX does some other game-things as well, the spear point of both APIs is providing graphics-functions we can use to:
· Load 3D resources (turn model files into triangle buffers)
· Load texture resources (2D images for example)
· Load shaders (tiny C-like programs run by the videocard, mainly to calculate vertex positions and pixel colours)
· Manage those resources
· Tell the videocard what to render (which buffers, with which shaders & textures & other shader parameters)
· Enable / disable / set drawing parameters
· Draw onto the screen or in a background buffer
· Rasterize

Though big boys, these graphical APIs are actually pretty basic. They do not make shadows or beautiful water-reflections for you. They do not calculate if a 3D object collides with a wall. You still have to do a lot yourself. But, at least we have guidance now, and utilize 3D acceleration through hardware (MUCH faster).

If we want to draw our 3D cube, we’ll have to

Or something like that. Drawing usually includes that we first load & transfer raw data (arrays of colours or coordinates) towards the videocard. After that, we can activate these buffers and issue a render-command. Finally, the videocard does the “rasterizing”.
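
The order of operations described here (transfer raw data, activate the buffer, issue a render command) can be mimicked with a toy stand-in for the videocard. To be clear: ToyVideoCard and its methods are invented for this sketch and are not OpenGL or DirectX calls; the class only records what it is asked to do.

```python
class ToyVideoCard:
    """Made-up stand-in for a render API; it only records what it was asked to do."""

    def __init__(self):
        self.buffers = {}
        self.active = None
        self.log = []

    def upload(self, name, data):
        self.buffers[name] = list(data)      # "transfer raw data to the videocard"
        self.log.append(f"upload {name} ({len(data)} values)")

    def activate(self, name):
        if name not in self.buffers:
            raise KeyError(f"buffer {name!r} was never uploaded")
        self.active = name
        self.log.append(f"activate {name}")

    def draw_triangles(self):
        if self.active is None:
            raise RuntimeError("no buffer activated")
        count = len(self.buffers[self.active]) // 9   # 3 vertices x 3 coordinates
        self.log.append(f"draw {count} triangles")    # rasterizing would happen here
        return count

card = ToyVideoCard()
one_triangle = [0, 0, 0,  1, 0, 0,  0, 1, 0]
card.upload("cube", one_triangle * 12)      # 12 triangles' worth of placeholder data
card.activate("cube")
drawn = card.draw_triangles()
```

The upload usually happens once at load time, while activate and draw are repeated every frame.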

In the case of 3D graphics, this means it converts those triangles to pixels. A vertex shader calculates where exactly to put the triangle corners on the screen. Which usually depends on a “Camera” we’ll define elsewhere, as a set of matrices. These matrices tell the camera position, the viewing-direction, how far it can look, the viewing angle, et cetera. The cube itself also has a matrix that tells its position, rotation and possibly scale. How & if the cube appears, is a calculation using those matrices. If the camera is looking the other way, the cube won’t be on the screen at all. If the distance is very far, the cube appears small. And so on. Doing these calculations sounds very complex, and yeah, matrix-calculations are pretty scary. But luckily the internet has dozens of examples, and the videocard & render API will guide you. And if you use an engine like Engine22, it will do these parts for you most of the time.
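
A small sketch of that matrix calculation in plain Python. It assumes the OpenGL-style convention of a camera looking down the negative z axis and uses the classic perspective matrix; the helper names are invented, and this is not how Engine22 itself is written.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def transform(m, v):
    """Multiply a 4x4 matrix with a 4-component column vector."""
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style projection matrix (camera looks down the negative z axis)."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
        [0, 0, -1, 0],
    ]

def project(point, model, view, proj):
    """Object-space point -> (normalised device coordinates, w). Assumes w is not 0."""
    clip = transform(mat_mul(proj, mat_mul(view, model)), [*point, 1.0])
    w = clip[3]
    return [clip[0] / w, clip[1] / w, clip[2] / w], w

proj = perspective(90.0, 1.0, 0.1, 100.0)
view = translation(0, 0, 0)                 # camera sits at the origin
corner = (1, 1, 0)                          # one corner of the cube, in object space

near_ndc, near_w = project(corner, translation(0, 0, -2), view, proj)
far_ndc, far_w = project(corner, translation(0, 0, -20), view, proj)
behind_ndc, behind_w = project(corner, translation(0, 0, 5), view, proj)
```

The same corner lands at x = 0.5 when the cube is 2 units away and at x = 0.05 when it is 20 units away, so far things shrink towards the centre. A negative w means the point is behind the camera and should not be drawn.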

During the rasterization process (think about an old matrix printer plotting dots on paper) we also have to “inject” colours. Fragment or Pixel shaders are used for that nowadays. It’s a small program that does the math. It could be as simple as colouring all pixels red, but more common is to use textures (the “wraps” remember?), and possibly lightsources or pre-baked buffers as explained in the previous part. This is also the stage where we perform tricks like “bumpmapping”.
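
What the videocard does during rasterizing can be imitated in slow motion on the CPU. The sketch below fills one 2D triangle by testing every pixel centre against the three edges and calls a Python function in the role of the fragment shader. Real hardware is massively parallel and has more precise fill rules; the function names are invented.

```python
def edge(a, b, p):
    """Signed area test: which side of the line a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri, width, height, fragment_shader):
    """Fill a screen-space triangle, calling fragment_shader once per covered pixel."""
    v0, v1, v2 = tri
    area = edge(v0, v1, v2)
    frame = [[None] * width for _ in range(height)]
    if area == 0:
        return frame  # degenerate triangle covers nothing
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel centre
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                frame[y][x] = fragment_shader(w0, w1, w2)
    return frame

def all_red(w0, w1, w2):
    """The simplest possible 'shader': every covered pixel becomes red."""
    return (255, 0, 0)

def blend_corners(w0, w1, w2):
    """Mix red, green and blue corner colours by the barycentric weights."""
    return (round(255 * w0), round(255 * w1), round(255 * w2))

triangle = [(0, 0), (8, 0), (0, 8)]
red_frame = rasterize(triangle, 8, 8, all_red)
blend_frame = rasterize(triangle, 8, 8, blend_corners)
covered = sum(pixel is not None for row in red_frame for pixel in row)
```

The three weights say how close a pixel is to each corner; the same weights are what interpolate UV coordinates, so swapping blend_corners for a texture lookup gives a textured triangle.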

Note these “shaders” weren’t there 15 years ago. The principles were the same more or less, but these parts of the drawing “pipeline” were fixed functions. Instead of having to program your own shader-code, you just told OpenGL or DirectX to use a texture or not, or to use lightSourceX yes/no. Yep, that was a lot simpler. But also a lot more restricted (and uglier). Anyhow, if you’re an older programmer from the 2000 era, just keep in mind shaders took over the place. It’s the major difference between early 2000 and current graphics techniques. Other than that… same old story more or less.

Shots from the new Engine22 Map Editor. Everything you'll see is rasterized & using shaders.

So yeah, with (fragment) shaders my old friend maybe was a little bit right after all, drawing the scene pixel-by-pixel. Either way, it’s quite different than more natural (realistic) approaches like photon mapping. We rasterize an object, say a cube, monster or wall. We plot the geometric shape on the screen –possibly culling it if something was in front!-, but don’t have knowledge about its surroundings. We can’t let our pixel-shader check our surroundings to determine what to reflect, what casts shadows or which lightsources directly or indirectly piss their photons on it. This is done with additional background steps, that store environmental information into (texture)buffers we can query later on in those shaders. For example, such a buffer could tell us what a lightsource affects, or how the world is captured at a single point so we can use it for reflections.

It’s complex stuff, and moreover, it’s fake stuff. Whether it’s shadows, reflective orbs or the way light finds its way under that table; it’s all fake, simplified, approximated, guessed or simulated. But so damn smart and good that a gamer can hardly tell :) Though game-engines like Unreal or Engine22 do a lot more than just graphics (think about audio, physics, scripting, AI, …) their selling spear-point and major strength is usually their magic box of tricks there. And as videocards keep getting faster and faster, Pandora’s box is getting more powerful as well. But remember kids! It’s not physically correct. Fresnel would punch me three black eyes.
